# Content Discovery

* There are three main ways of discovering content: manual discovery, automated discovery, and OSINT (Open-Source Intelligence).

## 1] Manual Discovery :

* **Robots.txt**

  The robots.txt file is a document that tells search engine crawlers which pages they are and aren't allowed to index, so it can reveal locations the site owner would rather keep out of search results. ![](https://2855293502-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjpZ0GSi6rFzKo8D8arzs%2Fuploads%2FJefhW8bVKoHMYgqQxBEw%2Fimage.png?alt=media\&token=5bb2674d-b47f-4962-8252-76456dc4ceb2)
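
As a quick local illustration (not part of the original room), Python's standard library can parse a robots.txt body and show which paths a crawler is told to avoid — the sample rules below are made up:

```python
# A minimal sketch: parse an illustrative robots.txt body and check
# which paths a crawler is allowed to fetch.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
robots_txt = """\
User-agent: *
Disallow: /admin/
Disallow: /staff-portal
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Disallowed entries are interesting to an attacker precisely because
# the owner does not want them indexed.
print(parser.can_fetch("*", "/admin/login"))  # disallowed by the rules above
print(parser.can_fetch("*", "/blog"))         # no rule matches, so allowed
```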
* **Favicon**

  The favicon is a small icon displayed in the browser's address bar or tab, used for branding a website. If the site uses a web framework's default favicon, we can identify the framework from it.

![](https://2855293502-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjpZ0GSi6rFzKo8D8arzs%2Fuploads%2FCvzu9qQpeXAnr8SVPSR0%2Fimage.png?alt=media\&token=7de4b33c-dea7-47a0-8701-df7307d8be8d)  Inside the page source we find the favicon path.

```
curl https://static-labs.tryhackme.cloud/sites/favicon/images/favicon.ico | md5sum
```

The command above downloads the favicon and pipes it to md5sum; the resulting MD5 hash can then be looked up in the OWASP favicon database at <https://wiki.owasp.org/index.php/OWASP_favicon_database> to identify the framework.
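
The same hashing step can be reproduced in Python (a sketch — the placeholder byte string stands in for a real downloaded favicon):

```python
import hashlib

def favicon_hash(data: bytes) -> str:
    """MD5 hex digest of favicon bytes, the format the OWASP database indexes."""
    return hashlib.md5(data).hexdigest()

# In practice `data` is the content of the downloaded favicon.ico;
# the placeholder bytes here just keep the sketch self-contained.
print(favicon_hash(b"\x00\x00\x01\x00placeholder"))
```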

* **Sitemap.xml**

  A list of every file the website owner wishes to have listed on a search engine.
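
Sitemaps are plain XML, so extracting the listed URLs is straightforward; here's a sketch using an inline sample document (a real sitemap would be fetched from /sitemap.xml):

```python
# Parse a small illustrative sitemap and list the URLs it exposes.
import xml.etree.ElementTree as ET

sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/hidden-page</loc></url>
</urlset>"""

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap)
urls = [loc.text for loc in root.findall("sm:url/sm:loc", ns)]
print(urls)
```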

* **HTTP Headers**

```
root@ip-10-10-249-10:~# curl http://10.10.72.82 -v
* Rebuilt URL to: http://10.10.72.82/
*   Trying 10.10.72.82...
* TCP_NODELAY set
* Connected to 10.10.72.82 (10.10.72.82) port 80 (#0)
> GET / HTTP/1.1
> Host: 10.10.72.82
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: nginx/1.18.0 (Ubuntu)
< Date: Sat, 01 Jul 2023 04:50:03 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-FLAG: THM{HEADER_FLAG}
< 
```
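
Pulling the interesting fields out of a raw response is simple string work; a sketch using the Server header from the capture above as sample data:

```python
# Parse raw HTTP response headers into a dict and pick out the
# fields that leak information about the target.
raw_headers = """HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
Content-Type: text/html; charset=UTF-8
X-FLAG: THM{HEADER_FLAG}"""

headers = {}
for line in raw_headers.splitlines()[1:]:  # skip the status line
    name, _, value = line.partition(": ")
    headers[name] = value

# The Server header leaks the web server software and version.
print(headers["Server"])
print(headers["X-FLAG"])
```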

* **Framework Stack**

  Once you identify the framework a site is built on (from the favicon, page source comments, or a copyright footer), its official documentation can point you to default paths and files worth checking.

## 2] OSINT :

* **Google Hacking / Dorking**

```
site:tryhackme.com
  returns results only from the specified website address

inurl:admin
  returns results that have the specified word in the URL

filetype:pdf
  returns results which are a particular file extension

intitle:admin
  returns results that contain the specified word in the title
```
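
Operators can be combined into a single query; a small sketch of URL-encoding a combined dork for a search URL (the target domain is illustrative):

```python
# Build a URL-encoded Google search query from combined dork operators.
from urllib.parse import quote_plus

dork = "site:tryhackme.com filetype:pdf"
search_url = "https://www.google.com/search?q=" + quote_plus(dork)
print(search_url)
```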

* **Wappalyzer:** an online tool and browser extension that helps identify what technologies a website uses.
* **Wayback Machine :** a historical archive of websites that dates back to the late 90s.
* **GitHub :** a hosted version control service; searching public repositories for a company or domain name can reveal source code and sensitive files.
* **S3 Buckets :**

  Amazon S3 buckets follow the URL format http(s)://{name}.s3.amazonaws.com
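
Bucket names are often guessable from the target's name; a sketch that expands a company name with a few suffixes (the suffix list is an assumption for illustration, not from the original notes):

```python
# Generate candidate S3 bucket URLs from a company name and a small
# illustrative suffix wordlist.
company = "example"
suffixes = ["", "-assets", "-www", "-backup", "-dev"]

candidates = [
    f"https://{company}{suffix}.s3.amazonaws.com" for suffix in suffixes
]
for url in candidates:
    print(url)  # each candidate would then be requested to see if it exists
```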

## 3] Automated Discovery :

Automated discovery uses tools, usually driven by wordlists of common file and directory names, to find content rather than requesting pages manually.

**Automation Tools**

Although there are many different content discovery tools available, all with their features and flaws, we're going to cover three which are preinstalled on our attack box: **ffuf**, **dirb** and **gobuster**.

```
# Using ffuf:
ffuf -w /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt -u http://10.10.72.82/FUZZ

# Using dirb:
dirb http://10.10.72.82/ /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt

# Using gobuster (my preference):
gobuster dir --url http://10.10.72.82/ -w /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt
```
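
Under the hood these tools essentially substitute each wordlist entry into the URL and check the responses; a minimal Python sketch of that FUZZ substitution (the four words are a tiny made-up sample of common.txt):

```python
# Expand a FUZZ placeholder with wordlist entries, the core loop of
# directory brute-forcing tools like ffuf.
template = "http://10.10.72.82/FUZZ"
wordlist = ["admin", "backup", "robots.txt", "sitemap.xml"]

targets = [template.replace("FUZZ", word) for word in wordlist]
for t in targets:
    print(t)  # each URL would be requested and non-404 responses reported
```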
