🤨Content Discovery

  • There are three main ways of discovering content: manually, automated, and through OSINT (Open-Source Intelligence).

1] Manual Discovery :

  • Robots.txt

    The robots.txt file is a document that tells search engines which pages they are and aren't allowed to show in their search results. The paths it lists are often areas the owner does not want found, which makes them worth checking.
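To make the robots.txt idea concrete, here is a minimal Python sketch that pulls the Disallow paths out of a robots.txt body. The function name `disallowed_paths` and the sample content are assumptions for illustration, not taken from a real site.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the paths named in Disallow rules, in order, without duplicates."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        if field.strip().lower() == "disallow" and value and value not in paths:
            paths.append(value)
    return paths


# Hypothetical robots.txt content, made up for this example.
SAMPLE = """\
User-agent: *
Allow: /
Disallow: /staff-portal
Disallow: /backup/  # old archives
"""

print(disallowed_paths(SAMPLE))  # ['/staff-portal', '/backup/']
```

Each path returned is a candidate to visit by hand in the browser.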

  • Favicon

    The favicon is a small icon displayed in the browser's address bar or tab, used for branding a website. If the site still uses the default favicon of a web framework, we can use it to identify the framework.

Inside the page source we find the favicon path.

// curl https://static-labs.tryhackme.cloud/sites/favicon/images/favicon.ico | md5sum

The command above gives the MD5 hash of the favicon. We can then look that hash up in https://wiki.owasp.org/index.php/OWASP_favicon_database to learn the framework details.
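The same hash-and-lookup step can be sketched in Python. This is only an illustration: `favicon_md5`, `fetch_favicon_md5`, and the `KNOWN_FAVICONS` table are hypothetical names, and the table holds a placeholder entry instead of real data from the OWASP database.

```python
import hashlib
import urllib.request


def favicon_md5(icon_bytes: bytes) -> str:
    """MD5 hex digest of the raw icon bytes, the same value md5sum prints."""
    return hashlib.md5(icon_bytes).hexdigest()


def fetch_favicon_md5(url: str) -> str:
    """Download the favicon and hash it (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return favicon_md5(response.read())


# Hypothetical lookup table; a real one would be filled from the OWASP database.
placeholder_icon = b"not a real icon"
KNOWN_FAVICONS = {favicon_md5(placeholder_icon): "ExampleFramework (hypothetical)"}

digest = favicon_md5(placeholder_icon)
print(digest, "->", KNOWN_FAVICONS.get(digest, "unknown"))
```

Calling `fetch_favicon_md5` with the favicon URL from the page source should give the same digest as the curl and md5sum pipeline.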

  • Sitemap.xml

    A list of every file the website owner wishes to be listed on a search engine.
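As a rough sketch, the URLs in a sitemap can be listed with Python's standard XML parser. The sample document below is made up (the `/old-area` path is hypothetical), and `sitemap_urls` is an illustrative name. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in the sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


# Hypothetical sitemap body, made up for this example.
SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://10.10.72.82/</loc></url>
  <url><loc>http://10.10.72.82/old-area</loc></url>
</urlset>
"""

print(sitemap_urls(SAMPLE))
```

Entries that are not linked anywhere on the live site are the interesting ones.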

  • HTTP Headers

-->root@ip-10-10-249-10:~# curl http://10.10.72.82 -v
* Rebuilt URL to: http://10.10.72.82/
*   Trying 10.10.72.82...
* TCP_NODELAY set
* Connected to 10.10.72.82 (10.10.72.82) port 80 (#0)
> GET / HTTP/1.1
> Host: 10.10.72.82
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: nginx/1.18.0 (Ubuntu)
< Date: Sat, 01 Jul 2023 04:50:03 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-FLAG: THM{HEADER_FLAG}
< 
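To pick out the useful lines from verbose curl output like the above, a small Python sketch can collect the response headers and keep the ones that tend to reveal software or custom data. The function names and the choice of which headers count as revealing are assumptions for illustration.

```python
def response_headers(verbose_output: str) -> dict[str, str]:
    """Collect response headers from the '< ' lines of curl -v output."""
    headers = {}
    for line in verbose_output.splitlines():
        if not line.startswith("< ") or line.startswith("< HTTP/"):
            continue  # skip request lines, the status line and blank markers
        name, sep, value = line[2:].partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def revealing_headers(headers: dict[str, str]) -> dict[str, str]:
    """Keep Server, X-Powered-By and any other X- prefixed header."""
    return {
        name: value
        for name, value in headers.items()
        if name in ("server", "x-powered-by") or name.startswith("x-")
    }


# Shortened copy of the curl output shown in these notes.
SAMPLE = """\
> GET / HTTP/1.1
> Host: 10.10.72.82
>
< HTTP/1.1 200 OK
< Server: nginx/1.18.0 (Ubuntu)
< Content-Type: text/html; charset=UTF-8
< X-FLAG: THM{HEADER_FLAG}
<
"""

print(revealing_headers(response_headers(SAMPLE)))
```

Here the Server header gives the web server and version, and the non-standard X-FLAG header carries the flag.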
  • Framework Stack

2] OSINT :

  • Google Hacking / Dorking

//site:tryhackme.com
returns results only from the specified website address

//inurl:admin
returns results that have the specified word in the URL

//filetype:pdf
returns results that have a particular file extension

//intitle:admin
returns results that contain the specified word in the title
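The filters above can be combined in one query. A minimal Python sketch of building such a query and its search URL follows. `build_dork` and `google_search_url` are hypothetical helper names, and the example combination is only an illustration.

```python
from urllib.parse import urlencode


def build_dork(**operators: str) -> str:
    """Join operator=value pairs into a single query, e.g. site:... inurl:..."""
    return " ".join(f"{name}:{value}" for name, value in operators.items())


def google_search_url(query: str) -> str:
    """URL-encode the query into a Google search link."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_dork(site="tryhackme.com", inurl="admin")
print(query)                     # site:tryhackme.com inurl:admin
print(google_search_url(query))  # https://www.google.com/search?q=site%3Atryhackme.com+inurl%3Aadmin
```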
  • Wappalyzer: an online tool and browser extension that helps identify what technologies a website uses.

  • Wayback Machine : a historical archive of websites that dates back to the late 90s.

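  • GitHub : a hosted service for Git, the version control system. Searching it for company or website names can turn up repositories that contain source code or credentials.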

  • S3 Buckets : The URL format of an S3 bucket is http(s)://{name}.s3.amazonaws.com, where {name} is chosen by the owner.
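Since the bucket URL follows a fixed pattern, candidate URLs can be generated from a company name. The Python sketch below does only that. The suffix list is an assumption (common guesses, not something confirmed for any particular target), and `example` is a placeholder name.

```python
def s3_candidate_urls(
    name: str,
    suffixes: tuple[str, ...] = ("", "-assets", "-www", "-public", "-private"),
) -> list[str]:
    """Build https://{name}{suffix}.s3.amazonaws.com for each guessed suffix."""
    return [f"https://{name}{suffix}.s3.amazonaws.com" for suffix in suffixes]


for url in s3_candidate_urls("example"):
    print(url)
```

Each URL can then be requested by hand or with curl to see whether a bucket exists and how it responds.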

3] Automated Discovery :

Using tools to discover content rather than doing it manually.

Automation Tools

Although there are many different content discovery tools available, each with its own features and flaws, we're going to cover three that are preinstalled on our attack box: ffuf, dirb and gobuster.

//Using ffuf:
#ffuf -w /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt -u http://10.10.72.82/FUZZ

//Using dirb:
#dirb http://10.10.72.82/ /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt

//Using Gobuster: (my preference)
#gobuster dir --url http://10.10.72.82/ -w /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt
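All three tools follow the same basic loop: take each word from the list, append it to the base URL, request it, and report anything that is not a 404. The Python sketch below shows that idea only. It is not how ffuf, dirb or gobuster are implemented, and the names `http_status` and `discover` are assumptions for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable
from urllib.parse import quote


def http_status(url: str) -> int:
    """Return the status code of a HEAD request (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses arrive as exceptions


def discover(
    base_url: str,
    words: Iterable[str],
    status_of: Callable[[str], int] = http_status,
) -> list[tuple[str, int]]:
    """Try base_url/word for every word and keep the ones that are not 404."""
    found = []
    for word in words:
        word = word.strip()
        if not word or word.startswith("#"):
            continue  # skip blank lines and comments in the wordlist
        url = base_url.rstrip("/") + "/" + quote(word)
        status = status_of(url)
        if status != 404:
            found.append((url, status))
    return found


# Offline demonstration with a stand-in for the network call.
def fake_status(url: str) -> int:
    return 200 if url.endswith("/admin") else 404


print(discover("http://10.10.72.82/", ["admin", "nope", ""], fake_status))
```

Passing the lines of common.txt as `words` and leaving `status_of` at its default would send real requests. Connection errors are not handled in this sketch, and the dedicated tools are far faster.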
