by wybiral
2536 days ago
A while back I wrote a Python script that watches for links posted on Twitter and then scrapes each site's /robots.txt file [1]. The requests are routed through Tor for privacy. It's been incredibly enlightening. One thing that sticks out immediately is that in many cases you can identify the underlying HTTP framework from the default rules it ships with, sometimes even the exact version. And yes, people do use the robots file to "protect" or "hide" endpoints, so it can effectively be used to enumerate endpoints worth investigating further (from a pentesting perspective).

[1] https://gist.github.com/wybiral/20c20ccf00b6c93506b8acdc6ccb...
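To make the enumeration point concrete, here is a minimal sketch of pulling the Disallow paths out of a robots.txt body. It is not the linked script: the function name `disallowed_paths` and the sample file are made up for illustration, and it only parses text you already have (no fetching, no Tor).

```python
def disallowed_paths(robots_txt):
    """Return unique, non-empty Disallow paths in order of first appearance.

    Hypothetical helper for illustration; it ignores User-agent grouping
    and simply collects every path the file asks crawlers to avoid.
    """
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() != "disallow":
            continue
        path = value.strip()
        # An empty Disallow means "nothing is disallowed", so skip it.
        if path and path not in paths:
            paths.append(path)
    return paths


# Made-up sample content, not taken from any real site.
SAMPLE = """\
User-agent: *
Disallow: /admin/   # not linked anywhere
Disallow: /backup/
Disallow:
Allow: /public/

User-agent: ExampleBot
Disallow: /admin/
"""

print(disallowed_paths(SAMPLE))  # ['/admin/', '/backup/']
```

Each returned path is only a hint that something exists there; whether it is worth a closer look still takes manual review, and only on systems you are authorized to test.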