by layer8 839 days ago
I wonder if there would be a way to tag such URLs in a machine-recognizable, but not text-searchable way. (E.g. take every fifth byte in the URL from after the authority part, and have those bytes be a particular form of hash of the remaining bytes.) Meaning that crawlers and tools in TFA would have a standardized way to recognize when a URL is meant to be private, and thus could filter them out from public searches. Of course, being recognizable in that way may add new risks.
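A rough sketch of the interleaving idea above, assuming one possible concrete encoding: after every four payload characters, one check character taken from a hash of the payload is inserted, so every fifth character is a check character. The function names `tag_path` and `looks_private`, the use of SHAKE-256 hex digits, and the `MIN_TAG_CHARS` threshold are all assumptions made for illustration; no such standard exists.

```python
import hashlib

# Hypothetical threshold: with hex check characters, 8 of them give roughly a
# 1 in 16**8 chance that an ordinary path matches by accident.
MIN_TAG_CHARS = 8


def _tag_chars(payload: str, count: int) -> str:
    """Derive `count` hex check characters from a hash of the payload."""
    digest = hashlib.shake_256(payload.encode("utf-8")).hexdigest((count + 1) // 2)
    return digest[:count]


def tag_path(payload: str) -> str:
    """Insert one check character after every full group of 4 payload characters."""
    chunks = [payload[i:i + 4] for i in range(0, len(payload), 4)]
    full_chunks = sum(1 for chunk in chunks if len(chunk) == 4)
    tags = iter(_tag_chars(payload, full_chunks))
    out = []
    for chunk in chunks:
        out.append(chunk)
        if len(chunk) == 4:
            out.append(next(tags))
    return "".join(out)


def looks_private(path: str) -> bool:
    """Return True if every fifth character matches the hash of the rest."""
    tags = path[4::5]
    payload = "".join(ch for i, ch in enumerate(path) if i % 5 != 4)
    if len(tags) < MIN_TAG_CHARS:
        return False  # too short to tell apart from an ordinary path
    return tags == _tag_chars(payload, len(tags))
```

The tagged string contains no fixed marker, so a plain text search cannot find it, yet any tool that knows the scheme can call `looks_private` on a path. That is also the risk raised in the post: anyone else who knows the scheme can run the same check.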
3 comments

We already have a solution to this. It’s called not including authentication information within URLs.

Even if search engines knew to include it, would every insecure place a user put a link know it? Bad actors with their own indexes certainly wouldn’t care.

How do you implement password-reset links otherwise? I mean, those should be short-lived, but still.
You could send the user a code that they must copy paste onto the page rather than sending them a link.
Hopefully using POST, not GET. GET links get logged by the HTTP server most of the time. Just another great way to store your 'security credential' in plain text. Logs get zipped and archived. Good luck with any security measure.
I mean of course the idea was to put it in a form that is sent using POST, but even then, it's a single-use reset code so once it shows in the log it's worthless.
This makes a large assumption about application logic that is often incorrect.

t. security auditor/researcher.

As you said, short lived codes. And the codes don’t contain any PII. So even if the link does get indexed, it’s meaningless and useless.
A short-lived link that's locked down to their user agent/IP would work as well.
Actually, there are cases where this is more or less unavoidable.

For example, if you want a web socket server that is accessible from a browser, need authentication, and can't rely on cookies, the only option is to encode the Auth information in the URL (since browsers don't allow custom headers in the initial HTTP request for negotiating a web socket).

Authentication: Identify yourself

Authorization: Can you use this service.

Access Control/Tokenization: How long can this service be used for.

I swipe my badge on the card reader. The lock unlocks.

Should we leave a handy door stopper or 2x4 there, so you can just leave it propped open? Or should we have tokens that expire in a reasonable time frame.. say a block of ice (in our door metaphor) so it disappears at some point in future? Nonce tokens have been a well understood pattern for a long time...

It's not that these things are unavoidable; it's that security isn't a first principle, or isn't easy to embed because of design issues.

> Or should we have tokens that expire in a reasonable time frame.

And that are single-use.

(Your password reset "magic link" should expire quickly, but needs a long enough window to allow for slow mail transport. But once it's used the first time, it should be revoked so it cannot be used again even inside that timeout window.)
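A minimal sketch of the "expires quickly, and is revoked on first use" behavior described above, assuming an in-memory store. The names `issue_reset_token` and `redeem_reset_token` and the 30-minute window are hypothetical; a real application would keep the pending tokens in its database.

```python
import hashlib
import secrets
import time

RESET_TTL_SECONDS = 30 * 60  # assumed window, long enough for slow mail transport

# Maps a hash of the token to (user_id, expires_at). Only the hash is stored,
# so a leaked copy of this table does not reveal usable tokens.
_pending = {}


def _digest(token: str) -> str:
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_reset_token(user_id: str, now: float = None) -> str:
    """Create a random token to embed in the reset link or email as a code."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _pending[_digest(token)] = (user_id, now + RESET_TTL_SECONDS)
    return token


def redeem_reset_token(token: str, now: float = None):
    """Return the user id once; later attempts and expired tokens return None."""
    now = time.time() if now is None else now
    # pop() removes the entry on the first attempt, so the token is revoked
    # even if this attempt turns out to be too late.
    entry = _pending.pop(_digest(token), None)
    if entry is None:
        return None
    user_id, expires_at = entry
    if now > expires_at:
        return None
    return user_id
```

With this shape, a token that later turns up in a server log or a search index is already useless, provided the application really does call something like `redeem_reset_token` on every use.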

> the only option is to encode the Auth information in the URL (since browsers don't allow custom headers in the initial HTTP request for negotiating a web socket).

Put a timestamp in the token and sign it with a private key, so that the token expires after a defined time period.

If the URL is only valid for the next five minutes, the odds that the URL will leak and be exploited in that five-minute window are very low.
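A minimal sketch of a timestamped, signed token, under stated assumptions: it uses HMAC-SHA256 with a server-side secret as a symmetric stand-in for the "private key" signature mentioned above, and the names `make_token`, `verify_token`, and `SECRET_KEY` are hypothetical.

```python
import base64
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-random-server-side-secret"  # hypothetical placeholder
TOKEN_TTL_SECONDS = 5 * 60  # the five-minute window from the comment above


def _sign(message: bytes) -> str:
    mac = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(mac).rstrip(b"=").decode("ascii")


def make_token(user_id: str, now: float = None) -> str:
    """Build '<user_id>.<expiry>.<signature>' for use in a URL query string."""
    now = time.time() if now is None else now
    body = f"{user_id}.{int(now) + TOKEN_TTL_SECONDS}"
    return f"{body}.{_sign(body.encode('utf-8'))}"


def verify_token(token: str, now: float = None):
    """Return the user id if the signature is valid and the token is unexpired."""
    now = time.time() if now is None else now
    try:
        user_id, expires_str, signature = token.rsplit(".", 2)
        expires_at = int(expires_str)
    except ValueError:
        return None  # wrong number of fields or non-numeric expiry
    expected = _sign(f"{user_id}.{expires_str}".encode("utf-8"))
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
        return None
    if now > expires_at:
        return None
    return user_id
```

For the web socket case, the token would travel as a query parameter on the connection URL and be checked with `verify_token` during the handshake. Note that this sketch only limits the lifetime; it does not make the token single-use.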

Also, it would allow bad actors to just opt out of malware scans - the main vector whereby these insecure URLs were leaked.
So there was an interesting vector a while back: some email firewalls would reliably click on any link sent to them, and spammers abused that.

Spammers would sign up for services that required clicking a link, using an address like blabla@domainusingsuchservice.

The service's phishing-check bots would reliably click on the link, rendering the account creation valid.

One particularly exploitable vendor for getting such links clicked was one that shares the name with a predatory fish that also has a song about it :)

SharkGate?

Why coy about naming them?

Barracuda. And for plausible deniability so they don’t have as much of a chance of catching a libel suit. Not sure how necessary or effective that is, but I do understand the motivation.
Yeah - that's just red-flagging "interesting" URLs to people running greyhat and blackhat crawlers.
We already have robots.txt in theory.
I didn’t think robots.txt would be applicable to URLs being copied around, but actually it might be, good point. Though again, collecting that robots.txt information could make it easier to search for such URLs.
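As a small illustration of how a well-behaved scanner could consult robots.txt before publishing a submitted URL, here is a sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `/reset/` and `/share/` paths, the `may_index` helper, and the `ExampleScanner` user agent are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a site might serve to mark private URL prefixes.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reset/
Disallow: /share/
"""


def may_index(url: str, user_agent: str = "ExampleScanner") -> bool:
    """Return True if the site's robots.txt allows this URL to be fetched."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

Here `may_index("https://example.invalid/reset/abc123")` is refused while `may_index("https://example.invalid/about")` is allowed. The same file also tells everyone which prefixes hold the private links, which is the concern raised in the reply above.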