by jacksonsabey 3489 days ago
I have an implementation, although it's currently closed source and is only available via API: http://0ut.ca/documentation

I believe it's the closest to the standard of any implementation I've found, and if it isn't, I would like to correct that.

There is a Strict parser, which will fail on any error, and a Loose parser, which will discard errors when possible and follow the de facto parsing implementations.
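
To make the Strict/Loose split concrete, here is a minimal sketch for the path component only. It is not the 0ut.ca code (which is closed source); `normalize_path` and its `strict` flag are hypothetical names, and the allowed character set is assumed to be the RFC 3986 path characters.

```python
import re
from urllib.parse import quote

# A "%" that is not followed by two hex digits is a malformed escape.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")
# Anything outside unreserved / sub-delims / ":" "@" "/" and "%" (assumed set).
_DISALLOWED = re.compile(r"[^A-Za-z0-9\-._~!$&'()*+,;=:@/%]")


def normalize_path(path: str, strict: bool = True) -> str:
    """Hypothetical helper: validate (strict) or repair (loose) a URI path."""
    if strict:
        # Strict mode: fail on the first problem instead of guessing.
        if _BAD_ESCAPE.search(path):
            raise ValueError("malformed percent escape")
        if _DISALLOWED.search(path):
            raise ValueError("character not allowed in a path")
        return path
    # Loose mode: escape stray "%" signs, then percent-encode everything
    # that is not allowed (non-ASCII is encoded as UTF-8 bytes).
    repaired = _BAD_ESCAPE.sub("%25", path)
    return quote(repaired, safe="/:@!$&'()*+,;=%-._~")
```

With this sketch, `normalize_path("/a|b\\c", strict=False)` encodes `|` and `\` rather than treating them as Windows path characters, while the same input in strict mode raises `ValueError`.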

It should be able to handle any of the edge cases, such as partially percent-encoded Unicode, invalid characters, normalization, or octal/hex IPv4 addresses. The only thing from your linked unit tests that it will not handle is | and \ for Windows paths; they will be encoded.
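
For anyone unfamiliar with the octal/hex IPv4 case, here is a rough sketch of how such a host can be normalized to dotted decimal. It loosely follows the rules browsers use and is not the 0ut.ca implementation; `parse_ipv4_part` and `normalize_ipv4` are hypothetical names, and unlike a real host parser this sketch raises on non-numeric input instead of falling back to treating it as a domain name.

```python
def parse_ipv4_part(part: str) -> int:
    """Parse one dotted part as decimal, octal (leading 0), or hex (0x)."""
    if part == "":
        raise ValueError("empty part")
    if part[:2].lower() == "0x":
        digits, base = part[2:], 16
    elif len(part) > 1 and part[0] == "0":
        digits, base = part[1:], 8
    else:
        digits, base = part, 10
    if digits == "":
        return 0  # assumption: a bare "0x" counts as zero
    allowed = "0123456789abcdef"[:base]
    if any(ch not in allowed for ch in digits.lower()):
        raise ValueError(f"invalid digit in {part!r}")
    return int(digits, base)


def normalize_ipv4(host: str) -> str:
    """Hypothetical helper: turn e.g. "0x7f.1" into "127.0.0.1"."""
    parts = host.split(".")
    if len(parts) > 1 and parts[-1] == "":
        parts.pop()  # tolerate a single trailing dot
    if not 1 <= len(parts) <= 4:
        raise ValueError("an IPv4 address has between 1 and 4 parts")
    numbers = [parse_ipv4_part(p) for p in parts]
    # Every part except the last must fit in one byte.
    if any(n > 255 for n in numbers[:-1]):
        raise ValueError("part out of range")
    # The last part fills all the remaining bytes.
    if numbers[-1] >= 256 ** (5 - len(numbers)):
        raise ValueError("last part out of range")
    value = numbers[-1]
    for i, n in enumerate(numbers[:-1]):
        value += n * 256 ** (3 - i)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

Here `0300.0250.0.01` and `0xC0A80001` both come out as `192.168.0.1`, while `08` is rejected because 8 is not an octal digit.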

If anyone is interested in seeing how parsing is done, you can compare the expected output in your browser here: http://0ut.ca/api;v1.0/validate/uri/after?hTtPs://foo:%F0%9F... You can also try validating strange relative URIs: http://0ut.ca/api;v1.0/validate/uri/after?+invalid-scheme:/p...?

I would be happy to explain any of the reasoning behind the parsing if anyone is interested.

1 comment

Wow, thanks!

Your tool helps me because it's like an EXAMPLES section of a man page.