by modalduality
3229 days ago
Good article (didn't realize there were other kinds of lookaround), but maybe the bottom should link to well-tested standards-based regexes instead. URL: ^(((http|https|ftp):\/\/)?([[a-zA-Z0-9]\-\.])+(\.)([[a-zA-Z0-9]]){2,4}([[a-zA-Z0-9]\/+=%&_\.~?\-]*))*$
I recently encountered a case where a URL had an underscore at the end of a subdomain name. Underscores seem to be okay anywhere else, but while my friend on Windows was able to load the website, I wasn't able to on Linux, whether using Firefox, curl, or a remote screenshot service that presumably ran Linux. According to various RFCs, they should be okay anywhere within the subdomain name. Has anyone encountered this behavior? I couldn't find anything on the internet; maybe it's just my computer?
I'm not enough of a history boffin to know how Microsoft came to support it differently (perhaps something from the NetBIOS and NT era). At this point, though, I don't see either party changing their default validations to agree on a single definition.
Edit: If you're curious, this appears to be the first glibc commit limiting dashes at the end of URLs: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=fa0bc.... I don't know about BSD libc or Windows, however.
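For anyone who wants to see which label in a hostname is the odd one out, here is a minimal Python sketch. It assumes the strict RFC 1123 "letters, digits, hyphens" hostname rule (each label 1 to 63 characters, starting and ending with a letter or digit). The names `LDH_LABEL` and `check_labels` are made up for illustration, and this does not reproduce what glibc, Windows, or any browser actually accepts.

```python
import re

# Assumption: strict RFC 1123 hostname label rule ("LDH"): letters, digits
# and hyphens only, 1-63 characters, no leading or trailing hyphen.
# Underscores are rejected everywhere under this rule.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def check_labels(hostname):
    """Return a (label, is_strictly_valid) pair for each dot-separated label."""
    labels = hostname.rstrip(".").split(".")  # tolerate a trailing root dot
    return [(label, LDH_LABEL.fullmatch(label) is not None) for label in labels]


if __name__ == "__main__":
    for host in ("www.example.com", "foo_.example.com", "foo-.example.com"):
        print(host, check_labels(host))
```

Real resolvers are more lenient than this in places and stricter in others, which is rather the point of the thread: a label that fails here may still load on one platform and not on another.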