Apparently, it requires curl 7.81 to compile, but I'm on Ubuntu 20 (7.6x) and wanted to give it a quick try, so this patch makes it build (trivially) by removing the couple of new symbols that depend on the newer version. Just a quick hack to get it to compile.
Given what it does, and to avoid confusion with perl, "transform url" seems more reasonable. Though I probably would have gone with "turl" so it's easier to pronounce.
Though it hardly gets any use because s/// is so much more flexible, Perl also has a tr/// builtin that replicates the behavior of the command line tool.
Pardon my ignorance, serious question:
Why is this a big deal?
I'm probably underestimating the work needed, but it doesn't look like a hard thing to write.
What am I missing?
Over the years, curl itself has had 9 CVEs relating to handling URLs [0] so this is most definitely not a trivial piece of code to write. The basic case is easy, yes. Getting everything in the spec right and then some is hard.
And, now I'm scared of establishing a TLS connection with untrusted servers after reading CVE-2021-22901 from that page. Remote code execution from an adversarial *server*. I can understand an adversarial client, but that just expanded the things of which I'm wary.
I had to implement a routine involving URL parsing in a library that is supposed to be behavior-identical across implementations in multiple languages. That was fun.
I currently parse out this stuff using a flaky little bit of python I cooked up myself and it gives me no end of grief when scaled. So many awful edge cases.
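To make the "awful edge cases" concrete, here is a minimal sketch comparing a hand-rolled host extraction with Python's standard urllib.parse. The function names naive_host and parsed_host are made up for illustration, and this is not the commenter's actual script; urllib.parse has quirks of its own, so treat it as an illustration rather than a reference parser.

```python
from urllib.parse import urlsplit


def naive_host(url):
    # Hypothetical hand-rolled approach: take whatever sits between
    # the "//" and the next "/". Credentials, ports and IPv6 brackets
    # all leak into the result.
    return url.split("/")[2]


def parsed_host(url):
    # urlsplit separates userinfo and port from the host, lowercases
    # the hostname, and strips the brackets around IPv6 literals.
    return urlsplit(url).hostname


if __name__ == "__main__":
    for url in (
        "https://example.com/a",
        "https://user:pw@Example.com:8443/a",
        "http://[::1]:8080/",
    ):
        print(url, "|", naive_host(url), "|", parsed_host(url))
```

The two functions agree on the first URL and disagree on the other two, which is roughly how these scripts tend to break once real-world input shows up.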
The author wrote curl, so I know it's going to do what it says it does well.
That's why it's getting love. It's a rock star dev putting out open source code many of us will absolutely be using regularly.
Way classier than yet another "product" built on the OpenAI API that will be gone in a year.
> URLs are tricky to parse and there are numerous security problems in software because of this. trurl wants to help soften this problem by taking away the need for script and command line authors everywhere to re-invent the wheel over and over.
When I was building my CI jobs at $job I needed url manipulation in shell. Had to use python inline, but it was long and ugly... trurl simplified it a bit.
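For anyone who hasn't had to do this, the kind of Python that ends up inlined in a shell script looks roughly like the sketch below. It is only a guess at the general shape, not the commenter's actual code; set_query_param and the example URL are invented for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_query_param(url, key, value):
    """Return url with query parameter `key` set to `value`."""
    parts = urlsplit(url)
    # Drop any existing occurrences of the key, keep everything else
    # (including parameters with empty values) in the original order.
    pairs = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != key
    ]
    pairs.append((key, value))
    # urlencode percent-encodes the values, so "/" becomes "%2F".
    return urlunsplit(parts._replace(query=urlencode(pairs)))


if __name__ == "__main__":
    print(set_query_param(
        "https://example.com/build?job=42&ref=main", "ref", "release/1.0"
    ))
```

Squeezing even this much into a python -c one-liner inside a CI step is where the "long and ugly" part comes from.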
Pretty dang cool, but I might have missed a feature: can trurl do multiple manipulations on a single url, as opposed to a single manipulation on multiple urls (which the blog post says is supported)?
For example, you want to apply a sequence of predefined normalization steps, such as removing the user part, converting http to https, etc etc. You put these steps in a file and then invoke trurl with that file and pointed at your url or urls. Very much like what you would do in sed, say. Possible?
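To illustrate the idea (a list of predefined steps applied in order to each URL), here is a rough Python sketch using urllib.parse. It says nothing about whether or how trurl supports this; the step functions drop_userinfo and force_https and the STEPS list are hypothetical names chosen for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def drop_userinfo(parts):
    # Keep only host[:port], discarding any "user:password@" prefix.
    host = parts.netloc.rpartition("@")[2]
    return parts._replace(netloc=host)


def force_https(parts):
    # Upgrade plain http; leave every other scheme untouched.
    if parts.scheme == "http":
        return parts._replace(scheme="https")
    return parts


# The "file of steps" from the question, expressed as an ordered list.
STEPS = [drop_userinfo, force_https]


def normalize(url, steps=STEPS):
    parts = urlsplit(url)
    for step in steps:
        parts = step(parts)
    return urlunsplit(parts)


if __name__ == "__main__":
    for url in (
        "http://alice:secret@example.com:8080/path?q=1#frag",
        "ftp://example.com/file",
    ):
        print(normalize(url))
```

Each step takes and returns the parsed parts, so the URL is parsed once and reassembled once, much like a sed script runs all of its commands over each input line.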
Trurl is the name of a character in the science fiction short story collection "The Cyberiad" by Polish author Stanisław Lem. Trurl is a highly intelligent robot and inventor who, along with his friend Klapaucius, goes on various adventures throughout the universe.
— via CGPT
A great book that prefigured the kinds of interactions people are now having with ChatGPT. For example:
“Have it compose a poem — a poem about a haircut! But lofty, noble, tragic, timeless, full of love, treachery, retribution, quiet heroism in the face of certain doom! Six lines, cleverly rhymed, and every word beginning with the letter S!!”
Stanisław Lem’s "Cyberiad" is a great book that predicted machine learning and various modern technologies, although credit for this particular poem should probably go to Michael Kandel for his brilliant English translation. It's very different in other languages:
I’m curious about why people dislike this question. This is a specialized library; it should be testing everything including crazy edge cases that most of us haven’t even considered. Also, someone stated that there have been several URL-related security vulnerabilities in curl.