by haikuginger 2642 days ago
Python urllib3 maintainer here. In December, urllib3 made a change to be more RFC-compliant, which also fixed this issue, but that change has not been released yet. We are looking into that.

I have verified that Requests, which uses us, appears to have its own handling, going back at least to Requests 2.0 (released in 2013), that prevents this when Requests is used as an abstraction layer on top of urllib3.


Interesting. I was recently debating whether to use Requests or just urllib3 directly. Figured I'd minimize dependencies by just using urllib3 but didn't think it might actually be more secure to use Requests. Great work btw!
What is your use case that would make minimizing dependencies to this extreme a valuable activity?
I was just using urllib3 to post a form on another website and get the resulting html page, then parsed it with BeautifulSoup.

Since it was just a one-off use case and ultimately very simple, I didn't see the need for any more functionality. Why bother with the extra packages? Or do you think it's still worthwhile to use Requests? Is it not just unnecessary bloat that might slow down the runtime?
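
For reference, the workflow described above (POST a form with urllib3, then parse the returned HTML with BeautifulSoup) can be sketched roughly as follows. This is only an illustration, not the commenter's actual code: the function names `post_form` and `page_title`, the optional `http` parameter, and the choice to pull out the page title are all assumptions made for the example.

```python
def post_form(url, fields, http=None):
    """POST form fields with urllib3 and return the response body as text.

    `http` is a hypothetical injection point so that a shared PoolManager
    (or a stand-in object) can be supplied by the caller.
    """
    if http is None:
        # Third-party package; imported here so the sketch loads without it.
        import urllib3
        http = urllib3.PoolManager()
    # encode_multipart=False sends application/x-www-form-urlencoded,
    # which is what a plain HTML form submits by default.
    resp = http.request("POST", url, fields=fields, encode_multipart=False)
    if resp.status >= 400:
        raise RuntimeError(f"POST {url} failed with HTTP {resp.status}")
    # Assumes the page is UTF-8; a real script might check Content-Type.
    return resp.data.decode("utf-8", errors="replace")


def page_title(html):
    """Return the text of the <title> element, or None if there is none."""
    from bs4 import BeautifulSoup  # third-party package (bs4)
    soup = BeautifulSoup(html, "html.parser")
    return soup.title.get_text(strip=True) if soup.title else None
```

With Requests, the fetch step would shrink to something like `requests.post(url, data=fields).text`, which is the convenience the replies below are arguing about.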

There's a lot to unpack in your comment, but I'll just work with the most easily verifiable thing for you: what was the response time of the resource you were querying with urllib3, and do you think using Requests instead of urllib3 directly would change the runtime by an order of magnitude (or two)?
I admittedly didn't test the response times between the two, but just felt adding additional dependencies was unnecessary. I don't realistically expect the speed to be too different between the two, but the less I have to rely on external libraries the better. If I can get the job done with urllib3 why use Requests?

Though admittedly, after reading OP's statement, I see that Requests might actually have some extra security that urllib3 alone might not have. But barring security improvements or the need for extra features that Requests has, it seems like using Requests for my use case would be adding unnecessary complexity.

> but just felt adding additional dependencies was unnecessary

This notion, especially in Python and HTTP client programming, is wrong and will cost you many, many more hours than it will save you.

Requests is an entire order of magnitude easier to use than urllib3, and while we may be dealing in minutes for this specific scenario, you will make up for any time investment you pay to learn Requests the very next time you need to do HTTP related work in the language.

It's a matter of not overreacting to a cost: you're paying far more than you should for a much smaller gain than you could get by paying that cost elsewhere (by learning and using Requests, and learning how to manage dependencies in Python, which you have to do anyway with bs4).

Not the OP, but deploying code on government servers that interact with the public web means that minimizing the required modules saves you piles of paperwork and meetings. I'd rather spend the extra two days writing/testing my own code than filling out paperwork and waiting weeks.
Fair enough in general, but IMO Requests really is worth it.