Python urllib CRLF injection vulnerability | HN Mirror


	Python urllib CRLF injection vulnerability (coocoor.com)
	94 points by robin0 2653 days ago

9 comments

kccqzy 2653 days ago

This is far from uncommon. Back at DEF CON 2017, Orange Tsai gave a talk about inconsistencies between URL parsing libraries in different languages. The opening example was a single URL that had a different hostname when parsed by urllib, urllib2, and requests. He also demoed using unusual characters like spaces and newlines to talk to Redis or SMTP while pretending to be HTTP.

Slides: https://media.defcon.org/DEF%20CON%2025/DEF%20CON%2025%20pre...
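To make the failure mode concrete, here is a minimal sketch of how CR/LF characters in a URL path can turn into extra header lines. It assumes a hypothetical client that builds the request by plain string formatting; `build_request` is an illustrative helper, not the actual urllib code.

```python
def build_request(host: str, path: str) -> str:
    """Naively format an HTTP/1.1 request without validating `path` (hypothetical)."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


# Attacker-controlled path carrying an embedded CRLF and an extra header.
evil_path = "/?q=1 HTTP/1.1\r\nX-Injected: yes\r\nX-Rest:"
request_lines = build_request("example.com", evil_path).split("\r\n")

# Each printed line is what a server would see as a separate line of the request.
for line in request_lines:
    print(repr(line))
```

The server sees `X-Injected: yes` as a header of its own, even though the caller only supplied a "path". The same trick is what lets a line-based protocol such as Redis or SMTP read attacker-chosen commands out of something that was meant to be an HTTP request.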

jetru 2653 days ago

Orange actually reported this bug to urllib. The ticket in the HN link is a duplicate of Orange's original finding.

strictnein 2653 days ago

Man, that's a really good presentation.

haikuginger 2653 days ago

Python urllib3 maintainer here. In December, urllib3 made a change to be more RFC-compliant, which fixed this issue, but that change has not been released yet. We are in the process of looking into that.

I have verified that Requests, which uses us, appears to have had its own handling since at least Requests 2.0 (released in 2013), and that handling prevents this when Requests is used directly as an abstraction layer on top of urllib3.

aldoushuxley001 2653 days ago

Interesting. I was recently debating whether to use Requests or just urllib3 directly. Figured I'd minimize dependencies by just using urllib3 but didn't think it might actually be more secure to use Requests. Great work btw!

diminoten 2653 days ago

What is your use case that would make minimizing dependencies to this extreme a valuable activity?

aldoushuxley001 2653 days ago

I was just using urllib3 to post a form on another website and get the resulting html page, then parsed it with BeautifulSoup.

Since it was just a one-off use case and ultimately very simple, I didn't see the need for any more functionality. Why bother with the extra packages? Or do you think it's still worthwhile to use Requests? Is it not just unnecessary bloat that might slow runtime?

diminoten 2653 days ago

There's a lot to unpack in your comment, but I'll just work with the most easily verifiable thing for you: what was the response time of the resource you were querying with urllib3, and do you think using requests instead of urllib3 directly would change the runtime by an order of magnitude (or two) in either direction?

aldoushuxley001 2653 days ago

I admittedly didn't test the response times between the two, but just felt adding additional dependencies was unnecessary. I don't realistically expect the speed to be too different between the two, but the less I have to rely on external libraries the better. If I can get the job done with urllib3 why use Requests?

Though admittedly, after reading OP's statement, I see that Requests might actually have some extra security that urllib3 alone might not have. But barring security improvements or the need for extra features that Requests has, it seems like using Requests for my use case would be adding unnecessary complexity.

aroch 2653 days ago

Not the OP, but deploying code on government servers that interact with the public web means that minimizing the required modules saves you piles of paperwork and meetings. I'd rather spend the extra two days writing/testing my own code than filling out paperwork and waiting weeks.

diminoten 2653 days ago

Fair enough in general, but IMO Requests really is worth it.

cbsks 2653 days ago

The link should probably be changed to the actual bug: https://bugs.python.org/issue36276

tyingq 2653 days ago

Which appears to be a duplicate of another bug filed in 2017: https://bugs.python.org/issue30458

cbsks 2653 days ago

I just noticed that. I guess they didn't think it was actually an exploitable bug?

Edit: this bug sat around for almost 2 years. It will be interesting to see if it gets fixed now that it is getting attention on Hacker News.

jaybosamiya 2653 days ago

Relevant (and super cool) previous work, done by Orange Tsai: https://www.blackhat.com/docs/us-17/thursday/us-17-Tsai-A-Ne...

1wd 2653 days ago

Python 3 urllib and other stdlib protocol modules also use `splitlines`, which splits on various Unicode "newlines". Could that also be exploitable somehow? https://discuss.python.org/t/changing-str-splitlines-to-matc...
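For anyone who hasn't run into this, a quick sketch of the `splitlines` behaviour in question. The sample string is made up for illustration. Whether any stdlib call site is actually reachable with attacker-controlled text is a separate question that this does not answer.

```python
# Vertical tab, form feed, NEL (U+0085) and LINE SEPARATOR (U+2028):
# none of these is "\n" or "\r", yet str.splitlines() treats each as a line break.
text = "a\x0bb\x0cc\x85d\u2028e"

str_parts = text.splitlines()                    # str: splits on all four
bytes_parts = text.encode("utf-8").splitlines()  # bytes: only \n, \r and \r\n

print(str_parts)
print(bytes_parts)
```

So the same data can be one line or five depending on whether it is split as `str` or as `bytes`, which is the kind of parser disagreement that tends to become a bug.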

peterwwillis 2653 days ago

Key takeaway: don't expect a library to do the safe thing; always sanitize all your input. (If your language supports taint mode, enabling it can prevent these bugs)
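A minimal sketch of what that sanitizing could look like for URLs, assuming you are fine with rejecting (rather than escaping) anything suspicious. `require_clean_url` is a hypothetical helper name, and the character set is a conservative assumption, not something taken from the bug report.

```python
import re

# ASCII control characters (including CR and LF), space, and DEL.
_FORBIDDEN = re.compile(r"[\x00-\x20\x7f]")


def require_clean_url(url: str) -> str:
    """Return `url` unchanged, or raise ValueError if it contains a forbidden character."""
    match = _FORBIDDEN.search(url)
    if match:
        raise ValueError(
            f"forbidden character {match.group()!r} at index {match.start()}"
        )
    return url


print(require_clean_url("http://example.com/a?b=c"))

try:
    require_clean_url("http://example.com/\r\nX-Injected: yes")
except ValueError as exc:
    print("rejected:", exc)
```

Rejecting is deliberately blunt: a legitimate URL should already have spaces and control characters percent-encoded, so anything that trips this check is either a bug upstream or an attack.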

anaphor 2653 days ago

Does anyone know if this also affects the Requests library? Does it use these under the hood, or is it all httplib? (I'm pretty sure that's the case)

acdha 2653 days ago

requests uses urllib3:

https://github.com/kennethreitz/requests/blob/75bdc998e2d430...

hannob opened an issue asking about this:

https://github.com/urllib3/urllib3/issues/1553

hannob 2653 days ago

It probably does. Requests is built on top of urllib3, and the bug report mentions that urllib3 is affected as well.

fireattack 2653 days ago

Are urllib and urllib3 the same thing?

jwandborg 2653 days ago

No, urllib is a standard library module in Python 3. urllib3 is a 3rd-party package. See also https://news.ycombinator.com/item?id=19423367

vldo 2653 days ago

Seems like an ad for coocoor.

Actual CVE entry: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-9740

hannob 2653 days ago

Probably worth checking other implementations. The comments already mention that urllib3 is affected as well.

cbsks 2653 days ago

I wouldn't be surprised if there are other libraries in other languages that also have the same bug.

Go had the same bug, which was fixed in this commit: https://github.com/golang/go/commit/829c5df58694b3345cb5ea41...

wereHamster 2653 days ago

Why does Python need three different versions of urllib?

kevin_thibedeau 2653 days ago

urllib2 introduced breaking changes relative to urllib, so it was added as a new library to preserve the functionality of the old one. urllib3 also has breaking changes, but it purposely doesn't live in the standard library so it can be changed more readily.

aftbit 2653 days ago

urllib and urllib2 are in the stdlib of Python 2, and neither of them has a very friendly interface. They have been consolidated into just urllib in Python 3.

urllib3 is a 3rd-party library that powers requests. It tries to offer a more powerful set of features behind a better interface.