

	by Kwpolska 90 days ago
	What is it about Python that makes developers love fragmentation so much? Sending HTTP requests is a basic capability in the modern world, the standard library should include a friendly, fully-featured, battle-tested, async-ready client. But not in Python, stdlib only has the ugly urllib.request, and everyone is using third party stuff like requests or httpx, which aren't always well maintained. (See also: packaging)

13 comments

dirkc 90 days ago

You would think that sending HTTP requests is a basic capability, but I've had fun in many languages doing so. Long ago (2020, or not so long ago, depending on how you look at it) I was surprised that doing an HTTP request on node using no dependencies was a little awkward:

  import https from "node:https";

  // `url` is assumed to be defined by the caller.
  const response = await new Promise((resolve, reject) => {
    const req = https.request(url, {}, res => {
      let body = "";
      res.on("data", data => {
        body += data;
      });
      res.on("end", () => {
        resolve(body);
      });
    });
    req.end();
  });

wging 90 days ago

These days node supports the fetch API, which is much simpler. (It wasn't there in 2020, it seems to have been added around 2022-2023.)

dirkc 90 days ago

Yes, thankfully! It's amusing to read what they say about fetch on nodejs.org [1]:

> Undici is an HTTP client library that powers the fetch API in Node.js. It was written from scratch and does not rely on the built-in HTTP client in Node.js. It includes a number of features that make it a good choice for high-performance applications.

[1] - https://nodejs.org/en/learn/getting-started/fetch

Pay08 90 days ago

Why is it amusing?

dirkc 89 days ago

I say amusing because it points out that something I (and many other people) assume to be basic clearly has a lot more nuance to it.

b450 90 days ago

Note that node-fetch will silently ignore any overrides to "forbidden" request headers like Host, since it's designed for parity with fetch behavior in the browser. This caused a minor debugging headache for me once.

rzmmm 90 days ago

Web standards have rich support for incremental/chunked payloads, the original node APIs are designed around it. From this lens the Node APIs make sense.

simlevesque 90 days ago

And you don't handle errors at all...

niekiepriekie 89 days ago

It's node. Why do errors if you can simply ignore them?

dirkc 89 days ago

Left as an exercise for the reader... ;p

ivanjermakov 90 days ago

HTTP client is at the intersection of "necessary software building block" and "RFC 2616 intricacies that are hard to implement". Has nothing to do with Python really.

maccard 90 days ago

> Then I found out it was broken. I contributed a fix. The fix was ignored and there was never any release since November 2024.

This seems like a pretty good reason to fork to me.

> Sending HTTP requests is a basic capability in the modern world, the standard library should include a friendly, fully-featured, battle-tested, async-ready client. But not in Python,

Or JavaScript (well, Node), or golang (net/http is _worse_ than urllib IMO), Rust, Java (HttpURLConnection is the same as Python's), even dotnet's HttpClient is... fine.

Honestly the thing that consistently surprises me is that requests hasn't been standardised and brought into the standard library.

francislavoie 90 days ago

What, Go's net/http is fantastic. I don't understand that take. Many servers are built on it because it's so fully featured out of the box.

maccard 90 days ago

The server side is great. Sending an HTTP request is… not.

lenkite 90 days ago

Your java knowledge is outdated. Java's JDK has a nice, modern HTTP Client https://docs.oracle.com/en/java/javase/11/docs/api/java.net....

ffsm8 90 days ago

Ahh, java. You never change, even if you're modern

    HttpClient client = HttpClient.newBuilder()
        .version(Version.HTTP_1_1)
        .followRedirects(Redirect.NORMAL)
        .connectTimeout(Duration.ofSeconds(20))
        .proxy(ProxySelector.of(
           new InetSocketAddress("proxy.example.com", 80)
        ))
        .authenticator(Authenticator.getDefault())
        .build();

    HttpResponse<String> response = client.send(request, BodyHandlers.ofString());

    System.out.println(response.statusCode());
    System.out.println(response.body());

For the record, you're most likely not even interacting with that API directly if you're using any current framework, because most just provide automagically generated clients and you only define the interface with some annotations

awkwardpotato 90 days ago

What's the matter with this? It's a clean builder pattern, the response is returned directly from send. I've certainly seen uglier Java

freedomben 90 days ago

Just my opinion of course, but:

> What's the matter with this?

To me what makes this very "Java" is the arguments being passed, and all the OOP stuff that isn't providing any benefit and isn't really modeling real-world-ish objects (which IMHO is where OOP shines). .version(Version.HTTP_1_1) and .followRedirects(Redirect.NORMAL) I can sort of accept, but it requires knowing what class and value to pass, which is lookups/documentation reference. These are spread out over a bunch of classes. But we start getting so "Java" with the next ones. .connectTimeout(Duration.ofSeconds(20)) (why can't I just pass 20 or 20_000 or something? Do we really need another class and method here?) .proxy(ProxySelector.of(new InetSocketAddress("proxy.example.com", 80))), geez that's complex. .authenticator(Authenticator.getDefault()), why not just pass bearer token or something? Now I have to look up this Authenticator class, initialize it, figure out where it's getting the credentials, how it's inserting them, how I put the credentials in the right place, etc. The important details are hidden/obscured behind needless abstraction layers IMHO.

I think Java is a good language, but most modern Java patterns can get ludicrous with the abstractions. When I was writing lots of Java, I was constantly setting up an ncat listener to hit so I could see what it was actually writing, and then had to hunt down where a certain thing was being done and figure out the right way to get it to behave correctly. Contrast with a typical TypeScript HTTP request, where you can mostly tell just from reading the snippet what the actual HTTP request is going to look like.

looperhacks 90 days ago

> but it requires knowing what class and value to pass

Unless you use a text editor without any coding capabilities, your IDE should show you which values you can pass. The alternative is to have more methods, I guess?

> why can't I just pass 20 or 20_000 or something

20 what? Milliseconds? Seconds? Minutes? While I wouldn't write the full Duration.ofSeconds(20) (you can save the "Duration."), I don't understand how one could prefer a version that makes you guess the unit.

> proxy(ProxySelector.of(new InetSocketAddress("proxy.example.com", 80))), geez that's complex

Yes it is, can't add anything here. There's a tradeoff between "do the simple thing" and "make all things possible", and Java chooses the second here.

> .authenticator(Authenticator.getDefault()), why not just pass bearer token or something?

Because this Authenticator is meant for prompting a user interactively. I concur that this is very confusing, but if you want a Bearer token, just set the header.

Pay08 90 days ago

> why can't I just pass 20 or 20_000 or something? Do we really need another class and method here?

If you've ever dealt with time, you'll be grateful it's a duration and not some random int.
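The same trade-off shows up in Python's standard library. A minimal sketch (the variable names are illustrative, not taken from any comment here): a `datetime.timedelta` carries its unit with it, while timeout parameters such as the one on `urllib.request.urlopen` take a bare number that you simply have to know means seconds.

```python
from datetime import timedelta

# The unit is part of the value, so there is nothing to guess.
connect_timeout = timedelta(seconds=20)

# urllib.request.urlopen(url, timeout=...) and socket timeouts expect
# a plain number of seconds, so the conversion happens at the call site.
timeout_seconds = connect_timeout.total_seconds()

# The same duration for a hypothetical API that wants whole milliseconds.
timeout_millis = connect_timeout // timedelta(milliseconds=1)

print(timeout_seconds, timeout_millis)
```

Passing a bare `20` to two such APIs would mean 20 seconds in one and 20 milliseconds in the other, which is the ambiguity the `Duration` type removes.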

colejohnson66 90 days ago

The boilerplate of not having sane defaults. .NET is much simpler:

    using HttpClient client = new();
    HttpResponseMessage response = await client.GetAsync("https://...");
    if (response.StatusCode is HttpStatusCode.OK)
    {
        string s = await response.Content.ReadAsStringAsync();
        // ...
    }

lmz 90 days ago

That's just an example. It does have defaults: https://docs.oracle.com/en/java/javase/11/docs/api/java.net.... (search for "If this method is not invoked")

pjmlp 90 days ago

Yeah, so much simpler,

"Common IHttpClientFactory usage issues"

https://learn.microsoft.com/en-us/dotnet/core/extensions/htt...

"Guidelines for using HttpClient"

https://learn.microsoft.com/en-us/dotnet/fundamentals/networ...

And this doesn't account for all the gotchas per .NET version that only us old-timers remember to cross-check.

PxldLtd 90 days ago

Yeah this is all over Rust codebases too for good reason. The argument is that default params obfuscate behaviour and passing in a struct (in Rust) with defaults kneecaps your ability to validate parameters at compile time.

Pay08 90 days ago

It does have defaults, the above example manually sets everything to show people reading the docs what that looks like.

zahlman 89 days ago

> What's the matter with this? It's a clean builder pattern

I feel like you answered yourself. Java makes you do this by not supporting proper keyword arguments.

lenkite 90 days ago

Your http client setup is over-complicated. You certainly don't need `.proxy` if you are not using a proxy or if you are using the system default proxy, nor do you need `.authenticator` if you are not doing HTTP authentication. Nor do you need `version` since there is already a fallback to HTTP/1.1.

  HttpClient client = HttpClient.newBuilder()
    .followRedirects(Redirect.NORMAL)
    .connectTimeout(Duration.ofSeconds(20))
    .build();

ffsm8 90 days ago

It was literally just copy pasted from the linked source (the official Oracle docs)

Tostino 90 days ago

And those docs were likely trying to show you how to use multiple features, not the most basic implementation of it

umvi 90 days ago

What's wrong with Go's? I've never had any issues with it. Go has some of the best http batteries included of any language

jerf 89 days ago

Go's net/http Client is built for functionality and complete support of the protocol, including even such corner cases as support for trailer headers: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/... Which for a lot of people reading this message is probably the first time they've heard of this.

It is not built for convenience. It has no methods for simply posting JSON, or marshaling a JSON response from a body automatically, no "fluent" interface, no automatic method for dealing with querystring parameters in a URL, no direct integration with any particular authentication/authorization scheme (other than Basic Authentication, which is part of the protocol). It only accepts streams for request bodies and only yields streams for response bodies, and while this is absolutely correct for a low-level library and any "request" library that mandates strings with no ability to stream in either direction is objectively wrong, it is a rather nice feature to have available when you know the request or response is going to be small. And so on and so on.

There's a lot of libraries you can grab that will fix this, if you care, everything from clones of the request library, to libraries designed explicitly to handle scraping cases, and so on. And that is in some sense also exactly why the net/http client is designed the way it is. It's designed to be in the standard library, where it can be indefinitely supported because it just reflects the protocol as directly as possible, and whatever whims of fate or fashion roll through the developer community as to the best way to make web requests may be now or in the future, those things can build on the solid foundation of net/http's Request and Response values.

Python is in fact a pretty good demonstration of the risks of trying to go too "high level" in such a client in the standard library.

Orygin 90 days ago

I guess he never used Fiber's APIs lol

The stdlib may not be the best, but the fact all HTTP libs that matter are compatible with net/http is great for DX and the ecosystem at large.

maccard 90 days ago

The comment I replied to was talking about sending HTTP requests. Go’s server-side net/http is excellent; the client side is clunky, verbose, and suffers from many of the problems that Python’s urllib does.
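For readers who have not used it, here is a rough sketch of what the urllib side of that comparison involves for a JSON POST with query parameters. The URL, parameters, and the helper name `build_json_post` are made up for illustration, and nothing is actually sent over the network.

```python
import json
import urllib.parse
import urllib.request


def build_json_post(base_url, params, payload):
    """Assemble a JSON POST by hand; urllib has no json= or params= shortcut."""
    query = urllib.parse.urlencode(params)       # query string is built manually
    body = json.dumps(payload).encode("utf-8")   # body must already be bytes
    return urllib.request.Request(
        f"{base_url}?{query}",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_json_post(
    "https://api.example.com/items",  # hypothetical endpoint
    {"page": 2, "tag": "http client"},
    {"name": "widget"},
)

print(request.get_method(), request.full_url)
# Header names are stored capitalized, so lookups need "Content-type".
print(request.get_header("Content-type"))
# Sending it would be: urllib.request.urlopen(request, timeout=10)
```

Decoding the response is equally manual: read the bytes, decode them, and call `json.loads` yourself.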

localuser13 90 days ago

>Honestly the thing that consistently surprises me is that requests hasn't been standardised and brought into the standard library

Instead, official documentation seems comfortable with recommending a third party package: https://docs.python.org/3/library/urllib.request.html#module...

>The Requests package is recommended for a higher-level HTTP client interface.

Which was fine when requests was the de facto standard and the only player in town, but at some point modern problems (async, http2) required modern solutions (httpx) and thus ecosystem fragmentation began.

Spivak 90 days ago

Well, the reason for all the fragmentation is because the Python stdlib doesn't have the core building blocks for an async http or http2 client in the way requests could build on urllib.

The h11, h2, httpcore stack is probably the closest thing to what the Python stdlib should look like to end the fragmentation but it would be a huge undertaking for the core devs.

zahlman 89 days ago

> but it would be a huge undertaking for the core devs.

More importantly, it would be massively breaking to remove the existing functionality (and everyone would ignore a deprecation), and confusing not to (much like it was when 2.x had both "urllib" and "urllib2").

It'd be nice to have something high level in the standard library based on urllib primitives. Offering competition to those, not so much.

Kwpolska 90 days ago

Node now supports the Fetch API.

pjc50 90 days ago

> dotnet's HttpClient is... fine.

Yes, and it's in the standard library (System namespace). Being Microsoft they've if anything over-featured it.

xnorswap 90 days ago

It's fine but it's sharp-edged, in that it's recommended to use IHttpClientFactory to avoid the dual problem of socket exhaustion (if creating/destroying lots of HttpClients) versus DNS caching outliving DNS (if using a very long-lived singleton HttpClient).

And while this article [1] says "It's been around for a while", it was only added in .NET Framework 4.5, which shows it took a while for the API to stabilise. There were other ways to make web requests before that of course, and also part of the standard library, and it's never been "difficult" to do so, but there is a history prior to HttpClient of changing ways to do requests.

For modern dotnet however it's all pretty much a solved problem, and there's only ever been HttpClient and a fairly consistent story of how to use it.

[1] https://learn.microsoft.com/en-us/dotnet/core/extensions/htt...

pixl97 90 days ago

>"It's been around for a while"

is 14 years not a while?

xnorswap 90 days ago

It is, but it's also a decade after the language was first released.

Kwpolska 89 days ago

Python’s urllib2 (now urllib.request) started out in the year 2000 [0].

.NET’s WebRequest was available in .NET Framework 1.1 in 2003 [1].

But since then, Microsoft noticed the issues with WebRequest and came up with HttpClient in 2012. It has some issues and footguns, like those related to HttpClient lifetime, but it’s a solid library. On the other hand, the requests library for Python started in 2011 [2], but the stdlib library hasn’t seen many improvements.

[0] https://github.com/python/cpython/blob/6d7e47b8ea1b8cf82927d...

[1] https://learn.microsoft.com/en-us/dotnet/api/system.net.webr...

[2] https://github.com/psf/requests/blob/main/HISTORY.md#001-201...

gjvc 90 days ago

requests is some janky layer on top of other janky layers. last thing you want in the stdlib.

it's called the STD lib for a reason...

thedanbob 90 days ago

> Sending HTTP requests is a basic capability in the modern world, the standard library should include a friendly, fully-featured, battle-tested, async-ready client.

I've noticed that many languages struggle with HTTP in the standard library, even if the rest of the stdlib is great. I think it's just difficult to strike the right balance between "easy to use" and "covers every use case", with most erring (justifiably) toward the latter.

tclancy 90 days ago

Don't think it's Python-specific, it's humanity-specific and Python happens to be popular so it happens more often/ more publicly in Python packages.

LtWorf 90 days ago

The basic features of HTTP are easy to implement, but a full implementation that is also efficient is hard.

I've often ended up reimplementing what I need because the APIs of the famous libraries aren't efficient. In general I'd love to send a million requests all in the same packet and get the replies. No need to wait for the first reply to send the 2nd request and so on. They can all be in the same TCP packet, but I have never met a library that lets me do that.

So for example, while http3 should be more efficient and faster, since no library I've tried lets me do this, I ended up using HTTP1.1 as usual and being faster as a result.
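What is described here is HTTP/1.1 pipelining. As a minimal sketch, assuming a local test server from Python's own `http.server` (the handler and the helper names `read_response` and `pipelined_get` are invented for this example), several requests can be written with a single `sendall` and the replies read back in order. Whether the bytes end up in one TCP packet is decided by the kernel, not by this code.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def read_response(reader):
    """Read one response: status line, headers, then Content-Length bytes."""
    status_line = reader.readline().decode("ascii").strip()
    length = 0
    while True:
        line = reader.readline().decode("ascii").strip()
        if not line:  # blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.lower() == "content-length":
            length = int(value)
    return status_line, reader.read(length)


def pipelined_get(host, port, paths):
    """Write every request up front, then read the responses in order."""
    requests = b"".join(
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        for path in paths
    )
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(requests)  # no waiting for the first reply
        with sock.makefile("rb") as reader:
            return [read_response(reader) for _ in paths]


server = HTTPServer(("127.0.0.1", 0), EchoPathHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
results = pipelined_get("127.0.0.1", server.server_port, ["/a", "/b", "/c"])
server.shutdown()
server.server_close()

for status, body in results:
    print(status, body)
```

The sketch only handles responses that carry a `Content-Length`. A real client would also need chunked encoding, error recovery, and a fallback for servers or proxies that do not tolerate pipelining, which may be part of why general-purpose libraries tend not to expose it.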

mesahm 90 days ago

I spent 3 years developing Niquests, and believe me, HTTP is far from easy. Being a client means you have to speak to everyone, and no one has to speak to you (RFCs are nice, but in practice never applied as-is). Once you go deep into the implementation, you'll find a thousand edge cases(...). And yes, the myth among developers that http/1 is "best" only means that the underlying scheduler is weak. Today, via a dead simple script, you'll see http/2+ beat the established giants in the http/1 client landscape. See https://gist.github.com/Ousret/9e99b07e66eec48ccea5811775ec1... if you are curious.

LtWorf 90 days ago

I never said I was using asyncio.

functionmouse 90 days ago

Bram's Law: https://files.catbox.moe/qi5ha9.png

Python makes everything so easy.

fsckboy 90 days ago

converted to text:

I realized this the other day, and dub it Bram's Law -- Bram

Bram's Law

The easier a piece of software is to write, the worse it's implemented in practice. Why? Easy software projects can be done by almost any random person, so they are. It's possible to try to nudge your way into being the standard for an easy thing based on technical merit, but that's rather like trying to become a hollywood star based on talent and hard work. You're much better off trading it all in for a good dose of luck.

This is why HTTP is a mess while transaction engines are rock solid. Almost any programmer can do a mediocre but workable job of extending HTTP, (and boy, have they,) but most people can't write a transaction engine which even functions. The result is that very few transaction engines are written, almost all of them by very good programmers, and the few which aren't up to par tend to be really bad and hardly get used. HTTP, on the other hand, has all kinds of random people hacking on it, as a result of which Python has a 'fully http 1.1 compliant http library' which raises assertion failures during normal operation.

Remember this next time you're cursing some ubiquitous but awful third party library and thinking of writing a replacement. With enough coal, even a large diamond is unlikely to be the first thing picked up. Save your efforts for more difficult problems where you can make a difference. The simple problems will continue to be dealt with incompetently. It sucks, but we'll waste a lot less time if we learn to accept this fact.

woodruffw 90 days ago

AFAICT, lacking a (good) standard HTTP library is kind of the norm in popular languages. Python, Ruby, Rust, etc. all either have a lackluster standard one or are missing one. I think it sits between too many decision pressures for most languages: there are a _lot_ of different RFCs both required and implied, lots of different idioms you could pick for making requests, lots of different places to draw the line on what to support, etc.

The notable exception is Go, which has a fantastic one. But Go is pretty notable for having an incredible standard library in general.

Kwpolska 89 days ago

I thought Rust’s got a very small standard library, only focusing on things that must be in a standard library, mainly primitives or things which require co-operation with the underlying OS (e.g. thread and process management)? That’s completely opposite of Python’s “batteries included” approach.

woodruffw 89 days ago

Sure, I'm not making a categorical argument about big vs. small stdlibs. I'm just noting that "a good default HTTP library" is in fact kind of unusual, whether or not the language is batteries-included.

(As an outsider I had the impression that Go's net/http was good, but a lot of people in this thread are complaining about it as well. So it may be 0-4 instead of 1-3).

Pay08 90 days ago

Is Rust popular? It's popular among HN users, and among certain other bubbles, but can it be called generally popular? Ruby sure can't be.

woodruffw 90 days ago

It's popular enough to be worth using as a datapoint. What's the point of the question?

Pay08 89 days ago

I don't think it is worth using as a datapoint. Webdev is simply not what Rust was made for. It'd be somewhat like PHP having inline assembly.

woodruffw 89 days ago

I don't think this is relevant on three grounds:

1. Whether or not it was "made for" webdev, people do use Rust for that.

2. Plenty of people write networked Rust that interacts with HTTP. That code requires an HTTP stack, even if it isn't web development.

3. Like all of the other examples, Rust does have an excellent third-party HTTP stack (reqwest and its underpinnings). So it's not like Rust fails to do HTTP.

paulddraper 89 days ago

Web browsers -- LIKE THE THINGS THAT LIVE AND DIE ON HTTP -- didn't have an ergonomic HTTP API until 2017.

Node.js got its production version in 2023.

Rust doesn't include an HTTP client at all.

Even among stdlibs that have a client, virtually none support HTTP/3, which is used for 30% of web traffic. [1]

--

HTTP (particularly 2+) is a complex protocol, with no single correct answers for high-level and low-level needs.

[1] https://radar.cloudflare.com/adoption-and-usage

matheusmoreira 90 days ago

Everybody's got a different idea of what it means for a library to be "friendly" and "fully-featured" though. It's probably better to keep the standard library as minimal as possible in order to avoid enshrining bad software. Programming languages could have curated "standard distributions" instead that include all the commonly used "best practice" libraries at the time.

duskdozer 90 days ago

https://xkcd.com/927/

zahlman 89 days ago

That isn't really what was proposed, and is an unnecessarily snarky way to respond.

duskdozer 88 days ago

Sorry, snark not intended.

matheusmoreira 90 days ago

That situation should be avoided. People should have to create their own libraries until everyone empirically converges into a de facto standard that can then be made official.

BigTTYGothGF 90 days ago

I think the python maintainers are still feeling burnt by the consequences of the "batteries included" approach from the old times.

yoyohello13 90 days ago

Most Python developers these days weren't even programming when the 2 -> 3 split happened. Unless you're referencing something else.

zahlman 89 days ago

There are quite a few old hands among Python core devs. Certainly the culture of that burnout is in place, if you look at the responses that proposals for new standard library additions get these days. There also seems to be a lot of trauma from the loud complaints about backward compatibility breaks.

I still hear people complain about how such and such removal between "minor versions" of Python 3 (you really should be thinking of them as major versions nowadays — "Python 3 is the brand", the saying goes now), where they were warned like two years in advance about individual functions, supposedly caused a huge problem for them. It's hard for me to reconcile with the rhetoric I've heard in internal discussions; they're so worried in general about possible theoretical compatibility breaks that it seems impossible to change anything.

denimnerd42 90 days ago

the batteries included approach is the stdlib that can do everything. turns out it’s hard to maintain and make good.

yoyohello13 90 days ago

Yeah that's true. Go seems to be handling the 'fat stdlib' approach pretty well though. I really don't want Python to go the path of Rust where nothing is included.

denimnerd42 89 days ago

I feel like Java does it the best. Golang didn't start with generics so it's a bit odd IMO.

WhyNotHugo 90 days ago

httpx has async support (much like aiohttp), whereas urllib is blocking-only. If you need to make N concurrent requests, urllib requires N threads or processes.
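As a rough illustration of the thread-per-request pattern with urllib, here is a sketch that runs against a throwaway local server so it is self-contained. The handler, the `fetch` helper, and the choice of four requests are all assumptions made for the example.

```python
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = f"hello {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


# Ignore proxy environment variables so the demo talks straight to localhost.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url):
    # This call blocks its thread until the whole response has arrived.
    with opener.open(url, timeout=5) as response:
        return response.read().decode("utf-8")


server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: free port
threading.Thread(target=server.serve_forever, daemon=True).start()
urls = [f"http://127.0.0.1:{server.server_port}/{n}" for n in range(4)]

# One worker thread per in-flight request; map() keeps the input order.
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    bodies = list(pool.map(fetch, urls))

server.shutdown()
server.server_close()
print(bodies)
```

An async client gets the same concurrency from a single thread by suspending each request while it waits on the socket, which is the capability urllib does not offer.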

kurtis_reed 90 days ago

Python doesn't have a big company behind it