by agolio 1272 days ago
> Python’s documentation sucks

I can't agree with this. I have always had a very good experience with python documentation. One can use the built-in "help" function which works seamlessly with the docstring feature of the language. The complaint in the blog post seems to refer to the UI of the website missing a table of contents for functions. Yeah sure they could add that but I don't see it as a big point.
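A minimal sketch of that workflow, assuming a made-up function `moving_average` (it is not from the blog post): the docstring written in the source is the text `help()` shows, and it can also be read programmatically.

```python
import inspect
import pydoc


def moving_average(values, window):
    """Return the simple moving averages of `values` over `window` items.

    Hypothetical helper used only to illustrate docstrings.
    """
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# help(moving_average) pages this same text in an interactive session;
# pydoc.render_doc returns it as a string instead of printing it.
doc_text = pydoc.render_doc(moving_average, renderer=pydoc.plaintext)

# inspect.getdoc returns the cleaned-up docstring; take its first line.
summary = inspect.getdoc(moving_average).splitlines()[0]
print(summary)
```

In a REPL the equivalent is simply `help(moving_average)`.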

> Python’s package management sucks

Hmm, it has some weaknesses but I wouldn't really say it sucks. Going deeper ->

> Every project seems to use a different tool and it’s a massive headache. Off the top of my head there’s ...

Don't all of these use pip under the hood? I personally use the lower-level pip and virtualenv tools, but some others enjoy the convenience of poetry. That's a bit of personal preference. It's a bit more akin to an IDE choice than a feature of the language. None of conda, poetry, etc. are core Python features.

> Python’s standard library sucks

I have to disagree again, I think it is pretty well designed and minimal on purpose. The community additions of numpy, etc. are by design not part of the core language to reduce bloat.

> Python is slow

OK sure, it is slower than compiled languages like C++, that is a concession we make when opting for the ease of readability, writability, usability, etc.

> Python is huge, the python:3.9-slim Docker image is 118MB

Hmmm. 118MB isn't really that big anymore. The docker image would presumably be cached and reused in a deployment pipeline.

> Python syntax sucks.

This one I can't even understand the reasoning for. The python syntax is what people love about it. I personally dislike the walrus operator, f-strings, and some of the other newer features (did they add the switch statement yet?) but those are my only gripes. And they are more pet-peeves / personal preferences than complaints.

9 comments

I have to agree. I have a lot of gripes with Python, but the documentation and the standard library are generally great.

The author favors Javascript while deriding Python for its syntax, type-hints, and standard library, and favors Go while deriding a Python Docker image's size and documentation. I feel like the author must use Python in a very different way than I do for Javascript and Go to be the winners in these categories.

I do agree without reservation that package management and dependencies are horrible with Python.

> I have a lot of gripes with Python, but the documentation and the standard library are generally great.

It depends what you're comparing them against. Python's style of documentation - for the language itself and for many of the popular libraries that follow the same style - is mostly reference material and often incomplete. It's very lacking in examples of usage. It almost completely ignores types. It often doesn't appear in search engine results for relevant keywords leading to spending several minutes brute force searching the official docs site to find something that should have been a 10 second search. Perhaps the most obvious comparison is with the JavaScript/TypeScript world, which has embraced both types and different kinds of documentation and as a result gives a much better developer experience in those areas today.

The standard library in Python is strange because it has a lot of content but much of that content just isn't very good. Entire packages in the standard library are largely ignored in favour of some de facto standard package from PyPI that does the same job much better. Some of the packages for working with different protocols and file formats are useful in the right circumstances but they're so slow that they're not suitable for many applications and again you end up pulling in a better alternative. Meanwhile common data structures and algorithms that you might look for in any modern language's toolbox are scarce to non-existent and the ones that do exist don't always compose easily.

It definitely varies from library to library. The urllib.request example provided is pretty horrible, and it's simply just been in a blind spot in my eye.

I also agree with you on typing, searchability, and the problem of standards vs de facto standards. (The page for `urllib.request` points you to `requests`, which is good! But the page for `array` has a link to `numpy` only at the very bottom.) I also don't have the JS/TS experience to compare against, but I believe you there. (The MDN alone is excellent.)

I think my experience is biased from spending a lot of time in deep learning. (Keras and Pytorch have a bit of a competition for having good docs). The adjacent libraries, like Numpy or Pandas or stdlib like `socket`, have also been good in my experience. (Perhaps these benefit from having relatively 'obvious' types for most functions. One might infer the types and dimensions of `numpy.matmul(x1, x2)` easily, whereas I have no idea what the types of the args in `Request.set_proxy(host, type)` are.)
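For what it's worth, a small experiment can answer the `set_proxy` question without any network access. This is a sketch based on how CPython's `urllib.request.Request` behaves for a plain `http` URL; the proxy address is a made-up placeholder. Both arguments turn out to be strings: `host` is `"hostname:port"` and `type` is the URL scheme.

```python
from urllib.request import Request

# proxy.example:8080 is a placeholder proxy address, not a real server.
req = Request("http://example.com/index.html")
req.set_proxy("proxy.example:8080", "http")

# Both arguments are plain strings: host is "hostname:port",
# type is the scheme used to talk to the proxy.
print(req.host)      # the proxy replaces the original host
print(req.type)      # the scheme passed as `type`
print(req.selector)  # the original full URL, used as the request target
```

None of this is obvious from the signature alone, which is the point being made about missing type information.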

It's a shame this post was flagged, because I've had a lot of blind spots uncovered in this thread!

I, a Python dev with a decade of experience, sometimes still need half a day to figure out some weird dependency/venv/import-path issue. This happens often.

It takes me half a day to figure out a reliable way to cross-compile Rust binaries for Raspberry Pis, without ever having done this before.

There are a lot of good things about python but the dev environment sucks a lot and it is so engrained in the very substance of the language/tooling that I don't really see a path out of that other than going all Python 4.0 on it and repeating the python2/python3 schism all over again.

Python is still my goto language but it is my goto language despite the tooling, not because of it.

Python's documentation is good, but it's a departure from expectations if you're used to Javadoc-style documentation of APIs. You can get just that by providing good docstrings in your library's code or using documentation generators like Sphinx and whatnot.

Agree with the rest of your points, though, except not liking newer syntax. Newer syntax, in my experience, makes writing code less of a chore and mirrors the developer conveniences in other modern languages.

> OK sure, it is slower than compiled languages like C++, that is a concession we make when opting for the ease of readability, writability, usability, etc.

Does not being compiled really help with readability? How? After all, one can compile python to machine code, and there are C++ interpreters [1] (I have not heard any claims that using it makes C++ more readable). Then there are very readable/usable languages such as Haskell that come out of the box with both a compiler and interpreter.

To be more specific: which features does the absence of the possibility of compilation [2] enable?

[1] https://root.cern/cling/

[2] Since interpretation and compilation are not mutually-exclusive for a language.

You are quite right it is not just compilation that sets Python apart from C++, that was a bit of a simplification on my part.

There are also tools such as Cython and numba (JIT) which use various techniques to compile Python code, btw. But I am generally in favour of switching to a high performance language or writing in C++ and then importing in Python at that point, personal preference again...

Interesting to read about cling, will have to play around with that.

> You are quite right it is not just compilation that sets Python apart from C++, that was a bit of a simplification on my part.

That's how I understood your post. Sorry if my post came off as a correction - it was not meant as such, but as a question (not specifically about Python or C++):

How does a language benefit from not having the option to compile it? What restrictions does the requirement to be able to produce a machine-code executable place on it?

Because naively, I would say it should have no effect - one could package the interpreter and source-code into a single file, and think of that as a (very unoptimized) "compiled executable".

It makes debugging and R&D really easy.

For example say I have a script:

  data = slow_computation()
  metrics = produce_metrics(data)

Now, if I later want to play with some metrics and experiment a bit, I can do that in the interactive shell (running with `python -i file.py` will run the script as normal but leave a Python interpreter open at the end with all the variables kept alive).

When I am happy with my experimentation results, say, I have found something interesting about the data and written a new function cool_metric(metrics) I can commit it to the script at the end

  data = slow_computation()
  metrics = produce_metrics(data)
  cool_result = cool_metric(metrics)  # separate name so the function is not overwritten

There's also the ability to drop in breakpoints where you can then have the full Python shell available, can break out of the debugger into an interactive shell if you want, or can modify a variable and then continue the script as normal. I think you can do that with GDB for instance, but it's not quite as flexible if I am not mistaken?

so if I have

  def some_buggy_function(args):
      data = analyse(args)
      breakpoint()
      new_data = further_analyse(data)

that can be quite powerful in making debugging easy

> How does a language benefit from not having the option to compile it? What restrictions does the requirement to be able to produce a machine-code executable place on it?

“Compiled” is being used as a short-hand for “compiled with optimisations”, so yes, an unoptimised build wouldn’t count here.

The design decision is around when (between the code being written and being run) is the final decision made about what exact code will run. If that is known really early (static types, no dynamic dispatch) then optimisations can be made early too. If it is really late (polymorphic methods, support for redefining types etc) then optimisation needs to run very quickly (“just in time”) or not at all.
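A small sketch of the "really late" end of that spectrum in Python (the `Shape` class is invented for illustration): the method that actually runs is looked up at call time, so it can change while the program is running.

```python
class Shape:
    def area(self):
        return 0.0


def total_area(shapes):
    # Which `area` runs is looked up on every call, at run time.
    return sum(shape.area() for shape in shapes)


shapes = [Shape(), Shape()]
before = total_area(shapes)

# Redefine the method on the existing type while the program is running.
# An ahead-of-time optimiser could not have safely inlined `area` above.
Shape.area = lambda self: 2.5
after = total_area(shapes)

print(before, after)
```

With static types and no dynamic dispatch, the call inside `total_area` could be resolved once, before the program ever runs.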

If you want to go deeper on this, look at performance optimisation in Julia. It has the same LLVM backend as C/Rust, so can use all the same optimisations, but it is arguably a more dynamic and easier to use language than Python, so when and how optimisations apply really depends on how the code itself is written. As a bonus, it has some great tooling to see how changes to the code impact performance.

>> Python’s standard library sucks

> I have to disagree again, I think it is pretty well designed and minimal on purpose

Oh come on. That’s like saying “this buffet has no food selection but that’s good cause I’m on a diet”. Meaning that you like that it sucks (that’s fine).

Numpy is not the kind of thing you would include in the std lib anyway.

> OK sure, it is slower than compiled languages like C++, that is a concession we make when opting for the ease of readability, writability, usability, etc.

Fair, but I’ve spent enough time in Python environment/dependency hell that a lot of those gains (which are mostly cognitive) weren’t worth it for me.

> Numpy is not the kind of thing you would include in the std lib anyway.

If I could add one thing to the Python standard library, it would be to unify the std lib array and the numpy array and make many of the most common and useful numpy array methods available in the std lib.
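A sketch of the gap being described, using only the standard library `array` module. With a NumPy array, `samples * 2` would scale each element; the stdlib type treats `*` as repetition, like a list.

```python
from array import array

samples = array("d", [1.0, 2.0, 3.0])

# `*` on a stdlib array repeats it, like a list; it does not scale the values.
repeated = samples * 2

# Element-wise arithmetic has to be spelled out by hand.
scaled = array("d", (x * 2 for x in samples))

print(repeated)
print(scaled)
```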

yeah python is alright, it's not a masterpiece, but it's well rounded

I ported a golang (non concurrent) program that was 3 files and lots of intermediate procedures over custom types (synchronized maps)

the whole thing fit beautifully in a one-page Python script, very readable. bonus: concurrency (threadpool over queues). the only libs were click and rich, but they're not relevant to the design
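The script itself isn't shown, so the following is only an assumed shape of "threadpool over queues" using the standard library. The `worker` function and the squaring step are stand-ins for the real work.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def worker(tasks, results):
    # Pull items until the queue is empty, then let the thread finish.
    while True:
        try:
            item = tasks.get_nowait()
        except queue.Empty:
            return
        results.put((item, item * item))  # stand-in for the real work


tasks = queue.Queue()
results = queue.Queue()
for n in range(10):
    tasks.put(n)

with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(4):
        pool.submit(worker, tasks, results)
# Leaving the `with` block waits for all workers to finish.

# Drain the results queue into a dict of input -> output.
squares = dict(results.get() for _ in range(results.qsize()))
print(sorted(squares.items()))
```

`queue.Queue` handles the locking, which is roughly what the synchronized maps were doing by hand in the Go version as described.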

and the fact that it's repl-able you can quickly navigate in the stdlib.

ps: python docs are not amazing but they have a low signal/noise/confusion ratio. if one wants to suffer go "read" the javadoc, with its empty package descriptions, extreme redundancy of autogenerated get/set * polymorphic methods. Or to stay in the python world, django docs. They're abysmally non-technical.. it's a full-on guess fest to infer the relationships between classes and dataflow. Superb.

>Don't all of these use pip under the hood?

The point is that every extra tool you need to work on a project is another thing you need to get installed and working. It's another thing that will rot over time. In 5 years it might be totally unusable and broken.

When I clone a project, I want to be productive with it as quickly as possible.

Yeah, but the author is comparing it to JavaScript. How is the tooling less of a mess in JavaScript?

I am the author. npm/yarn are pretty much the de facto package managers for JS, so that's one less nightmare to deal with. But I never really said that the tooling is better in the JS ecosystem. The better performance is just enough to tip the scales in favor of it.

And pip is the de facto package manager for Python, and it's built in, and it just works, and venv is the de facto virtual environment manager, and it's built in, and it just works. And yes, there are other abstractions built on top of pip and venv, and some of them, like Poetry, work incredibly well.
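To illustrate the "built in" part, here is a minimal sketch that creates an environment with the standard library `venv` module. The directory name `demo-env` is arbitrary, and `with_pip=False` is an assumption made to keep the example fast and offline; `python -m venv` on the command line installs pip into the environment by default.

```python
import tempfile
import venv
from pathlib import Path

# Create a throwaway environment in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo-env"
    venv.create(env_dir, with_pip=False)
    # Every environment gets a pyvenv.cfg file at its root.
    created = (env_dir / "pyvenv.cfg").is_file()

print(created)
```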

But JavaScript is in no way different in this regard: Bit, Yarn, PNPM, Turbo, NX, Rush. If anything, JS has more tooling chaos than Python. And don't get me started on webpack and all its alternatives. And then there is the repository chaos, umpteen abandoned packages, repeated instances of malware injection (which has also happened in the cheeseshop, I know, but not nearly as often).

In the end, you choose a tool, and use it until you decide there is a better tool, and you get to work. I used to use only pip and venv. For a while I used virtualenvwrapper. Now I use Poetry with asdf, and it is a very pleasant experience.

The better JavaScript performance doesn't matter in 90% of use cases. In nearly every real-world bottleneck, the answer is not to switch ecosystems, but to optimize. How much traction has JS gotten in workloads where numpy and tensorflow rule the roost?

And you don't like Python's syntax. That's your prerogative. I love it, and so do millions of other python devs.

>> Python is huge, the python:3.9-slim Docker image is 118MB

> Hmmm. 118MB isn't really that big anymore. The docker image would presumably be cached and reused in a deployment pipeline.

Caching aside, it seems that the Python slim image is built on Debian, which will usually have slightly bigger container sizes than something like Alpine, which is comparatively more lightweight/barebones:

  > docker pull python:3.9-slim && docker image ls
  REPOSITORY   TAG        IMAGE ID       CREATED        SIZE
  python       3.9-slim   e2f464551004   8 days ago     125MB

  > docker run --rm python:3.9-slim sh -c "cat /etc/*-release"
  PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
  NAME="Debian GNU/Linux"
  
  > docker pull python:3.9-alpine3.17 && docker image ls
  REPOSITORY   TAG              IMAGE ID       CREATED        SIZE
  python       3.9-slim         e2f464551004   8 days ago     125MB
  python       3.9-alpine3.17   d6d1ed462b20   3 weeks ago    48.8MB
  
  > docker run --rm python:3.9-alpine3.17 sh -c "cat /etc/*-release"
  3.17.0
  NAME="Alpine Linux"

Now, I'm not saying that Alpine is the perfect base distro for your container images, despite it generally being a reasonable pick (I've heard some stories about Python performance in particular, and sometimes there are package related oddities), but the distro that you choose will most definitely have an effect on what you'll ship.

Of course, caching and any additional tools that you may or may not want to include also play a part. Personally, I just went for maximum predictability and usability, and now build most of my personal container images basing them on Ubuntu LTS, with some common CLI tools included for debugging: https://blog.kronis.dev/articles/using-ubuntu-as-the-base-fo...

Just use whatever works for you, but rest assured that JDK and other stacks will typically also have some overhead to them. Python might not be the worst offender here. Something like Go with compiled binaries is still very nice, granted.

> I have always had a very good experience with python documentation.

While I generally do like the python documentation, I often find that it likes to explain exactly how and why a library works in detail. This is great if you want to really learn the library and often I miss this level of detail in for example JavaScript, but if you just quickly want to find the most obvious way to do the most common thing, it can be quite annoying.

Python class syntax sucks. Other syntax is mostly nice. List comprehensions etc are great, I really wish I had those in C++. Plus, I wish I had a similarly useful standard library, repl, and package manager in C++.

You want `self`-less class methods?

Yeah. And I don’t want confusing initialisers or constructors or whatchamacallit, and weird calls to super, and underscore methods. Maybe the language should just be minimally aware of classes and how they work, and build that into the semantics of the language using keywords, rather than giving me building blocks that make it feel like I’m rebuilding class semantics from scratch. It’s like when it comes to classes, Java is cleaner and simpler than python. Which is weird, cuz python is simpler and more fun to work with on all other fronts (well, almost, concurrency and threading is also kind of easier in Java).

…I really like python, but man its warts are annoying.

When I first started using python (somewhere before the 2to3 migration) I was extremely pissed at the object layer.

I hate redundancy, and I found the dunders and explicit self parameters completely stupid. With time I just got used to them .. (thanks partly to editor templates). The private field shenanigans weren't sexy either..

Other than that I agree.. it would be worth a python4

   class Foo(Bar):

      new(*a, **kw):
         "keyword: new"
         super(*a, **kw)
         print('v4')
      
      bar(a,b,adjust=1):
         "no self, shorthand super.parent_method()"
         return super.bar(a, b) + adjust
come on guido ;)
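For comparison, here is roughly what that wish-list class looks like in today's Python 3. The parent class `Bar` is not defined in the comment, so its body here is an assumption made only so the example runs.

```python
class Bar:
    # Hypothetical parent class; the comment above does not define it.
    def __init__(self, *args, **kwargs):
        self.args = args

    def bar(self, a, b):
        return a + b


class Foo(Bar):
    def __init__(self, *a, **kw):
        # Today's spelling of the proposed `new` and bare `super(...)`.
        super().__init__(*a, **kw)
        print("v4")

    def bar(self, a, b, adjust=1):
        # Explicit `self`, and `super()` must be called to reach the parent.
        return super().bar(a, b) + adjust


result = Foo().bar(1, 2)
print(result)
```

The differences are the dunder name `__init__`, the explicit `self` parameter, and `super()` as a call rather than a keyword.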
I wish there were simply keywords that define public/private class/instance methods and variables.
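For context, a sketch of what Python offers today in place of such keywords (the `Account` class is made up for illustration): a single leading underscore is a convention only, and a double leading underscore triggers name mangling rather than real privacy.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance  # single underscore: private by convention only
        self.__pin = "0000"      # double underscore: name-mangled, not truly private

    def balance(self):
        return self._balance


acct = Account(10)
print(acct._balance)           # still reachable from outside the class
print(acct._Account__pin)      # the mangled name is reachable too
print(hasattr(acct, "__pin"))  # the unmangled name does not exist on the instance
```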