by hmnom 2498 days ago
It's written in Python, so as soon as they have enough users it will be as slow as any webpage with tons of JS. Thankfully, that CPU load won't be on the client side... so it's still an improvement.
5 comments

I recommend you relax your convictions about Python performance a bit.

Server-side bottlenecks are more often than not database reads/writes and other IO. And the few CPU-intensive operations can be delegated to libraries written in C.

Pure Python is slow for CPU-intensive tasks, but that doesn't mean that a Python webserver is necessarily slow.
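
To make the "delegate to C" point concrete, here is a minimal sketch. It is my own illustration, not anything taken from Sourcehut, Flask, or Jinja2, and the function names are hypothetical. In CPython, `sum` and `range` are implemented in C, so the second function does the same work without running a bytecode loop per element.

```python
import timeit


def loop_sum(n):
    # Pure-Python loop: each iteration is executed by the bytecode interpreter.
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    # Same result, but the iteration and addition happen inside C-implemented
    # built-ins (in CPython).
    return sum(range(n))


if __name__ == "__main__":
    n = 1_000_000
    # Timings depend on the machine and interpreter, so none are quoted here.
    print("loop:   ", timeit.timeit(lambda: loop_sum(n), number=10))
    print("builtin:", timeit.timeit(lambda: builtin_sum(n), number=10))
```

Run it locally to see the gap on your own interpreter; the point is only that the hot loop can often live outside Python code.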

So what you're saying is that the author of Sourcehut is expected to rewrite Flask, Jinja2, etc. in C?

As an example, https://git.sr.ht/~sircmpwn/git.sr.ht/tree/master/gitsrht takes 600-800 ms to generate, and it's probably heavily cached already. What will happen when they get more users? The site will be unbearably slow, unless the guy starts spending thousands on servers.

I imagine that the CPU-intensive parts of Flask and Jinja2 are already written in C. Much of Python's standard library is.

800ms is a reasonable response time, and if they scale up according to their userbase, they will hopefully maintain that time.

(Also, we don't know how much of that 800ms is Python vs. IO)
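
One rough way to split a request into "Python is computing" and "Python is waiting" is to compare wall-clock time with process CPU time. This is a minimal sketch under stated assumptions: `fake_handler` is a hypothetical stand-in for a real view function, with `time.sleep` playing the role of a database or git lookup.

```python
import time


def measure(handler):
    """Return (wall_seconds, cpu_seconds) for one call to handler()."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process; excludes sleep
    handler()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def fake_handler():
    # Hypothetical request handler, for illustration only.
    time.sleep(0.2)                       # pretend to wait on IO
    sum(i * i for i in range(200_000))    # pretend to render a template


if __name__ == "__main__":
    wall, cpu = measure(fake_handler)
    print(f"wall={wall * 1000:.0f} ms  cpu={cpu * 1000:.0f} ms  "
          f"waiting={(wall - cpu) * 1000:.0f} ms")
```

Note that `process_time` only counts CPU used by this process, so work done inside a separate database server shows up as waiting, which is the distinction being argued about here.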

https://github.com/pallets/flask

https://github.com/pallets/jinja

0% C

Also, 800 ms is NOT a reasonable response time for generating what is basically a bunch of text. That is absurd, but I guess this is the baseline in 2019.

I trust all IO is cached. The author can confirm it. This is just how slow Python is.

Almost none of the IO for that page is cached, actually[0]. The only thing I'm sure is cached is the templates themselves. I'm pretty sure that neither the git lookups nor the DB accesses are cached, and that is where you could save time. And mind you, this is served from a single data center in the USA, and latency alone can eat up a lot of that. From Europe I get a 300ms ping to it, so it might be that you are simply far away from the physical location.

[0] https://git.sr.ht/~sircmpwn/git.sr.ht/tree/master/gitsrht/bl...
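
As a back-of-the-envelope check on the latency point, here is a small sketch. The assumptions are mine: a fresh HTTPS connection with no keep-alive or session resumption, one round trip for the TCP handshake, one for a full TLS 1.3 handshake (two for TLS 1.2), and one for the HTTP request and response. DNS lookups, TCP slow start, and server time are ignored, and the function name is hypothetical.

```python
def fresh_connection_ms(rtt_ms, tls_round_trips=1, server_ms=0):
    """Estimate time to first response on a brand-new HTTPS connection."""
    tcp_round_trips = 1    # SYN / SYN-ACK before any data can be sent
    http_round_trips = 1   # request out, response back
    round_trips = tcp_round_trips + tls_round_trips + http_round_trips
    return rtt_ms * round_trips + server_ms


if __name__ == "__main__":
    # 300 ms ping, TLS 1.3, zero server time: 3 round trips
    print(fresh_connection_ms(300))                      # 900
    # Same ping with a TLS 1.2 full handshake: 4 round trips
    print(fresh_connection_ms(300, tls_round_trips=2))   # 1200
```

So with a 300ms ping, a cold request can take most of a second before the server has done any work at all.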

Switching to PyPy is likely to improve the overall performance if it's not database or I/O bound.
This. There are also several optimized versions of Python, other than stock, that let you increase performance based on your needs. I've shipped with many of them over the years.
I used to run an image board written in Python with around 15m PV/month on a Pentium 4 machine in 2004 or so. The first bottleneck I found was the database. Some query optimization and caching fixed it very quickly. It takes a really long time before Python becomes the bottleneck, and that can also be fixed relatively easily (image processing in my case, which I fixed by moving the whole thing to a worker).
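
For the caching part, a minimal sketch of the idea using the standard library's `functools.lru_cache`. Everything here is hypothetical and for illustration only (`slow_query`, `threads_for_board`, the call counter); it is not the image board's actual code, and a real site would also need a cache invalidation strategy.

```python
from functools import lru_cache

# Counts how often the "database" is actually hit (illustration only).
CALLS = {"db": 0}


def slow_query(board_id):
    # Hypothetical stand-in for an expensive database query.
    CALLS["db"] += 1
    return tuple(f"thread-{board_id}-{n}" for n in range(3))


@lru_cache(maxsize=1024)
def threads_for_board(board_id):
    # Repeated calls with the same board_id are served from memory.
    return slow_query(board_id)


if __name__ == "__main__":
    threads_for_board(1)
    threads_for_board(1)   # cache hit, no second query
    print(CALLS["db"], threads_for_board.cache_info())
    threads_for_board.cache_clear()   # e.g. after a new post invalidates the list
```

The same pattern applies with an external cache such as memcached; `lru_cache` just keeps the example self-contained.
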
Are you sure? Looks like a static pre-generated page to me:

  $ curl -s https://sourcehut.org/|egrep 'meta.*gen'
  <meta name="generator" content="Hugo 0.57.2" />
[0] https://gohugo.io/
The website/blog (https://sourcehut.org) is static; the Sourcehut app itself (https://sr.ht) is Python-based.

https://git.sr.ht/~sircmpwn/?search=sr.ht

Python isn't slow. Running a lot of Python is slow.
If it shards well and you have the capital to buy hardware, there's nothing wrong with using Python.