Hacker News
by danpalmer 2142 days ago
90% of the time I don't want it. Database, cache, etc, not really that bothered. Web requests take 100-300ms to complete, tying up a worker for 300ms isn't much of a problem.

10% of the time I'm calling an API that takes 3s and tying up a worker for 3s _might_ be a problem. Being able to not do that would be really handy sometimes.

Not web servers, but I also do a lot of web scraping, and Python is definitely the best tool I've used for that job (lxml is fast with great XPath support, and Python is very expressive for data manipulation). Using async for that could dramatically improve our throughput, as it's essentially network bound and we don't really care about latency that's in the tens of seconds.
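To make the throughput point concrete, here is a minimal sketch of network-bound work overlapping under asyncio. It is not our scraper: `fake_fetch`, `scrape_all`, `run_demo`, the example.invalid URLs, and the 50ms delay are all made up for illustration, and `asyncio.sleep` stands in for a real HTTP request.

```python
import asyncio
import time


async def fake_fetch(url: str, delay: float = 0.05) -> str:
    # Hypothetical stand-in for an HTTP request: the sleep simulates
    # time spent waiting on the network, during which other tasks run.
    await asyncio.sleep(delay)
    return f"<html>{url}</html>"


async def scrape_all(urls, max_concurrency: int = 10):
    # The semaphore caps how many fetches are in flight at once.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with sem:
            return await fake_fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


def run_demo(n: int = 20):
    urls = [f"https://example.invalid/page/{i}" for i in range(n)]
    start = time.perf_counter()
    pages = asyncio.run(scrape_all(urls))
    elapsed = time.perf_counter() - start
    return pages, elapsed
```

Run one at a time, 20 fetches at 50ms each would take about a second. With up to 10 in flight they finish in roughly two batches, and each individual fetch is no faster than before, which is fine when throughput matters more than latency.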

Source: I work on a large production Django site.


Web requests taking over 100 ms is an absolute shame that slow languages like Python are enabling.
I'd love the site to be faster, but it's very hard to do this. For an API called while serving a user request, 100ms is slow, but for the frontend that a user hits directly, it's fairly typical.

As a point of comparison, Amazon's time to first byte for me is 270ms, with a 15ms round-trip time to their servers, so they're looking at about 255ms to serve a page.

To get significantly faster than this, a site must be engineered for speed from the ground up, and the productivity hit would be huge. We've got ~250k lines of Python, which would probably translate to ~750k lines of Go (which is fast, but not that fast), or probably >1m lines of C++. Engineers don't tend to produce that much more or less in terms of line count, so this would likely take ~4-6x the time to create (very rough ballpark). Plus, with a codebase so much larger there's a greater need for tooling support, maintenance becomes harder, more engineers are needed, etc.

When speed is the winning factor, like it sometimes is for a SaaS API that does something important in a hot code path (e.g. Algolia), then this is all worth it. When you're a consumer product where reacting to consumer demand is the most important thing, the speed difference really isn't worth it.

Most of the times I've profiled a web application I found that slow requests were coming from slow database queries.
So, two examples off the top of my head where the latency comes from what the request has to do, and Python isn't at fault:

1. In an incident database, we allow full text search with filtering. Depending on the complexity of the query and the contents of the database, this can take 10ms or 10,000ms. This isn’t something easily changed. It’s Lucene's fault.

2. Querying the physical status of a remote site has variable latency because the sensors are on Wi-Fi and it’s flaky. We can’t easily move the sensors, or make Wi-Fi coverage in some warehouse perfect.

Right now, we circuit break and route potentially slow requests to their own cluster via the router, but it’s a poor solution.
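For anyone unfamiliar with the circuit-breaking half of that, here is a generic sketch of the pattern. It is not the setup described above, which lives in the router: `CircuitBreaker`, `call_with_breaker`, and the thresholds are hypothetical names and values chosen for illustration.

```python
import time


class CircuitBreaker:
    """Stops calling a dependency after repeated failures (hypothetical sketch)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Once the cool-down has elapsed, let a trial request through.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            # Open (or re-open) the circuit and restart the cool-down.
            self.opened_at = self.clock()


def call_with_breaker(breaker, func, fallback):
    # Skip the slow dependency entirely while the circuit is open.
    if not breaker.allow():
        return fallback()
    try:
        result = func()
    except Exception:
        breaker.record_failure()
        return fallback()
    breaker.record_success()
    return result
```

Here `fallback` could return a cached status or hand the request off to a separate pool. Either way the breaker only decides whether to attempt the call; it does nothing to make the underlying query or sensor read faster.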

That depends on what that request is doing. It could be fetching a single record from the database and serializing it, or it could be running a complex analysis.