Hacker News
by truth_ 1850 days ago
Not a DB expert.

Why write a DB program in Python? Wouldn't it be slow?

5 comments

You get the legendary data integrity of MongoDB with the refinement of brand-new software running at the speed of Python. What's not to like?
People write Raft papers with pseudocode or some hard-to-comprehend systems language. I've always wondered why there isn't a Jepsen-tested Python implementation with lots of GitHub stars.

Here is mine:

https://github.com/adsharma/raft

Waiting for a Python correctness prover and a transpiler.

Yep, it's slow. :)

It's not meant to be used in production that has scaled up. This project was mostly for fun.

I would use it in production. It just depends on what I'm producing. "Production" can range from a banking website serving millions (nope) to a kid's toy (yup).

There are a ton of uses where performance doesn't matter, and some where data is even ephemeral or non-critical. These sorts of simple tools are also really nice for test cases, development environments, and a ton of other uses.

I'm developing a tool designed for large-scale data processing, and I have a dummy back-end very similar to this which I use for development.

My computer runs at close to 4 GHz and has multiple cores. Essentially anything which ran on an 80486 back in the day will be fast enough in interpreted Python. That's actually a lot of stuff.

And yes, I'm not disagreeing, but agreeing and expanding on your easily-missed disclaimer ("production *that has scaled up*").

That’s exactly what I meant. :) And thanks for sharing!
I did the same to learn Redis. I wrote a simple graph database using Gremlin's syntax https://github.com/emehrkay/rgp

It was very slow, but interesting.

Interested to know if it can run on PyPy and what kind of difference that would make.
Thanks. That makes sense.
If I had the chance to write software whose description used the words Monty and Python, I would do it even if the result were as slow as a dead parrot.
Depends on what you're doing, I guess. If you're just working with Python dictionaries with occasional background flushes to disk then it would be very fast. Probably close to as fast as anything else. Of course, there's a lot more to a DBMS than just reading/writing in-memory data structures and occasionally saving them on your hard drive.
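For illustration, here is a minimal sketch of the "dictionary plus occasional background flush" idea described above. The `DictStore` class, its method names, and the JSON file format are all hypothetical choices for this example, not taken from the project under discussion. It assumes keys and values are JSON-serializable.

```python
import json
import os
import tempfile
import threading


class DictStore:
    """Hypothetical in-memory dict with periodic background flushes to a JSON file."""

    def __init__(self, path, flush_interval=5.0):
        self.path = path
        self.flush_interval = flush_interval
        self._data = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._flush_loop, daemon=True)
        self._thread.start()

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def flush(self):
        # Take a shallow snapshot under the lock, then write outside it
        # so readers and writers are not blocked on disk I/O.
        with self._lock:
            snapshot = dict(self._data)
        # Write to a temp file in the same directory and rename it into place,
        # so a crash mid-write leaves the previous file intact.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(snapshot, f)
        os.replace(tmp, self.path)

    def _flush_loop(self):
        # Event.wait returns False on timeout and True once close() sets the event.
        while not self._stop.wait(self.flush_interval):
            self.flush()

    def close(self):
        # Stop the background thread, then do one final flush.
        self._stop.set()
        self._thread.join()
        self.flush()
```

Reads and writes here are plain dict operations behind a lock, which is why this path is fast. Anything written since the last flush is lost on a crash, which is part of the "lot more to a DBMS" the comment alludes to.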
I wrote a Python-to-Rust transpiler (py2many), also as a fun project. I won't be surprised if writing a db in python actually becomes viable some years down the road due to the awesome tooling and the idea -> code uninterrupted flow that's possible.
You may want to write that, but why would someone want to use a database that could be 50x faster if it were written in a native language? If you write software for yourself in a slow scripting language, that is one thing, but it leaves a huge amount of performance on the table.

This is the same as the problem with Electron. People who only know JavaScript might think it is great to embed a full web browser, but it is selfish to push something on users who expect it to run at a normal speed, only to have it use 100x the memory and lag during simple operations.

The transpiler could in theory generate code in the native language. You can see for yourself:

https://news.ycombinator.com/item?id=27032399

Please file bugs/issues if something isn't working.

Look up Nubank and Datomic. You'll have an aneurysm.

(Edit: to be clear, I'm agreeing with you)

> I won't be surprised if writing a db in python actually becomes viable some years down the road due to the awesome tooling and the idea

Long-term, complicated, high-performance projects worked on by many developers are the Achilles heel of Python. The lack of type safety really bites over a large code base. There are also issues with automatic refactoring tools due to the very dynamic nature of Python. Deployment and dependency management are also a big issue in Python. Not to mention performance and multithreading.
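A minimal sketch of the failure mode the type-safety point describes, using hypothetical `total_bytes` and `report` helpers invented for this example: nothing is checked when the code is defined or imported, so a wrong value type surfaces only when the faulty call actually runs.

```python
def total_bytes(sizes):
    """Sum a list of sizes in bytes (hypothetical helper)."""
    return sum(sizes)


def report(record):
    # No check happens here at definition time; a bad record
    # is only discovered when this line executes.
    return total_bytes(record["sizes"])


error = None
try:
    # One size arrived as a string, e.g. from unvalidated input.
    report({"sizes": [10, "20", 30]})
except TypeError as exc:
    error = exc
```

In a large code base the bad value and the crash can be many calls apart. Type hints checked by a static checker are the usual mitigation, at the cost of extra tooling.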

Yeah, I had an API server to write. I looked at FastAPI and checked out the example project. So much tooling for formatting, type hinting, linting, deployment, etc. And while the project claims to be "comparable in speed to Go", the benchmarks they linked to showed that meant "significantly slower than". In the end I just went with Go instead. Python has its place, but you can avoid a lot of work by using something else sometimes.
Benchmarks like that are completely artificial anyway, because the real speed difference comes when the code becomes more complex and the dynamic language can no longer be reduced to something like those simple benchmarks, because it's not provable.

And God forbid someone mention the L1 cache and how "benchmarks" are completely different to the cache interactions in real-world dynamic programs.

Python has a PR problem:

* That it's a dynamically typed language
* That it's not a serious language like C or C++, suitable only for writing a 50-line throwaway script

I'd like to convince people that both statements are false. But it's probably better to use the GitHub issue tracker than HN comments.