Hacker News
by doanbactam 154 days ago
We switched our main API from Postgres to Turso last month and haven't looked back; the cold start times are basically zero now. The automatic schema migrations are a nice touch, but I wish the documentation on vector embeddings were a bit more robust. It's wild how much of the modern web is moving back to file-based databases. Are there any plans to support vector columns soon, or is that strictly off-roadmap for now?
3 comments

How do you scale your front end horizontally with a file based database? Do you put the files on a shared file system that the app layer all mounts and lock when writing? Or do you shard and route by user? Or do you build a big vertically scaled API server with the database in it?
There are a few options... they could be using separate database instances per client to isolate workloads... if there are fewer than 10k or so users at each client, it's pretty doable without a lot of effort. You can further isolate types of data into separate databases as well... other operations, such as maintaining the list of dbs for each client, can rely on heavy caching.

You CAN use a single database instance or file for everything, but you can also use multiple to scale without committing to a heavy vertically or horizontally scalable database system, especially if those db instances are distributed over the network to different physical servers on the back end (as with Turso).
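To make the per-client model concrete, here is a minimal sketch. It assumes plain SQLite files opened through Python's standard sqlite3 module, not Turso's actual client, and the class name, table, and file naming are made up for illustration. It shows one process keeping a cached connection per client database.

```python
import sqlite3
from pathlib import Path


class TenantDatabases:
    """Hypothetical helper: one SQLite file per client, all opened from one process."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self._connections = {}  # cache: client id -> open connection

    def connection_for(self, client_id):
        # Reject ids that could escape the data directory.
        if not client_id.isalnum():
            raise ValueError(f"invalid client id: {client_id!r}")
        conn = self._connections.get(client_id)
        if conn is None:
            # First request for this client: open (or create) its file.
            conn = sqlite3.connect(self.root / f"{client_id}.db")
            conn.execute(
                "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
            )
            self._connections[client_id] = conn
        return conn

    def close(self):
        for conn in self._connections.values():
            conn.close()
        self._connections.clear()


def add_note(dbs, client_id, body):
    conn = dbs.connection_for(client_id)
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))


def list_notes(dbs, client_id):
    rows = dbs.connection_for(client_id).execute("SELECT body FROM notes ORDER BY id")
    return [body for (body,) in rows]
```

Each client's rows live in a separate file, so a query for one client never touches another client's data, and nothing here needs a separate service instance per database.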

I've worked on a lot of systems where I advocated the use of separate DBs on either a per-client or per-project basis... Even from a single management server, you can do a lot... from a small cluster operating against SQLite databases you can do more. With Turso's efforts to improve concurrency, it can go further still.

I'm not an employee, related to, or even an active customer right now... but I do understand the model and how it can work in a lot of use cases.

If each client has its own database, does that mean you're automating the infra to turn on a new instance for each database, and then routing requests appropriately? Not too hard with k8s, I suppose. Or if you add new clients infrequently enough, it's a manual task. But you wouldn't do this for each user - it assumes you have some higher level of organization (client/tenant/org/etc).

Sounds like these are different ways of manually sharding the database?

Why would you need a different service instance to connect to a different database file?

And yeah, it's effectively sharding data... My point is there are lots of use cases that can fit in this type of database usage.
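For the "shard and route by user" variant raised earlier in the thread, a rough sketch of the routing step might look like the following. The function names and the shard-N.db naming scheme are assumptions for illustration, not anything Turso prescribes. It uses hashlib rather than Python's built-in hash(), because string hashes from hash() are randomized per process and would route the same user differently after a restart.

```python
import hashlib


def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user id to a shard index with a hash that is stable across processes."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


def database_path(user_id: str, shard_count: int = 4) -> str:
    # Hypothetical naming scheme: shard-0.db, shard-1.db, ...
    return f"shard-{shard_for(user_id, shard_count)}.db"
```

One caveat with plain modulo routing: changing shard_count remaps most users to a different file, so the shard count has to be chosen up front or data has to be migrated when it changes.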

You only run a single instance of your API? How many monthly customers do you have?
You had a cold start issue with Postgres? Were you running a serverless Postgres?