Hacker News new | ask | show | jobs
by hbrundage 2205 days ago
Prisma's architecture seems novel and ... a little strange to me. It works by running a Rust engine as a subprocess and then communicating with the engine from JS land over a non-spec compliant GraphQL API. The engine holds the actual databae connection pool and does all the SQL generation and data marshalling. See https://www.prisma.io/docs/reference/tools-and-interfaces/pr... for more info on this arrangement.

It has some weird ramifications though:

- when they go to implement a new feature (like recently added JSON column support) they have to implement it on both sides which can cause bugs like this: https://github.com/prisma/prisma/issues/2432

- they're a little limited to the semantics of GraphQL based RPC, which namely excludes any stateful stuff like arbitrary START TRANSACTION; blocks that might or might not commit. See https://github.com/prisma/prisma-client-js/issues/349 for more info on that

- they don't run everywhere JavaScript runs like the browser or Cloudflare Workers (unless there's something fancy that compiles the engine to WASM I'm not aware of)

I wonder if their intention is to re-use the engine between different JS processes for caching / sharding or something like that, or to add Prisma clients in other languages. Why create the indirection?

I do like Prisma's type safety compared to the pure TypeScript alternatives like TypeORM and MikroORM -- it's really good at typing the results of specific queries and preventing you from accessing stuff that wasn't explicitly loaded. The style of the query language is the cleanest I've seen out of the three as well IMO.

Edit: I think node modules can install arbitrary binaries to some serverless JS runtimes, not sure specifically about Cloudflare but I know their dev tool bundles JS using webpack, which would exclude other binaries from node_modules.

5 comments

A few things to note:

- Prisma 1 was a completely independent server and Prisma 2 was most likely started as a rewrite of Prisma 1 so it followed the same approach

- This indirection will be removed if someone can finally land a Rust binding to NAPI (looking at you Neon binding people)

- Prisma plans to support multiple languages thus it makes sense to have an agnostic engine

- This not far from having a PG engine coded in C and interfacing with like like most libraries do anyway, javascript is just too slow for this kind of stuff

I generate Typescript types from my database ([link redacted]) which gives me type safety on back- and frontend without relying on an ORM. I am curious about Prisma but I don't see any advantage to it from my quick skimming.
I do the same using kanel. It's just enough to make the typings smooth without dictating anything else about how they are used. I prefer to write the queries directly in sql using pg-promise and then type the results of the query and the parameters of the query using the output of kanel. Any changes to the db result in generating new typings followed by running the tests to make sure nothing broke.
What do you use to write the queries? Some query builder?
I use a home made library on top of Knex that is not open source yet. I will extract it from the Submotion code and release it as well, but I am not sure when.
knex seems to be what everyone use, has a migration tool too
Thank you for writing this up hbrundage - that's a pretty good summary.

I'm the co-founder of Prisma, so should be able to answer some of your questions :-)

Prisma has a Query Engine written in Rust, and a language binding for each target language. Currently we only support JavaScript and TypeScript, but a binding for Go is already in the works. As Sytten alluded to, this split allow us to write and test all the logic once, and have a relatively thin layer that is only concerned about presenting an ergonomic API following the idioms specific to a given language. Now, as you mention, this introduces a bit of extra work on our part, and the potential for bugs when the two sides don't add up. But this problem is very minimal in practice, and in fact most features can be implemented with just a change on the Rust side, as the language bindings are generated based on a API description emitted by the Rust binary.

Another reason for the split is performance. It's reasonable to ask how performant a library that simply marshals some data from a database really has to be. But it is important to realise that Prisma Client is quite a bit more ambitious than that. Where other libraries usually tries to generate a single complex query, Prisma will often issue multiple smaller queries and partly join the data in memory. The throughput difference between V8 and Rust is significant here.

You are right that our architecture precludes us from doing things like explicitly starting a transaction and keeping it open for a longer duration of time. Our long-term goal is to create an Application Data Platform for medium-sized software development teams that can't afford to invest in internal infrastructure to the same degree as big tech companies. If you are curious what this might look like, you can take a look at TAO at Facebook or Strato at Twitter. For long running transactions specifically, we believe that they are often misused by developers who think they get a certain guarantee that they don't actually get from wrapping their workload in a transaction. There are often better approaches - both more correct, and easier to reason about, and that's what we want to teach people.

Currently we are building 30+ different binaries for each release in order to support most sensible platforms. This is a pain for us, but I hope most of our users will see that this is something they rarely have to worry about if at all. We believe that WASM + WASI will enable us to eventually remove the need for the binary for Applications running on Node, but the ecosystem is not quite there yet.

Ultimately, I think the biggest step forward represented by Prisma 2 is the type safety and result typing. We have been pushing the TS compiler to its limits, and I believe the developer experience speaks for itself. We have a lot of work to do in order to build out the feature set, but I hope many developers will appreciate the improved ergonomics, and trust that we will work diligently over the coming months to add the features that they need.

Thank you for looking into Prisma!

Ah, that explains a lot, thanks for the breakdown. And yeah, Prisma 2's type safety is stellar and in a league of it's own.

With respect to building out a big-boy operational datastore -- I think that's really cool. It'd be nice for me to be able to use something like TAO or EVCache or what have you without having to build it all myself, that's for sure. I understand why Prisma's API is constrained compared to a regular relational database in order to support those needs. That said, I think that the very best (and certainly most sell-able) Application Data Platform doesn't require adopters to drop key abilities or semantics they are used to in order to switch away from a normal database. I think those semantics only need to be dropped at the kind of scale which very few Prisma users are ever going to reach, yet they pay the productivity penalty for those missing semantics from the very first moment they begin using the tool.

Yes, you can do a lot of the same things you might want to do with transactions with nested or batch operations, but, not everything. For example, Rails' transactional testing feature is battle tested and seemingly well loved by the community, and currently impossible with Prisma. Instead, you must use a slower and more error-prone database cleaner tool. Another example would be a bank style database with double entry accounting. You want to decrement one account by a certain amount and increment another account by a certain amount transactionally, but only if the from account has a total greater than the certain amount. `SELECT FOR UPDATE` to the rescue in Postgres, but negative account balances with Prisma.

Teaching developers to not hold transactions open for a long time, or to use smart, efficiently implemented nested inserts is a good thing without a doubt, but you could still do that education while preserving transaction semantics. Devs have been used to having those since the 70s. The two aren't in conflict if you ask me. It would make your life harder, that's for sure, but it would make my life as a potential user easier, and remove one argument for not switching over.

AFAIK Cloudflare workers run JS/WASM in a V8 isolate, don't have Node.js APIs, and block eval()/new Function.
I don't see how useful it would be to query your database from the browser or cloudflare workers, in both cases you want the database access closer to your database to reduce the RTT. And you definitely don't want to give the browser direct access to your database - even if that were possible.

The implementation choice seems odd to me, nonetheless.

You're totally right about connecting to your database from the edge. That said, I think JS is quickly becoming the target for a lot of hosted runtimes because it is so easy to sandbox and has the option to drop down to WASM for high performance and indirectly supporting other languages. Cloudflare Workers (and Fastly, and Superfly, and and and) are all following that path at the edge, but I think as the consensus builds around JS + WASM as a server runtime, we might see the same style of environment for more traditional workloads that might wanna connect to databases.
> You're totally right about connecting to your database from the edge.

You might want to from your container on the edge, and separating off the work to another process makes having single-process lightweight containers more difficult (do you have multi-process container? do you sidecar the workers? etc).

So yes, I too found the architecture a bit odd. I have also seen it in https://mediasoup.org where it makes more sense to use more native workers, but carries same multi-process challenges.

That's true, and somewhat ironic as I find myself building just such an environment because NodeJS is nearly impossible to sandbox securely, but V8 is built for that use case. Prisma (or anything else with a native dependency) would not run in that environment.
The RTT is the same for the client either way isn't it? Is the concern that you're wasting DB resources instead of intermediate resources, ie the real concern is extended db connection time?
Yes, with transactions the big concern is transaction time which is often dominated by the RTT.

With performance in general you want your chatty transacting code as close to the database as possible and to merely invoke it from afar. Then you have many very short RTT from the transaction, plus one long RTT before and after. Stored procedures or functions are actually the optimal here.