by jzelinskie 1034 days ago
Full disclosure: I'm a maintainer of SpiceDB, the most mature open source project inspired by Zanzibar.

For this exact use case, SpiceDB created two APIs not available in Zanzibar: LookupSubjects and LookupResources. For other scenarios, there's also a BulkCheck API for performing many checks with less request overhead. The sibling comment here is correct that there isn't filtering/sorting available in SpiceDB yet.
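
To make the shape of those three calls concrete, here is a toy sketch in Python. It is an in-memory stand-in, not the SpiceDB client API: the class `ToyRelationships` and its method names are hypothetical, and it only handles direct relationships, whereas a real Zanzibar-style system resolves permissions through a schema.

```python
# Hypothetical in-memory stand-in for a relationship store.
# This is NOT the SpiceDB client API; it only illustrates the query shapes.
class ToyRelationships:
    def __init__(self):
        # Maps (resource, relation) -> set of subjects holding that relation.
        self._subjects = {}

    def write(self, resource, relation, subject):
        self._subjects.setdefault((resource, relation), set()).add(subject)

    def check(self, resource, relation, subject):
        """One yes/no question: does subject have relation on resource?"""
        return subject in self._subjects.get((resource, relation), set())

    def bulk_check(self, items):
        """Many yes/no questions answered in one call, results in input order."""
        return [self.check(resource, relation, subject)
                for resource, relation, subject in items]

    def lookup_resources(self, relation, subject):
        """Reverse question: which resources does subject have relation on?"""
        return sorted(resource
                      for (resource, rel), subjects in self._subjects.items()
                      if rel == relation and subject in subjects)

    def lookup_subjects(self, resource, relation):
        """Reverse question: which subjects have relation on resource?"""
        return sorted(self._subjects.get((resource, relation), set()))
```

The point of the lookup-style calls is that the caller does not need to already know the candidate list, which is what makes "list everything this user can see" feasible without checking every row one by one.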

Additionally, there are folks using SpiceDB today by replicating denormalized checks back into their database (e.g. Postgres) or search index (e.g. Elastic) so that you can filter them natively. This is the combination of the aforementioned Lookup APIs with our Watch API. While this strategy requires moving parts, it is necessary beyond a particular scale which is well beyond the point at which policy engines typically fall over.
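
A minimal sketch of that replication pattern, under loud assumptions: SQLite stands in for Postgres, revisions are plain increasing integers (real revision tokens are opaque), and the batch format `(revision, [(op, doc_id, user_id), ...])` is made up for illustration rather than taken from the Watch API. The table and function names are hypothetical too.

```python
import sqlite3


def open_index():
    """Create the application database with a denormalized permission table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, title TEXT)")
    db.execute("CREATE TABLE can_view (doc_id TEXT, user_id TEXT, "
               "PRIMARY KEY (doc_id, user_id))")
    # Single-row table remembering the last revision that was applied.
    db.execute("CREATE TABLE sync_state (id INTEGER PRIMARY KEY CHECK (id = 1), "
               "revision INTEGER NOT NULL)")
    db.execute("INSERT INTO sync_state VALUES (1, 0)")
    db.commit()
    return db


def apply_batch(db, revision, updates):
    """Apply one batch of permission changes; skip batches already applied."""
    with db:  # commits on success, rolls back if an exception escapes
        (seen,) = db.execute(
            "SELECT revision FROM sync_state WHERE id = 1").fetchone()
        if revision <= seen:
            return False  # replayed batch, safe to ignore
        for op, doc_id, user_id in updates:
            if op == "touch":
                db.execute("INSERT OR IGNORE INTO can_view VALUES (?, ?)",
                           (doc_id, user_id))
            elif op == "delete":
                db.execute("DELETE FROM can_view WHERE doc_id = ? AND user_id = ?",
                           (doc_id, user_id))
            else:
                raise ValueError(f"unknown op: {op}")
        db.execute("UPDATE sync_state SET revision = ? WHERE id = 1", (revision,))
    return True


def visible_docs(db, user_id):
    """Filter and sort natively in SQL, using the replicated permissions."""
    rows = db.execute(
        "SELECT d.id, d.title FROM docs d "
        "JOIN can_view v ON v.doc_id = d.id "
        "WHERE v.user_id = ? ORDER BY d.title",
        (user_id,))
    return rows.fetchall()
```

In this sketch the initial contents of `can_view` would come from a lookup-style backfill, and the batches from a change stream after that. Storing the revision in the same transaction as the rows is what lets a restarted consumer replay batches without double-applying them.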

While I'm biased, I do find this article somewhat misleading when describing Zanzibar-inspired systems; it presents opinion without any evidence or examples to justify the claim and concludes it as fact, but that might be because they're leaning on their previous article. Zanzibar is novel because it is fundamentally designed to be run at the edge and solves the difficult problem of keeping the view of data at the edge consistent. This article conveniently leaves out how other systems get data to the edge while still keeping it consistent for their authorization logic. Latency is also brought up, but we recently managed to scale SpiceDB to >1M requests per second with 100B relationships while maintaining a 5ms p95 measured at the client application[0]. The claim that you absolutely need a service to run a Zanzibar system is a provably false claim based on the number of clusters in the wild running SpiceDB or Ory's Keto project.

[0]: https://authzed.com/blog/google-scale-authorization


> Additionally, there are folks using SpiceDB today by replicating denormalized checks back into their database (e.g. Postgres) or search index (e.g. Elastic) so that you can filter them natively. This is the combination of the aforementioned Lookup APIs with our Watch API. While this strategy requires moving parts, it is necessary beyond a particular scale which is well beyond the point at which policy engines typically fall over.

Would you say that because of this, Zanzibar engines like spicedb only become useful on systems of a certain size / complexity? Fundamentally you run into data synchronization issues whether you are syncing denormalized data back to your db via Watch or whether you write the relationships to both data stores in the first place. This article[0] on the latter topic touches on this, but brushes over some trickier parts of implementing such a thing correctly (e.g. the "2 writes" section only covers insert, not update or delete, and insert is generally the less harmful case for a ghost write that persists in spicedb; the streaming updates section also brushes over some major footguns).

Granted there's nothing unique to spicedb in this sort of complexity, but by nature of being a db, using spicedb mandates that users must take on the complexity.

Is it then fair to say that it is appropriate to use spicedb once a project reaches a certain size / complexity, or would you expect a startup to adopt it from the beginning?

[0] https://authzed.com/blog/writing-relationships-to-spicedb

>Fundamentally you run into data synchronization issues whether you are syncing denormalized data back to your db via Watch

The Watch and Lookup APIs emit revisions so that any replicated data can include revisions to guarantee consistency. The linked article covers replicating data into SpiceDB and not the other way around; this is generally done for brown-field projects and does come with consistency trade-offs.
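
As a rough illustration of how a stored revision can guard reads against a lagging replica, here is a small Python sketch. Everything in it is hypothetical: revisions are modeled as comparable integers (real revision tokens are opaque), and `Replica`, `StaleReplica`, and `can_view_at_least` are names invented for the example.

```python
from dataclasses import dataclass, field


class StaleReplica(Exception):
    """Raised when the replica has not caught up to the required revision."""


@dataclass
class Replica:
    # Highest revision whose changes have been applied to this replica.
    revision: int = 0
    # doc_id -> set of user ids allowed to view it.
    viewers: dict = field(default_factory=dict)

    def apply(self, revision, doc_id, user_ids):
        """Record the viewer set for a document as of the given revision."""
        self.viewers[doc_id] = set(user_ids)
        self.revision = max(self.revision, revision)


def can_view_at_least(replica, doc_id, user_id, min_revision):
    """Answer from the replica only if it is at least as fresh as min_revision."""
    if replica.revision < min_revision:
        # The caller can retry later or fall back to the source of truth.
        raise StaleReplica(
            f"replica at revision {replica.revision}, need {min_revision}")
    return user_id in replica.viewers.get(doc_id, set())
```

The idea is that whoever made a permission change holds on to the revision it produced, and any later read that must reflect that change passes it as `min_revision`, so a stale answer becomes an explicit error instead of a silent one.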

It's true that this complexity isn't unique to SpiceDB. The important part is that SpiceDB makes this _possible_ because if you architect a solution where it isn't, you'll find one day you've backed yourself into a corner.

>Is it then fair to say that it is appropriate to use spicedb once a project reaches a certain size / complexity, or would you expect a startup to adopt it from the beginning?

I touch on this subject briefly in this post[0]. Unfortunately, there's no dead simple answer. We do have customers that are startups in various stages, but they all deeply considered the implications of focusing on authorization before they jumped in. IME, startups really need to find product market fit first. Build your MVP using whatever it takes and only move on to thinking about authorization when it becomes critical. When is it critical, but not too late? I think that's once you start noticing that each PR implementing a feature request is also touching authorization code/SQL. There are also other big signals: microservices architecture or enterprise customers are almost certain indicators that your authorization logic isn't going to remain a small library in your monolith.

[0]: https://authzed.com/blog/authz-must-scale

Jimmy I truly think you're awesome (And so is SpiceDB), but the irony here stands out: "it presents opinion without any evidence or examples to justify the claim and concludes it as fact"

You mean stuff like: 1) "SpiceDB, the most mature open source project inspired by Zanzibar" (though I'd vouch for that one) 2) "it is necessary beyond a particular scale which is well beyond the point at which policy engines typically fall over." 3) "Zanzibar is novel because it is fundamentally designed to be run at the edge" 4) "we recently managed to scale SpiceDB to >1M requests per second with 100B relationships while maintaining a 5ms p95 measured at the client application" - to be fair, you should bundle that statement with the caveat that you need to set it up within your own VPC. 5) "The claim that you absolutely need a service to run a Zanzibar system is a provably false claim based on the number of clusters in the wild running SpiceDB or Ory's Keto project" - how many clusters? :)

Re: "This article conveniently leaves out how other systems get data to the edge while still keeping it consistent for their authorization logic" The article actually does mention OPAL [0]

[0]: https://www.permit.io/blog/introduction-to-opal

Your critique of my comment is quite fair; we're both guilty of making claims, but not including all the supporting evidence for brevity's sake. I think we can both agree that everyone working in this space is doing awesome work and bringing authorization the attention that it's sorely needed.

Agree 100%. <3 And as I told Joey many times - I'd love to collaborate more with you as well.