by thayne 235 days ago
Using postgres would make it significantly more complicated for Jellyfin users to install and set up Jellyfin. And then users would need to worry about migrating the databases when PostgreSQL has a major version upgrade. An embedded database like sqlite is a much better fit for something like Jellyfin.

As a Jellyfin user, this hasn’t been my experience. I needed to do a fair bit of work to make sure Jellyfin could access its database no matter which node it was scheduled onto and that no more than one instance ever accessed the database at the same time. Jellyfin required by far more work to set up maintainably than any of the other applications I run, and it is also easily the least reliable application. This isn’t all down to SQLite, but it’s all down to a similar set of assumptions (exactly one application instance interacting with state over a filesystem interface).
Jellyfin isn’t meant to be some highly available distributed system, so of course this happens when you try to operate it like one. The typical user is not someone trying to run it via K8s.
Yeah, I agree, though making software that can run in a distributed configuration is a matter of following a few basic principles, and would be far less work than what the developers have spent trying to make SQLite work for their application.

The effort required to put an application on Kubernetes is a pretty good indicator of software quality. In other words, I can get a pretty good idea of how difficult a piece of software is to maintain in a single-instance configuration by trying to port it to Kubernetes.

Most of the issues with the database are old sins from Emby. With 10.11 the Jellyfin team finally managed to clean up that mess so they can move forward with a clean implementation. Their blog post on moving to EFCore [1] and version 10.11 release post [2] have more details.

[1] https://jellyfin.org/posts/efcore-refactoring
[2] https://jellyfin.org/posts/jellyfin-release-10.11.0/

Yes, I agree. I’ve been eagerly awaiting this change for well over a year now.
Is running multiple nodes a typical way to run Jellyfin though? I would expect that most Jellyfin users only run a single instance at a time.
Yes, but you have to go out of your way when writing software to make it so the software can only run on one node at a time. Or rather, well-architected software should require minimal, isolated edits to run in a distributed configuration (for example, replacing SQLite with a distributed SQLite).
That's just not true. Distributed software is much more complicated and difficult than non-distributed software. Distributed systems have many failure modes that you don't have to worry about in non-distributed systems.

Now maybe you could have an abstraction layer over your storage layer that supports multiple data stores, including a distributed one. But that comes with tradeoffs, like being limited to the least common denominator of features of the data stores, and having to implement the abstraction layer for multiple data stores.
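
To make the tradeoff concrete, here is a minimal sketch of such an abstraction layer. Everything in it is hypothetical (the `KeyValueStore` interface, the two backends, the playback-position helpers); it is not how Jellyfin is structured. The in-memory backend only stands in for a second data store.

```python
import sqlite3
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """Hypothetical least-common-denominator interface: only get and put."""

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class SqliteStore:
    """Embedded backend: a single local SQLite database."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def put(self, key: str, value: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
            )


class DictStore:
    """Stand-in for a second backend (for example a networked store)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


def save_position(store: KeyValueStore, item_id: str, seconds: int) -> None:
    """Application code depends only on the interface, not on a backend."""
    store.put(f"position:{item_id}", str(seconds))


def load_position(store: KeyValueStore, item_id: str) -> int:
    value = store.get(f"position:{item_id}")
    return int(value) if value is not None else 0
```

The cost shows up right away: the interface offers no joins, no transactions spanning several keys, and nothing else that only one backend supports, and every backend has to be written and tested separately.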

I’m a distributed systems architect. I design, build, and operate distributed systems.

> Distributed systems have many failure modes that you don't have to worry about in non-distributed systems.

Yes, but as previously mentioned, those failure modes are handled by abiding by a few simple principles. It’s also worth noting that multiprocess or multithreaded software has many of the same failure modes, including the one discussed in this post. Architecting systems as though they are distributed largely takes care of those failure modes as well, making even single-node software like Jellyfin more robust.
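
As a rough illustration that this kind of failure does not need a second machine, the sketch below opens two connections to the same SQLite file in one process and has both try to write. It assumes the failure mode in question is write contention on a single database file; the function name and table are made up for the example.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked(db_path: str) -> bool:
    """Return True if a second connection cannot start a write transaction
    while the first connection still holds one."""
    # isolation_level=None: we issue BEGIN/ROLLBACK ourselves.
    # timeout=0: fail immediately instead of waiting for the lock.
    first = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    second = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        first.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")
        first.execute("BEGIN IMMEDIATE")  # first writer takes the write lock
        first.execute("INSERT INTO items VALUES ('a')")
        try:
            second.execute("BEGIN IMMEDIATE")  # second writer wants it too
        except sqlite3.OperationalError:
            # SQLite reports this as "database is locked"
            return True
        second.execute("ROLLBACK")
        return False
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(second_writer_is_blocked(os.path.join(tmp, "library.db")))
```

Two threads or two processes on one node hit the same wall, so the application needs a retry, queueing, or single-writer policy either way.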

> Now maybe you could have an abstraction layer over your storage layer that supports multiple data stores, including a distributed one. But that comes with tradeoffs, like being limited to the least common denominator of features of the data stores, and having to implement the abstraction layer for multiple data stores.

Generally I just target storage interfaces that can be easily distributed—things like Postgres (or maybe dqlite?) for SQL databases or an object storage API instead of a filesystem API. If you build a system like it could be distributed one day, you’ll end up with a simpler, more modular system even if you never scale to more than one node (maybe you just want to take advantage of parallelism on your single node, as was the case in this blog post).

> just target storage interfaces that can be easily distributed—things like Postgres

But as I mentioned above, that makes the system more complicated for people who don't need it to be distributed.

Setting up separate db software, configuring the connection, handling separate updates, etc. is a lot more work for most users than Jellyfin just using a local embedded sqlite database. And it would probably make the application code more complicated as well.

Jellyfin isn't a Netflix replacement, it's a desktop application that's a web app by necessity. Treat it like a desktop app and you won't have these issues.
They have clients for nearly every device; it’s clearly intended to be a streaming media server.
It's a local media library manager in the same vein as media servers that came before it that were intended to run on desktops and serve up content to consoles and whatever on your LAN back when that was the thing to do.

My point is to treat it like software from that lineage and you won't have a problem; trying to treat it like something it's not, like a distributed web app, will lead to issues.

It feels like we’re saying similar things. We both agree that its architecture makes it difficult to run with high availability, although I’ll point out that the issues documented in the article apply to single nodes and even on a single node it has pretty specific hardware requirements. I think we just disagree about whether “you have to hold it very carefully and it works just fine” is a good thing or not.
Care to share your setup?
Presently I’m running my media directory and sqlite database on NFS (one big single point of failure). My Kubernetes Deployment resource is configured to use the “Recreate” rollout strategy (at least I think that’s what it’s called, if I’m not misremembering), so there are never two concurrent instances. This means I take downtime during rollouts, but it’s fine for my use case.
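
For anyone wanting to reproduce this, a sketch of roughly what such a Deployment looks like, built as a Python dict and dumped to JSON (which `kubectl apply -f` accepts). In Kubernetes, the strategy that stops the old pod before starting the new one is named `Recreate`. The resource names, labels, mount path, and NFS server and export below are placeholders, not the poster's actual configuration.

```python
import json


def jellyfin_deployment(image: str, nfs_server: str, nfs_path: str) -> dict:
    """Build a single-replica Deployment manifest (all names are placeholders)."""
    labels = {"app": "jellyfin"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "jellyfin"},
        "spec": {
            "replicas": 1,
            # Recreate: terminate the old pod before creating the new one,
            # so two instances never touch the SQLite file at once.
            "strategy": {"type": "Recreate"},
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "jellyfin",
                            "image": image,
                            # Assumed mount point for config and database.
                            "volumeMounts": [
                                {"name": "state", "mountPath": "/config"}
                            ],
                        }
                    ],
                    "volumes": [
                        {
                            "name": "state",
                            "nfs": {"server": nfs_server, "path": nfs_path},
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = jellyfin_deployment(
        "jellyfin/jellyfin", "nfs.example.internal", "/exports/jellyfin"
    )
    print(json.dumps(manifest, indent=2))
```

The default `RollingUpdate` strategy would briefly run the old and new pods side by side, which is exactly what this setup is trying to avoid.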

One of the more difficult bits (which is not really Jellyfin’s fault) is that the application must run on nodes with access to an adequate GPU to handle any on-demand transcode tasks, which requires making the GPU available to the Kubernetes pod and also telling the scheduler which nodes have a GPU and which do not. For that I used Node Feature Discovery along with an Intel-specific plugin for the GPU (my GPU was an integrated Intel GPU).
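
The two scheduling pieces described above (pin the pod to GPU nodes, and request the GPU for the container) can be sketched as a small transformation of a pod spec. The label key and resource name are assumptions based on my understanding of the Intel device plugin conventions, and may differ from what the poster used; check what your nodes actually advertise (for example with `kubectl describe node`) before relying on them.

```python
import copy

# Assumed names; verify against the labels and resources on your own nodes.
GPU_NODE_LABEL = "intel.feature.node.kubernetes.io/gpu"
GPU_RESOURCE = "gpu.intel.com/i915"


def add_intel_gpu(pod_spec: dict, container_name: str) -> dict:
    """Return a copy of pod_spec that is scheduled onto GPU nodes and
    requests one GPU for the named container."""
    spec = copy.deepcopy(pod_spec)
    # Tell the scheduler to consider only nodes labelled as having a GPU.
    spec.setdefault("nodeSelector", {})[GPU_NODE_LABEL] = "true"
    for container in spec["containers"]:
        if container["name"] == container_name:
            # Extended resources such as GPUs are requested under limits.
            limits = container.setdefault("resources", {}).setdefault("limits", {})
            limits[GPU_RESOURCE] = 1
            return spec
    raise ValueError(f"no container named {container_name!r}")


if __name__ == "__main__":
    base = {"containers": [{"name": "jellyfin", "image": "jellyfin/jellyfin"}]}
    print(add_intel_gpu(base, "jellyfin"))
```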