
	by one_buggy_boi 878 days ago
	Is modern Ceph appropriate for transactional database storage? How is the IO latency? I'd like to move to a cheaper clustered file system that can compete with systems like Oracle's clustered file system (OCFS) or DBs backed by something like Veritas. Veritas supports multi-petabyte DBs, and I haven't seen much outside of it or OCFS that scales similarly with acceptable latency.

2 comments

antongribok 878 days ago

Not sure about putting DBs on CephFS directly, but Ceph RBD can definitely run RDBMS workloads.

You need to pay attention to the kind of hardware you use, but you can definitely get Ceph down to 0.5-0.6 ms latency on block workloads doing single-threaded, single-queue, synchronous 4K writes.

Source: I run Ceph at work doing pretty much this.
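
For readers who want to see what a "single thread, sync 4K write" measurement looks like in practice, here is a minimal Python sketch. It is not the commenter's method and not a substitute for fio: the function name `sync_write_latencies_ms` and its parameters are made up for illustration, it writes sequentially rather than randomly, and it does not bypass the page cache for reads. It only times one durable 4 KiB write at a time.

```python
import os
import time


def sync_write_latencies_ms(path, count=100, block_size=4096):
    """Time `count` synchronous writes of `block_size` bytes, one at a time."""
    # O_DSYNC makes each write return only once the data is durable.
    # It is not available on every platform, so fall back to fsync there.
    dsync = getattr(os, "O_DSYNC", 0)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | dsync, 0o600)
    payload = b"\0" * block_size
    latencies = []
    try:
        for i in range(count):
            start = time.perf_counter()
            # One outstanding write at a time, at consecutive offsets.
            os.lseek(fd, i * block_size, os.SEEK_SET)
            os.write(fd, payload)
            if not dsync:
                os.fsync(fd)
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return latencies
```

Pointing `path` at a file on a filesystem that sits on the block device under test gives a rough per-write latency list in milliseconds; the filesystem layer adds its own overhead, so treat the result as an upper bound on the device's latency.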

patrakov 878 days ago

It is important to specify which latency percentile this is. Checking on a customer's cluster (made from 336 SATA SSDs in 15 servers, so not the best one in the world):

  50th percentile = 1.75 ms
  90th percentile = 3.15 ms
  99th percentile = 9.54 ms

That's with 700 MB/s of reads and 200 MB/s of writes, or approximately 7000 read IOPS and 9000 write IOPS.
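
As a rough illustration of the two calculations behind these figures, the sketch below computes a nearest-rank percentile from raw latency samples and the average IO size implied by a throughput/IOPS pair. The function names are hypothetical, the nearest-rank method is one common definition (monitoring tools may interpolate instead), and MB is assumed to mean 10^6 bytes.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    ordered = sorted(samples)
    # Rank is the smallest index covering at least p percent of the samples.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def average_io_size_kb(throughput_mb_per_s, iops):
    """Average bytes per operation, in KB (assuming 1 MB = 1000 KB)."""
    return throughput_mb_per_s * 1000.0 / iops


print(average_io_size_kb(700, 7000))            # 100.0
print(round(average_io_size_kb(200, 9000), 1))  # 22.2
```

Under those assumptions the quoted load averages about 100 KB per read and about 22 KB per write, so these percentiles likely describe a mix of IO sizes larger than the 4K writes mentioned upthread.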

louwrentius 878 days ago

These numbers may be good enough for your use case, but compared with what's possible with SSDs they aren't great. Please note, I mean well. Still a cool setup.

I'd like to see much more consistent latency, with even the 99th percentile below 1 ms. You might want to set a latency target with fio and see how much load is possible before the 99th percentile hits 1 ms.

However, I can say all of this but it’s all about context and depending on workload your figures may be totally fine.
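
The "raise the load until the 99th percentile hits 1 ms" idea can be sketched as a simple ramp. fio documents job options named `latency_target`, `latency_window`, and `latency_percentile` for this kind of test; the Python below is only a generic illustration of the search, with a made-up `fake_p99_ms` function standing in for a real benchmark run.

```python
def max_load_under_target(measure_p99_ms, loads, target_ms=1.0):
    """Return the highest load in `loads` (ascending) whose p99 stays below
    `target_ms`, or None if even the lowest load misses the target."""
    best = None
    for load in loads:
        if measure_p99_ms(load) >= target_ms:
            break  # latency target exceeded; stop ramping
        best = load
    return best


# Hypothetical stand-in for a real measurement: p99 grows with offered IOPS.
def fake_p99_ms(iops):
    return 0.4 + iops / 10000.0


print(max_load_under_target(fake_p99_ms, [1000, 2000, 4000, 8000]))  # 4000
```

In a real test, `measure_p99_ms` would run the benchmark at the given load for long enough to collect a stable percentile before moving to the next step.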

samcat116 878 days ago

Latency is quite poor; I wouldn't recommend running high-performance database loads there.

louwrentius 878 days ago

From my dated experience, Ceph is absolutely amazing, but latency is indeed a relatively weak spot.

Everything has a trade-off and for Ceph you get a ton of capability but latency is such a trade-off. Databases - depending on requirements - may be better off on regular NVMe and not on Ceph.

yencabulator 877 days ago

It's pretty unfair to compare latency of a local NVMe SSD to over-the-network 3x replicated storage. "It's faster if I do less."

[Disclaimer: ex-Inktank employee]

e12e 877 days ago

No, it's important when planning, e.g.: one big database cluster that provides DB-as-a-service (but maybe needs some dedicated ops resources) vs. smaller DBs with virtualized storage on Ceph (ops resources for the Ceph cluster and VM tools like k8s).

If the latter is too slow for your typical usage...

yencabulator 877 days ago

Oh, don't get me wrong, you will pay a price for disaggregated highly available storage, and you might need to evaluate whether you want to pay that price or not. But those are two very different worlds, and only one of them gives you elastic disk size, replication, scale-out throughput, and so on.

GP makes Ceph sound worse than it is, when the reality is that just shoving all your reads & writes over the network, with writes sent multiple times because of replication, is gonna cost you no matter what tech you build that with.

louwrentius 877 days ago

I don’t think it’s unfair, there are applications that still are ok with Ceph latencies: I bet it’s good enough for a ton of things.

But not all things.
