Scylla 3.0 announced – Secondary Indexes, Cassandra 3.0 compatibility and more (scylladb.com)
46 points by mcms 2786 days ago
3 comments

The announcement is a little misleading — they're announcing two different things, not one:

* Scylla 3.0, which adds secondary indexes and materialized views; and

* OLTP and OLAP features, which are not ready.

I'm struggling to find any clear information on what's implied by OLTP, but from the roadmap [1], it looks like they're just adding Cassandra's LWTs, not ACID transactions. Last I heard, you couldn't build ACID on top of Cassandra/Scylla, since row updates across multiple keys cannot be done atomically. Calling this OLTP seems a little misleading.

[1] https://www.scylladb.com/product/technology/scylla-roadmap/

Actually, Scylla 3.0 is still an RC [1] and, as mentioned in the article, will be released later this month.

And I don't think OLTP/OLAP is about LWTs. I think they are just talking about Scylla's ability to process normal updates in a timely manner while doing OLAP queries.

And you're right about ACID transactions. Multiple rows across two or more partitions cannot be updated atomically, even with LWTs, but it is possible within a single partition.

[1] https://github.com/scylladb/scylla/releases
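
To make the single-partition point concrete, here is a minimal toy sketch in Python. It is an in-memory model, not Scylla or Cassandra code: the `Partition` and `Table` classes and the `update_if` method are hypothetical names invented for illustration. It assumes each partition can serialize its own conditional writes, which is roughly the guarantee being discussed.

```python
# Toy in-memory model (NOT Scylla/Cassandra code; all names are hypothetical).
# Each partition has its own lock, so a conditional update is atomic only
# within a single partition.
import threading


class Partition:
    def __init__(self):
        self._lock = threading.Lock()
        self.rows = {}

    def update_if(self, row_key, expected, new_value):
        """Compare-and-set on one row; returns True if the write was applied."""
        with self._lock:
            if self.rows.get(row_key) != expected:
                return False
            self.rows[row_key] = new_value
            return True


class Table:
    def __init__(self):
        self.partitions = {}

    def partition(self, partition_key):
        # Create the partition on first use, then reuse it.
        if partition_key not in self.partitions:
            self.partitions[partition_key] = Partition()
        return self.partitions[partition_key]


table = Table()
alice = table.partition("alice")
applied_first = alice.update_if("balance", None, 100)  # row absent, so applied
applied_stale = alice.update_if("balance", 50, 75)     # stale expectation, rejected
```

A transfer between the "alice" and "bob" partitions would need two separate `update_if` calls, and nothing in this model makes that pair atomic, which is the limitation described above.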

But OLTP is online transactional processing. Even with LWT, you arguably just can't do OLTP with Cassandra/Scylla, not without transactions.
OLTP traditionally refers to the type of updates you do during normal use of your application: frequent, small, incremental updates, as opposed to large sweeping reads and aggregations (OLAP).

The T does stand for transactional, but in this context it is not referring to transactions in the database sense.
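
A rough way to picture the two workload shapes, as a toy Python sketch. The `orders` data and both function names are made up for illustration and have nothing to do with Scylla's actual API.

```python
# Toy illustration only; the data and function names are hypothetical.
orders = {
    1: {"customer": "a", "total": 20},
    2: {"customer": "b", "total": 35},
    3: {"customer": "a", "total": 15},
}


def mark_shipped(order_id):
    """OLTP-style: a small update that touches one row by key."""
    orders[order_id]["shipped"] = True


def revenue_by_customer():
    """OLAP-style: scan every row and aggregate."""
    totals = {}
    for order in orders.values():
        customer = order["customer"]
        totals[customer] = totals.get(customer, 0) + order["total"]
    return totals


mark_shipped(2)
report = revenue_by_customer()
```

Neither function needs a multi-row transaction; the difference is only in how much data each one touches per request.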

[Scylla employee] We’re bootstrapping our LWT implementation now and we do want to take it much further, with multi-partition transactions.
I was looking into Scylla yesterday, so this is really cool timing! Does anyone have any experience with just swapping out (Apache) Cassandra for Scylla?
I've managed medium-sized Cassandra clusters (~18 nodes) for a couple of years (since v1.2). At a new job last year, I replaced the Cassandra cluster with Scylla. Performance was great, as it says on the tin, but as an operator:

- No GC tuning. It just works out of the box and I don't lose sleep over requests being dropped because of a multi-second GC.

- Compactions are fast. Compactions in Cassandra generate a lot of garbage (which causes a lot of GC), and end up taking a long time. This becomes worrisome as your largest, most compacted tables tend to be the ones you write to a lot, so performance degrades over time (until you have to shard again).

Cassandra is still very good software though, and still a solid choice. It was far more stable than Mongo (pre-WiredTiger; I haven't used Mongo past ~2.2), and while the above issues were annoying, they were entirely predictable. Capacity planning was easy, and with Scylla you may find yourself running fewer nodes (maybe DSE prefers Java because they charge per core, jk).

The only thing I've missed in Scylla so far is LWTs, but I have not used that much.

When you did the switch did you just swap out the software on each node, or did you provision new nodes with Scylla, adding them to the cluster, and removing old nodes with Cassandra?
I don't believe you can have a mixed Scylla/Cassandra cluster. In our case we cloned our drives and started a whole new cluster with Scylla. Once we were happy with Scylla and had done the appropriate testing, we shut down the Cassandra cluster.
[Scylla employee] Indeed a mixed cluster isn’t supported, since the inter-node protocols are incompatible.
Yes. It's much faster, requires very little maintenance and no tuning. Run repairs regularly and that's it. It was missing some Cassandra features, but it looks like 3.0 is finally at parity, according to this press release, so we look forward to upgrading.

EDIT: looks like LWTs are still unavailable https://github.com/scylladb/scylla/issues/1359

I can't directly answer your migration question, but regarding the comparison: our team was looking heavily into Cassandra when someone proposed Scylla. We immediately dove into it and were impressed that their claim of being a drop-in replacement for Cassandra is, as far as our uses go, totally true.

We're using it now as we approach Alpha, and love it. Extra bonus - no Zookeeper!

Just wondering, how did you replace Zookeeper with Scylla? AFAIK LWTs are not yet available in Scylla.
Wow, this team just continues to eat DataStax' lunch.
In which ways?
Cassandra is a great database design, but the implementation has never really matured into a polished product. The DataStax distribution was a nice package but suffered from complexity and poor performance. The political battles also hurt, producing even more variations of the database, which confused users.

Cassandra is even starting to use some of the architectural ideas from Scylla, but it's stuck in a messy state so far. Scylla has the opportunity to clean up and make forward progress with a solid technical foundation.

DataStax has completely dropped the ball with Cassandra. Years of unusable builds because of decisions they couldn't even begin to satisfy or test properly.