by bogomipz 2394 days ago
Does Rook give you the equivalent of EBS root volumes for your nodes then? Is that the function you have it providing? Does it offer something beyond using local host storage and minio?

I ask because I've generally been confused about the use case for Rook despite having read the "what is Rook?" paragraph many times on the project home page. My assumption is that it lets you build your own internal cloud provider. Is that correct?

https://rook.io/docs/rook/v1.1/

Rook will give you:

- Ceph: Block Storage, Shared storage, Object storage

- EdgeFS: Object but apparently will function as Block and File

- Minio: Object

- NFS4: Shared

- Cockroach: Database

- Cassandra: Database

- YugabyteDB: distributed SQL database (new to me)

Only the first two are marked as stable.

I'll mention that they're all independent projects under Rook's umbrella.

Not directly; Rook provides a storage backend for the persistent volumes of in-cluster workloads.

In theory you could manage KVM machines with KubeVirt and back the machines with PVCs from Rook. I have not tried this and would be curious how the performance is.

I would say they're functionally quite equivalent, you just wouldn't pull them in as a root volume.

You should be able to point your default StorageClass at Ceph and have it create RBD block storage devices for you. Those get provisioned automatically for your PVCs, which you then mount for your actual data in your kube manifests.
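To make the default StorageClass idea concrete, here is a rough Python sketch that builds the two manifests as plain dicts and prints them as JSON (which `kubectl apply -f -` accepts). The names `rook-ceph-block`, `replicapool`, and `app-data` are hypothetical, and the provisioner string and parameters are assumptions based on the Rook v1.1 Ceph CSI examples. A real class needs further parameters (CSI secret references, filesystem type), so check the Rook docs for your version.

```python
import json

# Assumption: provisioner and parameters follow the Rook v1.1 Ceph CSI examples.
# A working class needs more parameters (e.g. CSI secret references).
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {
        "name": "rook-ceph-block",  # hypothetical name
        "annotations": {
            # Standard Kubernetes annotation that marks the cluster default class.
            "storageclass.kubernetes.io/is-default-class": "true",
        },
    },
    "provisioner": "rook-ceph.rbd.csi.ceph.com",  # assumption, verify for your install
    "parameters": {"clusterID": "rook-ceph", "pool": "replicapool"},  # hypothetical values
    "reclaimPolicy": "Delete",
}

# The claim names no storageClassName, so the default class above is used
# and an RBD-backed volume is provisioned for it on demand.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},  # hypothetical name
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "10Gi"}},
    },
}


def render(*manifests):
    """Wrap manifests in a v1 List so they can be applied in one go."""
    return json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": list(manifests)}, indent=2
    )


if __name__ == "__main__":
    print(render(storage_class, pvc))
```

Piping the output into `kubectl apply -f -` would create both objects. A pod then refers to the claim by name under `volumes`.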

The parent comment was asking about root volumes for kubernetes nodes

> Does Rook give you the equivalent of EBS root volumes for your nodes then?

I think we are saying the same thing, if I'm not mistaken?

I doubt the parent really cares about whether or not it's a root volume, but for the record you can mount it to root.

It'll just essentially be empty, which means you have to add a step of populating root with something useful. You'd have to either use an initContainer to copy in a filesystem at runtime, have Ceph give you a pre-populated directory (probably via thin provisioning), or do something out of band (I've done this in k8s).
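The initContainer option could look roughly like the sketch below, which builds a Pod manifest as a Python dict. It assumes a hypothetical seed image that ships the files under `/seed` and has `sh`, `cp`, and `touch` available. The image names, the `/data` mount path, and the `.seeded` marker file are all made up for illustration.

```python
import json


def pod_with_seeded_volume(claim_name, seed_image, app_image):
    """Build a Pod whose initContainer fills an empty PVC before the app starts."""
    mount = {"name": "data", "mountPath": "/data"}
    # Use a marker file instead of an emptiness check: a freshly formatted
    # ext4 volume already contains lost+found, so it is never truly empty.
    seed_cmd = "[ -e /data/.seeded ] || { cp -a /seed/. /data/ && touch /data/.seeded; }"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "seeded-app"},  # hypothetical name
        "spec": {
            "initContainers": [
                {
                    "name": "seed",
                    "image": seed_image,
                    "command": ["sh", "-c", seed_cmd],
                    "volumeMounts": [mount],
                }
            ],
            "containers": [
                {"name": "app", "image": app_image, "volumeMounts": [mount]}
            ],
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }


if __name__ == "__main__":
    pod = pod_with_seeded_volume("app-data", "example/seed:1", "example/app:1")
    print(json.dumps(pod, indent=2))
```

The marker file makes the copy run once per volume rather than on every pod restart.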

This is usually more effort than it's worth though, as container runtimes populate the root for you with whatever you want anyway. Plus, if you start treating containers as blobs like VMs, you'll end up in the situation where you don't know where your important variable data is, which leads to situations where people forget to back it up and test it.

I only said "root volumes" as it's a common use case for EBS volumes. For instance, with etcd running in a container you would want the host volume to be an EBS volume, since it's a critical K8s component.

Right, but you would only need to mount the etcd data partition (/var/lib/etcd afaik), rather than the entire node and/or container.

The main problem you have here is chicken and egg: how do you use a StorageClass, Kubernetes PVC, or Rook to provision block storage for etcd, when you need etcd for Kubernetes to function, and you need Kubernetes to function for Rook, et al.?

At some point you need to bootstrap the world, which is why people start off with either cloud APIs, Ansible, or PXE.
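The chicken-and-egg problem is a cycle in the dependency graph, which a toy sketch can show using Python's standard `graphlib` (3.9+). The component names are just labels for illustration, with `external-disk` standing in for whatever the cloud API, Ansible, or PXE step provides.

```python
from graphlib import CycleError, TopologicalSorter


def boot_order(deps):
    """Return a valid start order, or None if the dependencies are circular."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# Each key depends on the components in its set.
# etcd on a Rook volume: etcd needs Rook, Rook needs Kubernetes, Kubernetes needs etcd.
circular = {
    "etcd": {"rook-volume"},
    "rook-volume": {"rook"},
    "rook": {"kubernetes"},
    "kubernetes": {"etcd"},
}

# Bootstrapped out of band: etcd sits on a disk provisioned before the cluster exists.
bootstrapped = {
    "external-disk": set(),
    "etcd": {"external-disk"},
    "kubernetes": {"etcd"},
    "rook": {"kubernetes"},
    "app-volume": {"rook"},
}

if __name__ == "__main__":
    print(boot_order(circular))
    print(boot_order(bootstrapped))
```

The first graph has no valid start order. The second does, once something outside the cluster provides the first disk.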

>"The main problem you have here is chicken and egg: how do you use a StorageClass, Kubernetes PVC, or Rook to provision block storage for etcd, when you need etcd for Kubernetes to function, and you need Kubernetes to function for Rook, et al.?"

I totally agree. The EBS lifecycle management is generally handled by something like Terraform. That's why I was wondering if the use case for Rook is primarily bare-metal Kubernetes, since AWS/GCP et al. already provide these. Even in a bare-metal environment you still need config management tools like Ansible/Terraform to do things like provision block storage, so what's the upside of Rook over existing iSCSI/Ceph/minio installations?

>"Not directly, rook provides a storage backend for in-cluster workload persistent volumes."

Right, so is it fair to say that the use case for Rook is if you are running Kubernetes on bare metal? For instance, if I'm running a Kubernetes cluster on AWS then AWS already provides PVC via EBS and S3 volumes. Or am I overlooking a use case where you would run Rook on a cluster running on AWS/GCP?

Yes, rook is great for bare metal and would enable dynamically provisioned persistent volumes.

Rook runs a ceph cluster inside your kubernetes cluster to provide the storage. The downside is this consumes cpu/memory (and obviously storage) resources to run, whereas on AWS, EBS is integrated into the platform so it does not "run" inside your cluster (other than the aws cloud-provider that provides the integration).

If you wanted to run the same storage backend on bare metal and AWS, rook would enable that.