by manigandham 3182 days ago
Persistent storage remains a complicated problem. Attaching volumes on the fly with Docker's volume abstraction works well enough for most cloud workloads, whether on-demand or spot, but it's still easy to run into problems.

This is leading to rapid progress in clustered/distributed filesystems, and one is even built into the Linux kernel now: OrangeFS [1]. There are also commercial companies like Avere [2] that make filers which run on object storage, with sophisticated caching to provide a fast, networked, but durable filesystem.

Kubernetes is also changing the game with container-native storage. This seems to be the most promising model for the future, as K8S can take care of orchestrating all the complexities of replicas and stateful containers, while storage is just another container-based service using whatever volumes are available to the nodes underneath. Portworx [3] is a great commercial option today, with Rook and OpenEBS [4] catching up quickly.

1. http://www.orangefs.org

2. http://www.averesystems.com/products/products-overview

3. https://portworx.com

4. https://github.com/openebs/openebs

2 comments

Also want to highlight that AWS will now allow spot instances to just be stopped instead of terminated, so only compute power is removed but data is persisted automatically as long as you use EBS root/attached volumes.

https://aws.amazon.com/about-aws/whats-new/2017/09/amazon-ec...
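
For anyone wanting to try the stop-instead-of-terminate behavior, here is a rough sketch of the request parameters in Python. It only builds the dictionary that would be handed to the EC2 RunInstances API (for example via boto3's `run_instances`); it makes no AWS call itself. The helper name `spot_stop_launch_params`, the AMI ID, the instance type, and the `/dev/xvda` root device name are all placeholders I made up for illustration, so check them against your own AMI.

```python
import json


def spot_stop_launch_params(ami_id, instance_type, volume_gib=50):
    """Build keyword arguments for an EC2 RunInstances call that asks for a
    spot instance which is stopped, not terminated, on interruption.

    Hypothetical helper for illustration; it does not contact AWS.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {
                # The "stop" behavior is only accepted for persistent requests.
                "SpotInstanceType": "persistent",
                "InstanceInterruptionBehavior": "stop",
            },
        },
        "BlockDeviceMappings": [
            {
                # Assumed root device name; the real one depends on the AMI.
                "DeviceName": "/dev/xvda",
                "Ebs": {"VolumeSize": volume_gib, "DeleteOnTermination": False},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder AMI ID and instance type, not real recommendations.
    params = spot_stop_launch_params("ami-00000000", "m5.large")
    print(json.dumps(params, indent=2))
    # With boto3 installed and credentials configured, this could be used as:
    #   boto3.client("ec2").run_instances(**params)
```

Keeping the parameters in a plain function makes them easy to inspect before anything is launched.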

Using a clustered/distributed filesystem definitely simplifies persisting state between EC2 spot instances. It also makes it easier to scale out the workload when you need more instances accessing the same data. To add to your list: there is also ObjectiveFS [1], which integrates well with AWS (uses S3 for storage, works with IAM roles, etc.) and EC2 spot instances.

[1]. https://objectivefs.com

This looks very interesting, good competition to Avere based on info so far. Is there any native Kubernetes integration in the works?
We are looking into the best way to add native Kubernetes support. Currently, you can add a mount on the host or mount the file system directly inside the container. Both approaches work well, so it mainly depends on your preferred architecture.
A persistent volume provider would be great: https://kubernetes.io/docs/concepts/storage/persistent-volum...

This makes it easy to declare the volume as part of the deployment and automatically attach storage when the container is run. Mounting on the host isn't very easy (or sometimes even possible), especially with spot/preemptible instances and the increasing abstraction by managed K8S providers. The pricing model might need to be different, though, if billing happens at the container-mount level.
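
To make "declare the volume as part of the deployment" concrete, here is a minimal sketch that generates a PersistentVolumeClaim and a pod that mounts it, printed as JSON that `kubectl apply -f -` can read. The storage class name `shared-fs`, the image, and the resource names are hypothetical; no such provisioner is implied to exist for any of the products mentioned in this thread.

```python
import json


def pvc_manifest(name, storage_class, size="10Gi"):
    """Claim storage from a (hypothetical) storage class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany lets several pods share one filesystem,
            # provided the underlying provisioner supports it.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def pod_manifest(name, image, claim_name, mount_path="/data"):
    """Pod that mounts the claim; no host-level mount is needed."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "volumeMounts": [
                        {"name": "shared-data", "mountPath": mount_path}
                    ],
                }
            ],
            "volumes": [
                {
                    "name": "shared-data",
                    "persistentVolumeClaim": {"claimName": claim_name},
                }
            ],
        },
    }


if __name__ == "__main__":
    # All names below are placeholders chosen for this example.
    bundle = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            pvc_manifest("worker-data", "shared-fs"),
            pod_manifest("worker", "busybox", "worker-data"),
        ],
    }
    print(json.dumps(bundle, indent=2))
```

The pod only names the claim, so the scheduler can place it on any node, including a replacement spot instance, and the storage follows it.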