by gargantian 4201 days ago
I'm not seeing what this has to do with Docker, other than the planned future ability to run M/R jobs inside Docker containers. That's an interesting idea, but I don't think enforcing Docker has any advantages; if you have a clean and simple API like REST or even plain pipes, you open yourself up to a whole new world of composability without needing something relatively heavyweight like Docker.

Additionally, you seem to be leaning on CoreOS at the moment. That seems a dangerous dependency considering the CoreOS/Docker relationship.


> if you have a clean and simple API like REST or even plain pipes, you open yourself up to a whole new world of composability

Totally agree with this, and that's one of the core tenets of our API design. We should probably be clearer about how Docker fits with pfs. Our APIs are all designed as RESTful services to allow for composability; however, we want to take a batteries-included-but-removable approach. In our case Docker is a battery. We want it to be there so that users have a really easy primitive to implement M/R jobs with. But we recognize it might not be for every user, so we also want to allow people to put anything they want there. I think the easiest way would be just letting people pass an arbitrary endpoint to be used in an M/R job.
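
To make the "arbitrary endpoint" idea concrete, here is a rough sketch of how it might look. It is purely illustrative and not the pfs API. `WordCountMapper`, `run_job`, `demo`, and the `/map` path are hypothetical names, and the protocol is an assumption: POST a chunk of text, get back a JSON object of counts.

```python
import json
import threading
import urllib.request
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer


class WordCountMapper(BaseHTTPRequestHandler):
    """Hypothetical user-supplied map endpoint: POST text, receive word counts as JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        text = self.rfile.read(length).decode("utf-8")
        body = json.dumps(Counter(text.split())).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging for the example


def run_job(endpoint, chunks):
    """Map: POST each chunk to the given endpoint. Reduce: sum the returned counts."""
    # Ignore any proxy settings from the environment so local endpoints are reached directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    total = Counter()
    for chunk in chunks:
        req = urllib.request.Request(endpoint, data=chunk.encode("utf-8"), method="POST")
        with opener.open(req) as resp:
            total.update(json.loads(resp.read()))
    return dict(total)


def demo():
    """Start the mapper on a free local port and run a two-chunk job against it."""
    server = HTTPServer(("127.0.0.1", 0), WordCountMapper)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        endpoint = f"http://127.0.0.1:{server.server_port}/map"
        return run_job(endpoint, ["a b a", "b c"])
    finally:
        server.shutdown()
        server.server_close()
```

The point is that `run_job` only needs a URL. Whether the thing behind it is a Docker container, a plain process, or a remote service is up to the user.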

> Additionally, you seem to be leaning on CoreOS at the moment. That seems a dangerous dependency considering the CoreOS/Docker relationship.

I'm hopeful that both of these companies' commitment to a batteries-included-but-removable approach will make leveraging both ecosystems a realistic option. I agree that it would be a pain to have to pick one.

> In our case Docker is a battery.

Am I correct in assuming this means I can (eventually) use PFS without any dependency on Docker? In particular, I'm interested in knowing whether I can expect to be able to run PFS contained within an arbitrary unprivileged container and use my preferred orchestration around it. Or, is it the goal of PFS to take over the orchestration plane, or require CoreOS's? Or, something else?

> I think the easiest way would be just letting people pass an arbitrary endpoint to be used in an M/R job.

I love that idea.

We very much don't want to take over the orchestration plane. We'd much rather interoperate nicely with existing orchestration systems. We just want to add the ability to store and access large datasets within these existing systems.

Your ideal of being able to use an arbitrary unprivileged container and your preferred orchestration software is how I feel it should work eventually as well. Unfortunately, right now we have to target very specific environments so we can focus our development efforts, so "eventually" may take a little while.

It's great to hear about these concerns early on so thanks for taking the time to comment. I'll definitely make sure that we hold on to this as a core tenet.