by kozyraki 4081 days ago
All of us who have used Borg over the years are very appreciative of the technology and its capabilities. Congrats to all the Googlers, current and ex, who contributed to it. This paper should be near the top of the reading list for anyone working on the topic.

Nevertheless, there are many open questions in large-scale cluster management for researchers and developers to address. Here are some of my favorites:

- The curse of overprovisioning: Borg and many other systems rely on reservations, which are systematically exaggerated by users. Right-sizing these reservations is one way to go beyond the 40-50% usage shown in the Borg paper (see fig 12). A promising way of doing this is Christina Delimitrou's work using classification techniques (see http://goo.gl/vFf8oN).

- Oversubscription using better isolation mechanisms: this is what the Borg paper calls resource reclamation. Take unused (but reserved) resources from priority jobs and use them for best-effort analytics. David Lo (http://web.stanford.edu/~davidlo/) has a very interesting paper coming up on how to coordinate cpu sets, cache partitioning, Linux TC, and RAPL/DVFS (power management) to run websearch clusters at >90% utilization by packing them with analytics without causing ANY glitch on search. And that is Google search.
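To make the reclamation idea concrete, here is a minimal Python sketch of the arithmetic. It is not Borg's actual algorithm: the `Task` type, the `reclaimable_cpus` function, the 20% safety margin, and the job names and numbers are all made up for illustration. It just adds up CPUs that are reserved but sit idle beyond a margin above observed usage.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str             # hypothetical job name
    reserved_cpus: float  # what the user asked for
    used_cpus: float      # recently observed usage


def reclaimable_cpus(tasks, safety_margin=0.2):
    """Sum reserved-but-idle CPUs, protecting a margin above observed usage.

    safety_margin is an assumed knob: 0.2 keeps 20% headroom over usage.
    """
    total = 0.0
    for t in tasks:
        # Never protect more than the task reserved in the first place.
        protected = min(t.reserved_cpus, t.used_cpus * (1 + safety_margin))
        total += t.reserved_cpus - protected
    return total


# Made-up numbers: one heavily overprovisioned job, one nearly full job.
tasks = [Task("websearch", 16.0, 6.0), Task("ads", 8.0, 7.5)]
spare = reclaimable_cpus(tasks)
print(f"CPUs available for best-effort work: {spare:.1f}")
```

The hard part in practice is not this subtraction but handing the capacity back fast enough when the priority job spikes, which is where the isolation mechanisms come in.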

There are definitely more interesting questions. Exciting times.

1 comment

> http://goo.gl/vFf8oN

goes to: http://web.stanford.edu/~cdel/2014.asplos.quasar.pdf

It's my understanding that URL shorteners are frowned upon in HN posts or comments.

I learn something every day :)