Hacker News
by commandlinefan 1859 days ago
> extremely unfortunate implementation (I probably spent more time in the Hadoop codebase

Well, in fairness, have you ever seen the S3 codebase? I mean, honestly, it could be a fork of HDFS for all we know.


I used to work for Amazon. The code quality at places like Google and Amazon tends to be good.

S3 has a really good architecture and a great implementation.

HDFS has a meh architecture with a bad implementation.

There were obvious signs. I remember when Twitter investigated why HDFS was slow and found that the Hadoop developers had implemented their own dictionary for configuration, one with much worse time complexity than Java's default dictionary. There might be a video about this somewhere.
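
To make the complexity point concrete, here is a minimal Java sketch. It is not Hadoop's actual code and I am not claiming this is what Twitter found: `LinearConfig` and `ConfigLookupSketch` are hypothetical names, and the linear scan is only an assumed stand-in for "a custom dictionary with worse lookup cost". It just shows how a hand-rolled lookup can cost O(n) per key where `java.util.HashMap` is expected O(1).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a hand-rolled configuration dictionary.
// NOT Hadoop's real code; the linear scan is an assumption for illustration.
final class LinearConfig {
    private final List<String[]> entries = new ArrayList<>();

    // Overwrites an existing key or appends a new {key, value} pair.
    void set(String key, String value) {
        for (String[] entry : entries) {
            if (entry[0].equals(key)) {
                entry[1] = value;
                return;
            }
        }
        entries.add(new String[] {key, value});
    }

    // O(n) per lookup: walks the list until the key matches.
    String get(String key) {
        for (String[] entry : entries) {
            if (entry[0].equals(key)) {
                return entry[1];
            }
        }
        return null;
    }
}

final class ConfigLookupSketch {
    // Fills both structures with n keys, reads every key back once from each,
    // and returns {linearNanos, hashMapNanos}.
    static long[] timeLookups(int n) {
        LinearConfig linear = new LinearConfig();
        Map<String, String> hashed = new HashMap<>();
        for (int i = 0; i < n; i++) {
            linear.set("key." + i, "v" + i);
            hashed.put("key." + i, "v" + i);
        }
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            linear.get("key." + i);
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            hashed.get("key." + i);
        }
        long t2 = System.nanoTime();
        return new long[] {t1 - t0, t2 - t1};
    }
}
```

Calling `ConfigLookupSketch.timeLookups(5000)` and comparing the two numbers should show the gap widening as n grows, but the exact figures depend on the JVM and warm-up, so treat it as a rough demonstration and not a benchmark.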

And there are more things like that. I used to have HDFS Jira tickets that had been open for 5-10 years. I just gave up.

Here is a video:

https://www.youtube.com/watch?v=jupArYWxoq0

Hadoop is full of these things.

One more thing:

https://lamport.azurewebsites.net/tla/formal-methods-amazon....

I would love to see a similar approach applied to Hadoop.