


	by meritt 4425 days ago
	I've looked at and avoided doing anything serious with hdfs/mr for 6 years now. I'm glad some people are starting to realize that re-processing your entire dataset every single time you want to do something isn't very efficient. I'm still waiting for the lightbulb moment where the usefulness of it really makes sense to me. Can anyone point me to a book or blog that discusses good uses of hadoop/map-reduce?

2 comments

jgrahamc 4425 days ago

I'm waiting for the day people realize that materialized views in databases are awesome and decide to incorporate them into a framework.


meritt 4425 days ago

At least if you're using Oracle they are, as it supports auto-refreshing. Postgres has only had them since 9.3 (and they have to be refreshed manually). Meanwhile MySQL is still struggling with regular views.
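
To make "manually refreshed" concrete, here is a rough sketch using Python's built-in sqlite3 module. SQLite has no materialized views at all, so a plain table stands in for one; the names `events`, `totals_mv` and `refresh` are made up for illustration. In Postgres 9.3+ the refresh step would be a single `REFRESH MATERIALIZED VIEW` statement instead.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a base table and a precomputed summary."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, amount INTEGER)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)", [(1, 10), (1, 5), (2, 7)]
    )
    # Stand-in for a materialized view: the query result is stored as a table.
    conn.execute(
        "CREATE TABLE totals_mv AS "
        "SELECT user_id, SUM(amount) AS total FROM events GROUP BY user_id"
    )
    return conn


def refresh(conn):
    """Recompute the stored result from scratch (the 'manual refresh')."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM totals_mv")
        conn.execute(
            "INSERT INTO totals_mv "
            "SELECT user_id, SUM(amount) FROM events GROUP BY user_id"
        )


def read_totals(conn):
    """Read the stored result without touching the base table."""
    return dict(conn.execute("SELECT user_id, total FROM totals_mv"))
```

Reads from `totals_mv` are cheap, but they go stale as soon as `events` changes and stay stale until someone calls `refresh`.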


zenjzen 4424 days ago

Simplistically speaking, you don't always have to do table scans. I run into this every day: "Let's use Hadoop and keep doing full table scans! It's scalable! We just add more machines!" Yeah, except continuing to scan all of your growing data each time you need it is inherently unscalable. :(
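
A toy sketch of the difference, in plain Python rather than Hadoop. It assumes an append-only list of `(key, value)` events; `full_scan_totals` and `IncrementalTotals` are hypothetical names, not anything from a real framework.

```python
from collections import defaultdict


def full_scan_totals(events):
    """Recompute every total by reading the whole dataset again."""
    totals = defaultdict(int)
    for key, value in events:
        totals[key] += value
    return dict(totals)


class IncrementalTotals:
    """Keep running totals and only read events that arrived since the last update."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.offset = 0  # number of events already folded into the totals

    def update(self, events):
        new_events = events[self.offset:]
        for key, value in new_events:
            self.totals[key] += value
        self.offset = len(events)
        return len(new_events)  # how many events this call actually processed
```

With the full scan, the work per query grows with the total size of the data. With the incremental version, it grows only with what is new since the last update, no matter how many machines you have.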
