Hacker News
by bijanv 4597 days ago
Taking dumps of analytics logs and pulling out relevant info for our customers on app usage
1 comment

This is the `grep/awk` use case. The nice thing about Hadoop's streaming MapReduce interface (which calls external programs as the mapper and reducer) is that you can literally take your grep/awk workflow and move it to the cluster. Keeping records line-oriented is a huge step toward a portable data processing workflow.
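
To make that concrete, here is a rough sketch of what a grep/awk-style job could look like as a streaming mapper and reducer. It's in Python only so the logic is easy to test locally. The log layout (whitespace-separated, app id in the first field), the `app_open` event name, and the function names are all made up for illustration, not taken from anyone's real logs.

```python
from itertools import groupby


def mapper(lines, pattern="app_open"):
    """grep + awk step: keep lines containing `pattern`, emit 'app_id<TAB>1'.

    Assumes a hypothetical log layout where the first whitespace-separated
    field is the app id.
    """
    for line in lines:
        if pattern in line:
            fields = line.split()
            if fields:
                yield f"{fields[0]}\t1"


def reducer(lines):
    """Sum the counts per key.

    Input must already be sorted by key, which is how a streaming
    reducer receives it after the shuffle/sort phase.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


# Local dry run: sorted() stands in for the cluster's shuffle/sort.
sample = [
    "app1 app_open 2013-01-01\n",
    "app2 app_close 2013-01-01\n",
    "app1 app_open 2013-01-02\n",
]
counts = list(reducer(sorted(mapper(sample))))
```

On a cluster, each function would be wrapped in a small script that reads `sys.stdin` and prints its output lines. Since both stages only ever see lines of text, the same pair can be checked on a laptop with a plain sort between them before it goes anywhere near Hadoop.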