by darkxanthos 4711 days ago
If you have the cash and analytics aren't a real, distinct business advantage for you, just go with Mixpanel.

If you decide to do it yourself, this is what I've done: create a small web service that you can call to log data from the UI. Start with one server, and if it starts to go over 60-80% usage consistently, add a second.
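To make the "small web service" idea concrete, here is a minimal sketch in Python using only the standard library (I ran mine on Ruby/Sinatra; this is just an illustration, not that code). The `/log` path, the JSON body, and the in-memory `EVENTS` list are all assumptions made up for the example; a real service would write each event to a file instead.

```python
import json

# Hypothetical in-memory sink, standing in for the flat log file.
EVENTS = []

def log_app(environ, start_response):
    """WSGI app: accept POST /log with a JSON body and record the event."""
    is_post = environ.get("REQUEST_METHOD") == "POST"
    if not is_post or environ.get("PATH_INFO") != "/log":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        event = json.loads(environ["wsgi.input"].read(length) or b"{}")
    except ValueError:
        # Covers a bad Content-Length as well as malformed JSON.
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"bad json"]
    EVENTS.append(event)
    # Nothing to send back; the UI only needs to know the call succeeded.
    start_response("204 No Content", [])
    return []
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, log_app).serve_forever()` should be enough; for real traffic you would put it behind a proper WSGI server.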

The server should log every call to the service in a large flat file (CSV is easiest). Name the file by date and time, down to the minute. As you scale up servers, you just have a process pull down each file and aggregate them server-side. Or just throw them into S3 and use Hive/EMR to report on the data.
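A rough sketch of the per-minute CSV files and the aggregation step, again in Python with only the standard library. The `events-YYYYMMDD-HHMM.csv` naming, the three-column row layout (timestamp, event name, user id), and the function names are assumptions for illustration; the one-directory-per-server layout stands in for whatever your pull-down process produces.

```python
import csv
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path

def log_path(log_dir, when):
    """One file per minute, e.g. events-20130101-1205.csv (assumed naming)."""
    return Path(log_dir) / when.strftime("events-%Y%m%d-%H%M.csv")

def append_event(log_dir, name, user_id, when=None):
    """Append one row to the CSV file for the minute the event occurred in."""
    when = when or datetime.now(timezone.utc)
    path = log_path(log_dir, when)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", newline="") as f:
        csv.writer(f).writerow([when.isoformat(), name, user_id])
    return path

def count_events(log_dirs):
    """Aggregate files pulled down from each server: event name -> count."""
    counts = Counter()
    for log_dir in log_dirs:
        for path in sorted(Path(log_dir).glob("events-*.csv")):
            with path.open(newline="") as f:
                for row in csv.reader(f):
                    counts[row[1]] += 1
    return counts
```

Because each file covers one minute, any file older than the current minute is finished and safe to copy off the server or upload to S3.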

It's a middle-class man's Mixpanel. I served tens of millions of logging events a day with this solution. At the time the cost was somewhere around $1,500 a month, I believe. I was running 6 servers on Ruby/Sinatra, though, and never tried to optimize much.

EDIT: typo

1 comment

I guess the part I'm really keen to set up is the S3 + Hive/EMR part. But it sounds like it will be a bit too expensive to have up-to-date stats, and better to just do batch processing to run reports.