by viana007 3692 days ago
The biggest problems with Piwik are that it is not scalable and the cost ($) of the servers needed to store analytics data. I have seen many cases where Piwik could not handle a large volume of data.

Piwik is scalable up to at least 1 billion actions per month. Piwik is scalable! But it is not cheap, as one needs powerful database servers with a lot of RAM and fast SSD disks. It can be costly, but Piwik scales!
Why isn't it scalable?
MySQL
It can scale pretty far before it becomes an issue. Years ago I ran it on our main DB for sites that got over 1 million visits a month with no noticeable overhead. If I had it on its own server with its own DB it could have handled far more traffic.
I see. What is the upper bound of "big volume of data" possible under MySQL + Piwik?
That's a difficult thing to answer. However, the more important problem is loss of data whenever your DB isn't available due to downtime, upgrades, etc. It depends on how important data loss is for your use case. I'm a data completist but I'm in therapy for it ;)

It's definitely worth playing with, and trivially easy to spin up. Other self-hosted options aren't anything like as simple to get up and running.

Well, but the point was MySQL + Piwik "does not scale" and that it's "expensive" besides, which doesn't comport with my experience and sounds like received wisdom.
I think if your database is down you've typically got bigger problems than your analytics.
I've worked on Piwik servers tracking & processing reports for up to 800 million pageviews per month on hundreds of medium and larger websites.
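
To put the monthly figures quoted in this thread on a common scale, here is a rough back-of-the-envelope conversion to average events per second. It assumes a 30-day month and perfectly even traffic, which real sites never have, so treat the results as a floor rather than a sizing guide. The function name is just for illustration.

```python
# Convert the monthly volumes mentioned in the thread to average rates.
# Assumption: a 30-day month with traffic spread evenly over every second.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def per_second(monthly_total):
    """Average events per second for a given monthly total."""
    return monthly_total / SECONDS_PER_MONTH


for label, monthly in [
    ("1 billion actions", 1_000_000_000),
    ("800 million pageviews", 800_000_000),
    ("1 million visits", 1_000_000),
]:
    print(f"{label} per month is about {per_second(monthly):.1f} per second")
```

That works out to roughly 386 and 309 writes per second on average for the two large figures, and well under one per second for a million visits a month. Peaks will be several times higher than the average.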
Put Rabbit or Kafka in front of MySQL then?
They have a Redis-based solution to sit in front of MySQL (seems to be a "Pro" feature though):

https://piwik.org/faq/how-to/faq_19738/

It's not a Pro feature as in you have to pay for it. It's developed by the devs at piwik.pro, but you can download it for free.
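
For anyone wondering what "put a queue in front of the database" buys you, here is a minimal sketch of the general idea. It is not how the Piwik plugin works internally. Assumptions: an in-memory deque stands in for Redis/RabbitMQ/Kafka, sqlite3 stands in for MySQL, and the `TrackingBuffer` class and `hits` table are made up for illustration. A real queue would also persist to disk, which this does not.

```python
import sqlite3
from collections import deque


class TrackingBuffer:
    """Holds tracking hits until the database accepts them (illustrative only)."""

    def __init__(self):
        self.pending = deque()

    def enqueue(self, url, visitor_id):
        # The tracking request can return right after this; no DB round trip.
        self.pending.append((url, visitor_id))

    def flush(self, connect, batch_size=100):
        """Write up to batch_size hits in one transaction; keep them on failure.

        `connect` is a callable returning a DB connection (hypothetical hook).
        Returns the number of hits written.
        """
        count = min(batch_size, len(self.pending))
        batch = [self.pending.popleft() for _ in range(count)]
        if not batch:
            return 0
        try:
            conn = connect()
            with conn:  # commits on success, rolls back on error
                conn.executemany(
                    "INSERT INTO hits (url, visitor_id) VALUES (?, ?)", batch
                )
        except sqlite3.Error:
            # Database unavailable: put the batch back in its original order.
            self.pending.extendleft(reversed(batch))
            return 0
        return len(batch)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE hits (url TEXT, visitor_id TEXT)")

    buf = TrackingBuffer()
    buf.enqueue("/home", "v1")
    buf.enqueue("/pricing", "v1")
    print("written:", buf.flush(lambda: db))
```

The point is that tracking requests only touch the queue, so a database outage or upgrade delays the data instead of dropping it, and inserts arrive in batches instead of one per pageview.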