by tabs_masterrace 1953 days ago
For those who have the freedom to decide what to use, I recommend looking into rolling your own, especially if you're writing back-end services for your app anyway. After Google shut down Fabric.io, we asked ourselves why we should even rely on 3rd parties for analytics. All we really wanted to know was basic usage statistics, like uniques, sessions, events. It turned out to be just a few days of work for what amounts to a CRUD service with a worker. Small bootstrapped HTML page to view the stats, no pretty graphs or anything, just numbers. The client code is around 300 lines, basically a simple network request queue. For comparison, the latest libGoogleAnalyticsServices.a comes in at ~35 MB (wtf?).
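To give a feel for what "a simple network request queue" can mean, here is a rough sketch in Python. It is not the actual client code from the comment above: the `EventQueue` class, the `send` callback and the batch size are all made-up names and values, and a real mobile client would also persist the queue to disk and flush on a timer.

```python
import json
import time
from collections import deque
from typing import Callable, Deque, Dict


class EventQueue:
    """Buffers analytics events and sends them in batches.

    Hypothetical sketch: `send` is any callable that delivers a JSON
    payload to your own back end and returns True on success.
    """

    def __init__(self, send: Callable[[str], bool], batch_size: int = 20) -> None:
        self._send = send
        self._batch_size = batch_size
        self._pending: Deque[Dict] = deque()

    def track(self, name: str, **props) -> None:
        """Queue one event; flush automatically once a full batch is waiting."""
        self._pending.append({"name": name, "ts": time.time(), "props": props})
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> int:
        """Send pending events batch by batch; return how many were delivered."""
        sent = 0
        while self._pending:
            size = min(self._batch_size, len(self._pending))
            batch = [self._pending[i] for i in range(size)]
            if not self._send(json.dumps(batch)):
                break  # delivery failed: keep events queued for the next flush
            for _ in range(size):
                self._pending.popleft()
            sent += size
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Events are only dropped from the queue after `send` reports success, so a flaky connection delays them rather than losing them (as long as the process stays alive).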
5 comments

Analytics tools like Amplitude give you vastly more and more-useful information than just sessions and events. Sure, it's one thing to capture events, but fast and easy search and segmentation of those events? Building ad-hoc funnels and visualizing behavior of cohorts over those funnels? This is real and significant value.

There's a reason that there are profitable companies dedicated to building stuff like this. It's really hard to get right in a scalable way.

Unless you're already a multi-billion dollar corp (and how did you get that big without analytics), it's a no-brainer - this is something to buy, not to build.

Agree, building this in 2021 is not a good use of data engineering time.

As well as the SaaS packages like Amplitude and Mixpanel, you also have great open-source tools and platforms for mobile and product analytics like PostHog (https://posthog.com/), Countly (https://count.ly/) and Snowplow (https://snowplowanalytics.com/).

Disclosure: Snowplow co-founder.

I regularly see a lot of GA analytics on HN.

How does your solution compare to Matomo? I feel the interface is extremely dated and not intuitive, but that might just be me.

Why is Matomo so rarely cited?

Cofounder of an analytics company says it's not worth building an analytics solution.

I welcome the competition.

While I do agree, this stuff is vastly more complex than just what the OP said.

But funnels are nothing more than how many people did Y after X.

If you record everything (say every single web request) you could easily do a query to find out that type of data and shove it in a table.
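As a rough illustration of the "did Y after X" query, here is a sketch using Python's built-in sqlite3. The `events` table, its columns and the sample rows are invented for the example; a real request log would need some mapping from raw requests to named events first.

```python
import sqlite3

# Hypothetical schema: one row per recorded event.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, name TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("u1", "signup", 10), ("u1", "purchase", 20),
        ("u2", "signup", 11),
        ("u3", "purchase", 5), ("u3", "signup", 12),  # purchase came first
    ],
)


def funnel(conn, first, then):
    """Return (users who did `first`, users who did `then` after `first`)."""
    reached = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM events WHERE name = ?",
        (first,),
    ).fetchone()[0]
    converted = conn.execute(
        """
        SELECT COUNT(DISTINCT a.user_id)
        FROM events a
        JOIN events b ON a.user_id = b.user_id AND b.ts > a.ts
        WHERE a.name = ? AND b.name = ?
        """,
        (first, then),
    ).fetchone()[0]
    return reached, converted


print(funnel(conn, "signup", "purchase"))  # (3, 1)
```

Only u1 counts as converted here, because u3's purchase happened before the signup. The self-join is fine for small tables; at larger volumes this is where it starts to get slow without indexes or pre-aggregation.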

Collecting the data isn't the problem, it's the analysis that's annoying, i.e. this only gets you so far:

> Small bootstrapped HTML page to view the stats, no pretty graphs or anything, just numbers.

Maybe I want to see the number of some custom event per week, ok, now show me number per session, then break it down by country etc etc. I guess you could export the data to some BI tool and create a bunch of reports, but that is a bit of a hassle, especially since in anything but a very small company the people who most want to do this analysis are going to be non-technical. Are you going to assign engineering resources to sit around making reports full time? Much easier to just use Google, which 1) has all this built in and 2) people already know how to use.

> Maybe I want to see the number of some custom event per week, ok, now show me number per session, then break it down by country etc etc.

There's this thing called SQL that the oldies used to generate such reports ... Jokes apart, I agree with the suggestion to build your own analytics, or host one using some popular open source analytics software. There is more privacy awareness among users now, and sharing your data with Google or Microsoft or Facebook is the easiest way to hurt your reputation with these individuals. Has everyone forgotten that webservers still generate their own visitor logs? Moreover, if you think the data is important enough to collect and analyse, it seems quite foolish to trust a third party with it, especially a free one.
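For what it's worth, the report from the quoted comment (custom event per week, per session, by country) might look roughly like this in SQL, run here through Python's sqlite3. The table layout, column names and rows are assumptions made up for the sketch, not anyone's real schema.

```python
import sqlite3

# Hypothetical schema: one row per event, with the session it belongs to.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (session_id TEXT, name TEXT, country TEXT, day TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("s1", "share", "DE", "2021-03-01"),
        ("s1", "share", "DE", "2021-03-03"),
        ("s2", "share", "US", "2021-03-02"),
        ("s3", "share", "DE", "2021-03-08"),
        ("s3", "open", "DE", "2021-03-08"),
    ],
)


def weekly_report(conn, event_name):
    """Rows of (year-week, country, event count, distinct sessions)."""
    return conn.execute(
        """
        SELECT strftime('%Y-%W', day) AS week,
               country,
               COUNT(*) AS events,
               COUNT(DISTINCT session_id) AS sessions
        FROM events
        WHERE name = ?
        GROUP BY week, country
        ORDER BY week, country
        """,
        (event_name,),
    ).fetchall()


for week, country, events, sessions in weekly_report(conn, "share"):
    print(week, country, events, sessions, events / sessions)
```

Events per session is just the third column divided by the fourth. This doesn't answer the point about non-technical people needing a UI, but the query itself is not the hard part.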

Unlike everyone else here, I agree. In most cases you can build your own analytics.

Think about what kind of analytics you need, and then decide if you should buy something complicated or just build something simple yourself.

For example, for us crash reporting and knowing which system versions to support are important, so we built our own analytics system for that.

We don't need detailed usage statistics. It's just not very useful for us. We know that people use features that we build, the question is which features are missing! Why are people not able to use our app?

No amount of analytics on our existing users is going to tell us that, so getting powerful analytics is just not important for us.

It's weird if the policy is you can do whatever you want as powerful as you want if you roll your own, but you can't do anything no matter how simple if you use a free third-party SDK, but... is that what it looks like?
No, sorry, that isn't it at all, but it's easy to see how you would think so.

You can do whatever tracking you like, however you like using whatever SDK you like, as long as you obey basically two rules:

* If you are tracking identifiable information you must present an opt-in dialog. It doesn't matter if you roll your own or use an SDK to do it.

* If you are doing 3rd party tracking, and generally these SDKs are designed to do that, you must indicate this in the App's privacy information.

Honestly, that's all there is to it. The whole article, and the 'de-facto ban' narrative being pushed so hard on this forum, are predicated on the assumption that doing either of these things is simply inconceivable. However, if you are willing to do them, all the restrictions melt away. It's really as simple as that. Ask the users and tell them the truth. That's all that's being required.

The only complications come if you want to weasel your way out of doing these things. That's why the blog post author, and the 'de-facto ban' crowd here think this is such a fiendishly difficult problem.

Waiting for an open-source GitHub version.