by pSYoniK 1079 days ago
What I am curious about is how much people actually use ALL the analytics information provided by a lot of these tools. I know Matomo and other such open source/self-hostable solutions, but how much info do you really use?

I think for most use cases users would want to know if their content is consumed/read. Maybe how long someone spends on it and where they came from. For this sort of stuff you can write a small script to parse your logs. I did something along these lines to parse Caddy logs to get some idea of how many people visit a link. That's really all I needed and the great part is that I run it whenever I want an update, so it's not consuming resources constantly. The output is saved before the logs are cleared, so I know Article 1 had 39 views (or fewer!) and Article 2 had 5 views and so on...
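A minimal sketch of what such a log-parsing script might look like, in Python. The commenter's actual script isn't shown, so this is only an illustration. It assumes the server writes one JSON object per line with `request.method`, `request.uri` and `status` fields (which is how Caddy's JSON access logs are commonly laid out, but check your own log output), and the `count_views` helper and sample lines are made up for the example.

```python
import json
from collections import Counter


def count_views(log_lines):
    """Count successful GET requests per path from JSON access-log lines."""
    views = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        # Assumed field names: request.method, request.uri, status
        request = entry.get("request") or {}
        if request.get("method") != "GET" or entry.get("status") != 200:
            continue
        path = request.get("uri", "").split("?", 1)[0]  # drop the query string
        views[path] += 1
    return views


# Hypothetical sample lines standing in for a real access log
sample = [
    '{"request": {"method": "GET", "uri": "/article-1?ref=hn"}, "status": 200}',
    '{"request": {"method": "GET", "uri": "/article-1"}, "status": 200}',
    '{"request": {"method": "GET", "uri": "/article-2"}, "status": 404}',
    'not json',
]
for path, hits in count_views(sample).most_common():
    print(path, hits)
```

Counts like these are an upper bound on real readers, since bots and repeat visits are included unless you filter them out.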

So I think we're overdoing it and we would benefit from taking a few minutes before going down the rabbit hole of analyzing EVERYTHING.

9 comments

Analytics and Business Intelligence in general tend to play a big part in modern enterprise organisations, at least in my experience. Often what happens with corporations is that the larger they grow, the more risk-averse decision makers become, and suddenly things like analytics become nice foundations to lean on for when a decision is questioned.

What I'd be curious to see is the ROI on these tools. They obviously work in some cases, but do they always work? We currently employ three business intelligence developers, and two developers who actually build products. What's most hilarious about it, however, is that despite employing three BI developers I can't tell you if they earn their keep, because their data doesn't show that.

I'm using Plausible Analytics. It logs and displays very little data. And no PII.

It's more than enough for me.

And a few clients I enabled it for told me they very much liked the simplicity. Less data as a feature!

Plausible rubbed me the wrong way because of the attitude of their staff, but maybe I was the asshole?

I found a bad bug in their JS which means that on some pages it just silently fails and doesn't log anything, which means your analytics are even more inaccurate than ever (given the browser restrictions). I was totally broke and I wanted to use their paid service for a few months, so I offered them the fix in exchange for a few months free service (maybe $30 credit?). They told me basically "don't worry, we'll find the bug ourselves one day, we don't need your help."

I obviously don't know this exact situation.

But I've been in one, where a customer offered "patches", despite our software not being open for contributions. Not only were they inconsistent with our standards, they were hard to read and had some subtle security issues on careful review. I still suspect it was an attempt to plant a backdoor.

In any case, even if legit, it was a lot of work on our side to just review and clean it. Far more than if we just did it ourselves.

This is different for OSS, which should have external contributions as a main workflow. Ours wasn't prepared for external contributions.

Maybe it's the same with Plausible?

Plausible is great, and they just added conversion funnel analytics last week which was the big feature I was missing.
With SPAs and mobile apps, server logs won't be accurate or all that useful.

Tracking events is actually useful to see which features are used etc.

It's not all marketing and evil ads.

Having said that, GA4 is awful as a casual user.

Depends on the SPA, plenty of them fire enough requests to the server for server side logs to be useful.
Or even if you don't want to use your web server's logs for this purpose for whatever reason, this is quite trivial to implement in JS yourself. No need for GA and other bloated analytics frameworks.
If you could trivially implement Matomo (a project that has been in development for over 16 years) in JS, please open source it. Would love to get rid of the PHP in our stack.
If you're using the docker container it really shouldn't matter.
The security risks are still there.
Well, you still need some kind of backend to store the data. You can send it to a 3rd party, but then you'll run into all the same GDPR issues.
If the website is not just a static html page, there is likely a web-server with a database that can store information.
Oh, sure, but a LOT of websites are static html pages — or, at least, should be.
> all the same GDPR issues

Not necessarily. If I read the article correctly, it is about sending data to the US:

> The complaints allege that the companies, in violation of the law, transfer personal data to the United States.

So if the 3rd party is inside the EU, you might be fine. Or at least you may run into different GDPR issues.

> how much people actually use ALL the analytics information

I sometimes check access logs and pipe some grep queries into a line counter, or uniq by IP address to have a rough idea of how many people look at a particular part of, or tool on, my website. Maybe twice a year or so. Helps prioritise which things are worth maintaining/updating based on what's still being read (found by search engine or linked from third parties)
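The same grep, count and uniq-by-IP idea could be sketched in Python roughly as follows. This assumes the access log is in the common/combined log format used by default in Apache and nginx; the `unique_visitors` helper, the paths and the sample lines (using documentation-range IP addresses) are all hypothetical.

```python
import re

# Common/combined log format: IP ident user [time] "METHOD path PROTO" status size ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*" (\d{3})')


def unique_visitors(log_lines, path_prefix):
    """Return (hits, unique_ips) for requests whose path starts with path_prefix."""
    hits = 0
    ips = set()
    for line in log_lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # ignore lines that do not look like access-log entries
        ip, _method, path, _status = match.groups()
        if path.startswith(path_prefix):
            hits += 1
            ips.add(ip)
    return hits, len(ips)


# Hypothetical sample lines standing in for a real access log
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /tools/converter HTTP/1.1" 200 512',
    '203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "GET /tools/converter?x=1 HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2023:14:02:10 +0000] "GET /tools/converter HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2023:14:03:10 +0000] "GET /blog/post HTTP/1.1" 200 900',
]
print(unique_visitors(sample, "/tools/converter"))
```

Unique IPs are only a rough proxy for people: shared NAT addresses undercount visitors, and crawlers or changing mobile IPs overcount them.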

With a SaaS application, we use it for monitoring customer activity to drive support and sales renewal activity, to determine which features particular customers are using, to determine how they are using it, and how these things are changing over time. It's a vital part of everything we do from a product and sales perspective.
Do you know a static host that makes logs available? I happen to be looking to do something like this right now, but I would rather not run my own web server for my simple static blog.
I recently deployed a static website on Bunny.net using their object storage and their CDN and they make available logs in this format https://docs.bunny.net/docs/cdn-log-format.
Thanks. I hadn't heard of bunny.net before. I'm going to give this a shot.
Nearlyfreespeech.net does. I’ve used them for many years to host static sites.
I’ve been tinkering around on nearlyfreespeech.net for about an hour now, and I love it. Thanks.
You bet! They’ve been good to me.
> For this sort of stuff you can write a small script to parse your logs.

IFF you have access to your logs.

So you can set up GA but you can't check your web server logs?
That covers everything hosted on GitHub Pages, as one example.
Why would you not have access to your logs? Not having access to logs doesn't even make sense to me. We honor HIPAA and GDPR and we can access logs. Beyond that, I am a proponent of structured logging and log aggregators that can help you see trends and analyze the logs, like Splunk or, to a lesser extent, DataDog.
I decided to not convert to Google Analytics 4 because I used it as a glorified visitor count. I opted for a websocket to measure active users, the page they are on and some basic hourly peak and total user count split out over logged in and anonymous visits.
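The bookkeeping behind a websocket-based counter like this could look something like the sketch below. It leaves out the websocket server itself and only shows the state you might update from its connect and disconnect handlers. The `ActiveUserTracker` class and its method names are hypothetical, not the commenter's implementation.

```python
from collections import Counter, defaultdict
from datetime import datetime


class ActiveUserTracker:
    """In-memory counters fed by (hypothetical) websocket connect/disconnect events."""

    def __init__(self):
        self.connections = {}                # connection id -> (page, logged_in)
        self.hourly_peak = defaultdict(int)  # hour label -> peak concurrent connections
        self.total_visits = Counter()        # "logged_in" / "anonymous" -> total connects

    def connect(self, conn_id, page, logged_in, now):
        self.connections[conn_id] = (page, logged_in)
        self.total_visits["logged_in" if logged_in else "anonymous"] += 1
        hour = now.strftime("%Y-%m-%d %H:00")
        self.hourly_peak[hour] = max(self.hourly_peak[hour], len(self.connections))

    def disconnect(self, conn_id):
        self.connections.pop(conn_id, None)  # tolerate unknown or repeated disconnects

    def active_by_page(self):
        return Counter(page for page, _ in self.connections.values())


tracker = ActiveUserTracker()
now = datetime(2023, 1, 1, 12, 30)
tracker.connect("a", "/home", True, now)
tracker.connect("b", "/home", False, now)
tracker.connect("c", "/pricing", False, now)
tracker.disconnect("b")
print(dict(tracker.active_by_page()))
print(dict(tracker.hourly_peak), dict(tracker.total_visits))
```

Keeping only aggregate counts, with no IP addresses or user identifiers stored, is one way to stay close to the "glorified visitor count" use case.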