Hacker News
by sullivanmatt 1248 days ago
This issue exists to the right of your solution and is (for now) out of scope, but the biggest issue I have with security data lakes is the need to (easily) get both row-based data and visualizations. Back when I had access to a well-built and cared for Splunk environment, I would constantly run queries, build visualizations, go back to the results index, tweak the query, go back to viz, etc. This feedback loop is important and allows for fast iteration, especially if you are conducting a high-stakes investigation and need answers rapidly. I should be able to look at my available fields and tweak the viz accordingly in under a few seconds; preferably in one mouse click.

Now I live on an ELK stack and I experience nothing but full-time agony as I switch between Kibana and Kibana Lens constantly. It's clear they are two completely separate "products" built for different use-cases. The experience reminds me constantly that they were not purpose-built for how I use them, unlike Splunk.

Increasingly we are moving towards the reality of a security data lake, and all I can think is that I'm about to lose what little power I had left, as I'll have to move to something like Mode, Sisense, or Tableau, which, again, were not purpose-built for these use-cases and separate the query/data discovery and visualization layers even further.

I hate how crufty and slow Splunk has gotten as an organization, and they use their accomplishments from 15 years ago to justify the exorbitant price they charge. I really hope the OSS/next-gen SaaS options can fill this need and the security data lake becomes a reality. But for that to happen, more focus is needed on the user experience as well.

Regardless, very cool stuff and could definitely fill a need for organizations that are just starting to dip toes into security data lakes. I wish you success!

I completely agree with you on the need for a fully integrated solution with great visualizations, without hosting additional tools that aren't purpose-built! Unfortunately there are very few SIEMs that get this right today.

Here's how we are thinking of it. We think it's important for a successful security program to first have high-quality data, and this is why we want to help every organization build structured security data lakes to power their analysis using our open source project. The Matano security lake can sit alongside their SIEM and be incrementally adopted for data sources that wouldn't be feasible to analyze otherwise.

Our larger goal as a company, though, is to build a complete platform that allows a security data lake to fully replace traditional SIEM -- including a UI and collaborative features that give you that great feedback loop for fast iteration in detection engineering and threat hunting, as you mentioned. Stay tuned, I think you will be excited by what we are building!

For sure. Pull a dbt and get everybody hooked on your tool, then slap a SaaS platform ecosystem to the farthest right and watch the revenue flow.

Splunk is HEAVILY pushing their SaaS offering at the moment. They are the most obnoxious vendor we currently deal with.

We are fine on-prem and pay big $$ license fees, but that's not enough. They want that sweet SaaS revenue.

I would be wary of pushing this; being a non-SaaS platform could be an advantage here.

I’m assuming the difference is: “big $$ license fees” for on-prem is $X a year, while “sweet saas revenue” is $A a year, $B per user, $C for compute, $D for storage, and $E for requests.
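
To make that comparison concrete, here is a rough sketch of the arithmetic. It assumes $C, $D, and $E are per-unit rates (the comment doesn't say), and every name and number in it is hypothetical, not any vendor's actual pricing model.

```python
from dataclasses import dataclass


@dataclass
class SaasRates:
    """Hypothetical usage-based price list, mirroring $A..$E above."""
    base_per_year: float     # $A: flat yearly platform fee
    per_user: float          # $B: yearly fee per user
    per_compute_unit: float  # $C: assumed to be a per-unit rate
    per_storage_unit: float  # $D: assumed to be a per-unit rate
    per_request: float       # $E: assumed to be a per-request rate


def saas_annual_cost(rates: SaasRates, users: int, compute_units: float,
                     storage_units: float, requests: float) -> float:
    """Sum the flat fee and every usage-based component for one year."""
    return (rates.base_per_year
            + rates.per_user * users
            + rates.per_compute_unit * compute_units
            + rates.per_storage_unit * storage_units
            + rates.per_request * requests)


def saas_premium(on_prem_license: float, saas_cost: float) -> float:
    """How much more (positive) or less (negative) SaaS costs than $X."""
    return saas_cost - on_prem_license
```

The point of the sketch is only that $X is one fixed number, while the SaaS total moves with four usage variables that the customer has to forecast.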

As a large company, what are the things you are more than happy to pay for with on-prem?

The reason I’m asking: this feels like the largest issue with cloud saas, which is one of the more popular implementations of open-core for B2B. Not saying Splunk is open-core, but it’s related to above/dbt cloud discussion.

Enterprise customers have the highest propensity to pay, but don’t need or want their cloud offering.

Mid-tier customers actually prefer a service managed by their cloud provider (AWS/GCP/Azure), because it strikes a balance: it's easy AND it works within their VPC/IAM/DevOps. But this cuts off open-core companies' main revenue, so they start adopting ELv2 licenses (Elastic, Airbyte, etc.), which makes things harder on the mid-tier.

Small customers are the ones who love saas the most, but have the least ability to pay, have the least need for powerful tools, and will probably grow out of being a small customer…

I’m curious if there are any companies which are: source code available, commercial license, allow you to fork/modify the source code, only offer on-prem (no cloud saas offering), want the mega-clouds to offer a managed service. BUT the commercial license requires any companies over 250 employees or $X revenue (docker desktop style) to pay a yearly license fee.

Indeed, no more SaaS. I've had enough of this cloud nonsense already.

Say more? Are you tired of it personally or is it troublesome at work?

The biggest issue I have with data lakes is they always, without fail, turn into a data cesspool. The more you add, the less ROI you get out. And yes, using Splunk as an example, it becomes an organisational cost problem. I have spent way too many hours arguing with them over billing.

The only viable solution is to design metrics into your platform properly from the ground up rather than trying to suck them out of a noisy datasource for megabucks.