It has always been the Cadillac of search, and more so with unstructured indexing (e.g. key collisions across different data structures: foo = string vs. foo = integer vs. foo = array).
Your queries or infrastructure were not optimized. It’s very fast when optimized.
Nope, sports betting. I was on the operations side of things and Splunk was something the corporate side organised and championed, but it just couldn't be used to troubleshoot issues in the time frames we needed.
Very very fast, I can do an all time search on terabytes of data in seconds.
But you have to learn to use it. If you don't give it an index and a sourcetype, that will slow it down, and, as with ES, leading wildcards slow things down. The fastest searches are simple terms like a word or an IP.
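As a rough illustration of that advice, here is a minimal Python sketch that builds a scoped search string and refuses a leading wildcard. The helper name `build_search` and the example values (`web`, `access_combined`, the IP) are hypothetical, chosen only for illustration; the only Splunk-specific part is the `index=... sourcetype=... term` search syntax.

```python
def build_search(index: str, sourcetype: str, term: str) -> str:
    """Build a search string scoped to one index and one sourcetype.

    Hypothetical helper: it only assembles text, it does not talk to Splunk.
    """
    if not index or not sourcetype:
        # Unscoped searches are the slow case described above.
        raise ValueError("give the search an index and a sourcetype")
    if term.startswith("*"):
        # Leading wildcards slow things down; prefer a plain word or an IP.
        raise ValueError("avoid leading wildcards in the search term")
    return f"index={index} sourcetype={sourcetype} {term}"


# Example values are assumptions, not taken from any real deployment.
query = build_search("web", "access_combined", "10.1.2.3")
print(query)
```

The point is just the shape of the query: index and sourcetype first, then a simple literal term.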
40 minutes sounds exceptionally bad, but 5-10 minutes with Splunk was totally common when I worked at Apple almost a decade ago, and I could never figure out why, because I only ever used it for O(grep on a log file on disk) level operations. I was probably holding it wrong or maybe the infra team had misconfigured it, idk.
I personally brought Splunk to Apple in 2010 alongside a small handful of people (Hi, Sean and Ariel!). There is a massive difference between real-time searches, where latency beyond a few seconds is unacceptable, and historical searches, which can take a bit longer. I can assure you that it did its job spectacularly; much to my chagrin, there are few competitors to this day.
Cool! That makes sense. I only really used it as part of the (now defunct I think) "orchard" internal hosting platform that was very much beta when I was using it, for tiny internal apps running on like 4 instances at most, and missed being able to just grep log files; my wild, uneducated guess from what you said is that there was some kind of pooling of our meager logs with other people's from orchard, or we were otherwise off the happy path.
I was part of Orchard and miss it dearly. It had a lot of potential but it was launched as a proof of concept built on pooled resources freed up from optimizing legacy workloads (namely, moving Siri from VMware to Mesos).
It never got the love it deserved and I could absolutely believe that its Splunk cluster suffered as a result. RIP
100% agree. The idea of an internal Heroku was a great one; it just didn't seem to work with how Apple was designed organizationally or something, and it seemed under-resourced.
We recently transitioned to it at Notion and it’s been very fast, outperforming the previous log vendor substantially while offering better search and UX. If you used the on-prem version, the cloud version is quite a different experience.
I don't know which version we used, just that it was managed and configured by Splunk.
We were only sending a small subset of our logs to it, so 200+ GB a day.
Our Linux box with spinning disks could grep the full set of logs much faster than querying Splunk, so I don't think anyone really used it.
Yes BUT it needs tuning. Splunk is complicated and takes continuous maintenance to optimize speed.
I work as a Splunk integrator and here's what I often see:
1. Customer installs Splunk with a qualified Splunk or third-party architect team. The deployment works well.
2. Customer adds infrastructure to the deployment. Splunk slows down. License costs go up.
3. Customer chooses between outside help or DIY. DIY rarely works.
4. Customer now needs outside help. Now Splunk is very slow and expensive, and now it will cost a lot to tune it.
Splunk, the company, is in a tough spot for several reasons: a rotating C-level cast, unpopular changes to the license model, and bad acquisitions. The product is still best in class but tough to keep optimized.
Anecdotal: I took over a small, ill-maintained Splunk installation at $JOB-2 and reworked it following Splunk's current best practices, and it ran like a top as of when I left that place. Having done that process, I'm fully convinced that if you're going to run Splunk on-prem you need a dedicated sysadmin for it who knows Splunk's stack. And that kind of person isn't cheap to hire or keep in that role.
We had an on-prem Splunk implementation and it was SOO SLOW. It was built and managed by Splunk and its consultants.
We finally got rid of it a few years later, but for the entire time we had it, it was a constant "round hole, square peg" problem. Each time the consultants assured us Splunk could do what we needed, and each time it could not.
Just coming back around to this, we also used Splunk consultants for their SIEM solution and the first one we got wasn't very good, but the second was amazing (I wish we could have hired her directly).
The guy we had help us tune our clusters after I rebuilt them all was also very good. Fortunately I'd done most everything by the books and we overkilled the nodes with hardware (we had some older hypervisor nodes lying around I stole for Splunk).