by sethammons 2517 days ago
This is not unlike what we've been doing for years. We generate billions of log lines like this daily as JSON and inspect them with Splunk. Because the keys and values are consistent across log lines, we can query them and do neat things. "What was our system timing in relation to users who have feature x?" "What correlations can we find between users whose requests took too long and were not throttled? -> ah, 99% of those requests show $correlation_in_other_kv_pair!"
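
A minimal sketch of the idea in Python, for readers who haven't seen the pattern. The field names (`request_id`, `feature_x`, `duration_ms`, `throttled`) and the sample values are hypothetical, not anyone's real schema, and the filtering that a log platform such as Splunk would do is stood in for by plain Python:

```python
import json


def canonical_line(**fields):
    """Serialize one request's summary as a single JSON log line."""
    return json.dumps(fields, sort_keys=True)


# Hypothetical request summaries; every line carries the same keys,
# which is what makes cross-line queries possible.
lines = [
    canonical_line(request_id="r1", feature_x=True, duration_ms=1840, throttled=False),
    canonical_line(request_id="r2", feature_x=False, duration_ms=120, throttled=False),
    canonical_line(request_id="r3", feature_x=True, duration_ms=2250, throttled=True),
    canonical_line(request_id="r4", feature_x=True, duration_ms=1510, throttled=False),
]


def slow_unthrottled(raw_lines, threshold_ms=1000):
    """Return parsed records that were slow but not throttled."""
    records = (json.loads(line) for line in raw_lines)
    return [
        r for r in records
        if r["duration_ms"] > threshold_ms and not r["throttled"]
    ]


slow = slow_unthrottled(lines)
# Share of slow, unthrottled requests that also had feature_x enabled.
share_with_feature = (
    sum(r["feature_x"] for r in slow) / len(slow) if slow else 0.0
)
print([r["request_id"] for r in slow], share_with_feature)
```

The point is only that one line per request with stable keys turns "which slow requests share property Y?" into a filter plus an aggregate.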

(I'm the author, and) Yeah, whatever you might call them, canonical lines are an "obvious" enough idea that I'd expect a lot of shops to have arrived at them independently. Besides you, I've heard from a number of people for whom that's been the case.

That said, it's also a surprisingly non-obvious idea in many respects. A lot of people are used to traditional trace-style logging and never come up with a construct like this, so we felt canonical lines were worth calling out as something worth doing.

I feel that logs have been around for so long that it's easy to take their capabilities for granted and not go much further. This is another example showing that there's more that can be done. It reminds me of RFC 5424.

At LogSense.com we actually tackled this problem too and came up with automatic pattern discovery that converts pretty much all logs into structured data. I actually just posted it here: https://news.ycombinator.com/item?id=20569879 I'm really curious whether this is something you'd consider helpful, and any feedback is very welcome.

Oh, for sure. A lot of folks are doing really interesting things that others could learn from and they don't stop and write something up that helps the industry grow that much more. I'm glad y'all wrote this up; I might do a similar one!