by cones688 4054 days ago
why not just do both?

Our product stores all the logs raw in flat files on the file system. We don't use databases to keep the logs, which lets you scale massively (the ingestion limit is that of the correlation engine and disk bandwidth). You then just need an efficient search crawler and good use of metadata so that search performance is good too.
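To make the flat-file-plus-metadata idea concrete, here is a rough sketch, not how our product actually does it. It assumes an append-only raw file with a sidecar index of byte offsets; the function names (`append_log`, `search`), the JSON-lines index format, and the metadata keys are all made up for illustration.

```python
import json
import os


def append_log(raw_path, index_path, line, metadata):
    """Append one log line untouched to the raw file and record where it landed."""
    data = line.rstrip("\n").encode("utf-8") + b"\n"
    with open(raw_path, "ab") as raw:
        raw.seek(0, os.SEEK_END)  # make sure we measure the current end of file
        offset = raw.tell()
        raw.write(data)
    # The index holds only metadata plus offset/length; the raw bytes are never rewritten.
    entry = {**metadata, "offset": offset, "length": len(data)}
    with open(index_path, "a", encoding="utf-8") as idx:
        idx.write(json.dumps(entry) + "\n")
    return offset


def search(raw_path, index_path, **wanted):
    """Scan the small index, then seek straight to matching lines in the raw file."""
    hits = []
    with open(index_path, encoding="utf-8") as idx, open(raw_path, "rb") as raw:
        for entry_line in idx:
            entry = json.loads(entry_line)
            if all(entry.get(key) == value for key, value in wanted.items()):
                raw.seek(entry["offset"])
                hits.append(raw.read(entry["length"]).decode("utf-8").rstrip("\n"))
    return hits
```

Something like `search("raw.log", "raw.idx", vendor="cisco")` would then return the original lines as they were written, without touching the rest of the file.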

The issue is that if you ever need to pull the logs for court and you have messed with them (i.e. normalized them and stuffed them into a DB), then your chain of custody is broken.

Best of both worlds means parsed-out normalisation, so I don't have to remember that Juniper calls source IP srcIP and Cisco calls it SourceIP, with the original logs kept under the covers for grepping if you need them.
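A minimal sketch of that, assuming simple space-separated `key=value` log lines (real Juniper and Cisco formats differ, so treat the parsing as a placeholder). Only `srcIP` and `SourceIP` come from the example above; the canonical name `source_ip` and the `normalize` helper are hypothetical.

```python
# Vendor field name -> canonical name. Only these two aliases come from the
# example in the comment; "source_ip" is an assumed canonical name.
FIELD_ALIASES = {
    "srcIP": "source_ip",     # Juniper
    "SourceIP": "source_ip",  # Cisco
}


def normalize(raw_line):
    """Return the parsed, normalised fields alongside the untouched original line."""
    fields = {}
    for token in raw_line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[FIELD_ALIASES.get(key, key)] = value
    # The raw line rides along unchanged, so it can still be grepped or produced as-is.
    return {"raw": raw_line, "fields": fields}
```

You query on `fields["source_ip"]` regardless of vendor, and `raw` is still there for when you need the original.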