| I'm not a fan of competitors creating benchmarks like this, because when faced with any tuning decision they will usually pick the one that makes their competitors slower. But anyway, let's take a look at how they tuned Elasticsearch. Disclaimer: I used to work at Elastic!
- Used Logstash instead of Beats for the simple task of reading syslog JSON data. Beats (https://www.elastic.co/guide/en/beats/filebeat/current/fileb...) would have performed better, especially around resource usage.
- Set a very low Logstash heap of 256 MB: https://github.com/SigNoz/logs-benchmark/blob/0b2451e6108d8f...
- Added a grok processor: https://github.com/SigNoz/logs-benchmark/blob/0b2451e6108d8f... Dissect is faster here.
- No index template configuration. This would cause higher disk usage than needed due to duplicate mappings. Again a Logstash vs Beats thing. For this test, more primary shards and a larger refresh interval would also improve things.
- Graph complaining Elasticsearch using 60% available memory. This is as configured, they could use less with not much impact to performance.
- Document counts do not match. This is probably due to using syslog with randomly generated data instead of creating a test dataset on disk and reading the same data into all platforms.
- Aggregation queries were not provided in the repo https://github.com/SigNoz/logs-benchmark so I cannot validate them.
I'm actually surprised Elastic did so well in this benchmark given the misconfiguration. |
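To make the index template point concrete, here is a minimal sketch in Python that only builds a request body for Elasticsearch's composable index template API (`PUT _index_template/<name>`). The pattern `logs-benchmark-*`, the shard count, the refresh interval, and the field names are all illustrative assumptions, not values taken from the benchmark repo. The explicit mappings are there on the assumption that "duplicate mappings" refers to the default dynamic mapping, which indexes strings as both `text` and `keyword`.

```python
import json


def build_index_template(pattern, primary_shards, refresh_interval):
    """Build a request body for PUT _index_template/<name>.

    Every value here is a hypothetical example; tune it for the real workload.
    """
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "index": {
                    # More primary shards spread indexing work across nodes/threads.
                    "number_of_shards": primary_shards,
                    # A longer refresh interval means fewer, larger segment flushes.
                    "refresh_interval": refresh_interval,
                }
            },
            "mappings": {
                "properties": {
                    # Hypothetical field names: map each field once, explicitly,
                    # instead of relying on dynamic text + keyword multi-fields.
                    "@timestamp": {"type": "date"},
                    "message": {"type": "text"},
                    "host": {"type": "keyword"},
                }
            },
        },
    }


if __name__ == "__main__":
    body = build_index_template("logs-benchmark-*", 4, "30s")
    print(json.dumps(body, indent=2))
```

The printed JSON could then be sent with any HTTP client. Whether 4 shards and a 30s refresh are good choices depends on the cluster and the ingest rate, so treat them as starting points to measure rather than recommendations.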
This is also because we are not experts in Elastic or Loki, so we don't know the possible impact of tuning configs. To be fair, we also didn't tune SigNoz for this specific data or test scenario and ran it with default settings.
> Graph complaining Elasticsearch using 60% available memory. This is as configured, they could use less with not much impact to performance.
This is something we discussed, and we have added a note in the benchmark blog as well. Pasting it again for reference:
> For this benchmark for Elasticsearch, we kept the default recommended heap size memory of 50% of available memory (as shown in Elastic docs here). This determines caching capabilities and hence the query performance.
We could have tried to tinker with different heap sizes (as a % of total memory), but that would impact query performance, and hence we kept the default Elastic recommendation.
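For anyone who wants to repeat the test with different heap sizes, a rough sketch of the arithmetic follows. It turns a machine's memory and a chosen fraction into the `-Xms`/`-Xmx` JVM flags, which Elasticsearch can read from a file under `config/jvm.options.d/`. The function name `heap_flags` and the 26 GB ceiling are assumptions for illustration. The ceiling is meant to keep the heap under the JVM's compressed object pointer threshold, and the exact safe value varies by system.

```python
def heap_flags(total_memory_mb, fraction=0.5, cap_mb=26 * 1024):
    """Return JVM heap flags for a given share of machine memory.

    fraction=0.5 mirrors the 50% default discussed above; cap_mb is an
    assumed ceiling, not an official constant.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    heap_mb = min(int(total_memory_mb * fraction), cap_mb)
    # Min and max heap are set to the same value so the heap is never resized.
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]


if __name__ == "__main__":
    # Compare the 50% default with a smaller heap on a 32 GB machine.
    for share in (0.5, 0.25):
        print(share, heap_flags(32 * 1024, fraction=share))
```

Running the benchmark at two or three of these settings would show how much query performance actually depends on heap size for this dataset.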