
	by hijinks 601 days ago
	Does anyone use this? I'm really starting to get sick of companies that claim they operate at petabyte scale, and then you find you need to spend 400k a month to support that scale.

prabhatsharma 601 days ago

Thousands of active deployments globally.

How many open source log systems work at PB scale given any number of resources? Also, FWIW, OpenObserve can ingest data at 28 MB/sec/core (we are working on optimizing it even more), and ingesting 1 PB of data would cost just $435 based on on-demand prices (AWS m7g family).
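The cost figure in this comment can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not anything published by OpenObserve: it assumes "PB" and "MB" are binary units (2**50 and 2**20 bytes) and that throughput scales linearly per core. It derives the per-core-hour price that the $435 claim implies rather than asserting any actual AWS price.

```python
# Back-of-the-envelope check of the quoted ingestion claim.
# Assumptions (not from the thread): binary units, perfectly linear per-core scaling.

RATE_MB_PER_SEC_PER_CORE = 28   # figure quoted in the comment
CLAIMED_COST_USD = 435          # figure quoted in the comment
ONE_PB_BYTES = 2**50            # assumed binary petabyte
ONE_MB_BYTES = 2**20            # assumed binary megabyte


def core_hours_to_ingest(total_bytes: int, rate_mb_per_sec_per_core: float) -> float:
    """Core-hours needed to ingest total_bytes at the given per-core rate."""
    bytes_per_sec_per_core = rate_mb_per_sec_per_core * ONE_MB_BYTES
    core_seconds = total_bytes / bytes_per_sec_per_core
    return core_seconds / 3600


core_hours = core_hours_to_ingest(ONE_PB_BYTES, RATE_MB_PER_SEC_PER_CORE)
implied_price = CLAIMED_COST_USD / core_hours

print(f"core-hours to ingest 1 PB: {core_hours:,.0f}")
print(f"implied price per core-hour: ${implied_price:.4f}")
```

Under these assumptions the claim works out to roughly 10,650 core-hours, or about $0.041 per core-hour. Whether that matches a specific m7g instance price depends on the region and instance size, which the comment does not state.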

terminalbraid 601 days ago

That doesn't answer the question of who. A (rightly) cynical reading of what you posted could be that the "thousands of active deployments" are ones you did for yourself to prove benchmarks.

prabhatsharma 601 days ago

Machines I would use for benchmarking would go down after some time and wouldn't be active.

djbusby 601 days ago

Still didn't answer the "who" part.

prabhatsharma 601 days ago

We will publish many names on our website soon.

Veserv 601 days ago

Why is it only 28 MB/core-second?

Is that production rate, inbound bandwidth, rate to persistence, rate to processed, or rate to display?

prabhatsharma 601 days ago

Compute power is required to process and store the incoming data.

It's not "only 28 MB/sec/core". Try doing the same with Splunk/Elasticsearch: you won't go past 5 MB/sec/core (typically it will be lower) on their best day.

Veserv 600 days ago

To what state?

Suppose I have 28 GB of trace data in memory on a machine and then I fire that off. What do I have after 1000 seconds?

Do I just have a file of 28 GB of raw trace?

Do I have 28 GB of raw trace in memory ready to be indexed?

Do I have a data structure in memory ready to be searched?

Do I have the full trace information rendered on my screen (or an aggregated visualization derived after processing all the data)?

If it is the first, that would be ridiculously slow. If it is one of the latter ones, then it would depend on what querying operations are fast.

28 MB/core-second makes no sense without the context of what you can do quickly after the “processing” is done.
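The 28 GB / 1000 second scenario in this comment follows directly from the quoted rate. A minimal sketch of that arithmetic, assuming decimal units (1 GB = 1000 MB) and perfectly linear scaling across cores, neither of which the thread confirms; `ingest_seconds` is a hypothetical helper for illustration only:

```python
def ingest_seconds(total_mb: float, rate_mb_per_sec_per_core: float, cores: int = 1) -> float:
    """Seconds to ingest total_mb, assuming throughput scales linearly with cores."""
    return total_mb / (rate_mb_per_sec_per_core * cores)


# 28 GB of trace data at the quoted 28 MB/sec/core
one_core = ingest_seconds(28_000, 28)            # single core
eight_cores = ingest_seconds(28_000, 28, cores=8)  # idealized 8-core case

print(one_core, eight_cores)
```

The number only says how long the data takes to pass through ingestion. It says nothing about the state the data ends up in afterwards, which is the point the comment is pressing on.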

prabhatsharma 600 days ago

There is too much to give all the details in an HN thread. To simplify the conversation: data will be persisted and usable for individual searches and aggregations. I would welcome you to our Slack workspace for any further questions you may have - https://short.openobserve.ai/community
