Hacker News
by jiggawatts 564 days ago
Part of the problem is that the ingested data is not vector compressed, so they're charging you for the CPU overhead of this data rearrangement.

It would cut costs a lot if the source agents did this (pre)processing locally before sending it down the wire.

1 comment

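We should distinguish between compression in transit and compression at rest. Compressing a larger corpus should yield a better ratio than compressing many small chunks separately, because the compressor can reuse what it has already seen across the whole input. For small chunks, a shared dictionary recovers some of that benefit (zstd supports this, for example).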
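
A rough sketch of the difference, not a benchmark. It uses Python's stdlib zlib with a preset dictionary instead of zstd, purely so it runs without extra packages. The log records, field names, and dictionary choice (the first 20 records) are made up for illustration.

```python
import json
import zlib

# Hypothetical log records, standing in for what an agent might ship.
records = [
    json.dumps({"ts": 1700000000 + i, "level": "INFO",
                "service": "checkout", "msg": f"request {i} completed"}).encode()
    for i in range(1000)
]
raw = sum(len(r) for r in records)

# 1) Every record compressed on its own: no shared context between chunks.
per_record = sum(len(zlib.compress(r)) for r in records)

# 2) The whole batch compressed as a single stream.
whole_batch = len(zlib.compress(b"\n".join(records)))

# 3) Every record compressed on its own, but primed with a shared dictionary.
#    Assumption: a few sample records are good enough as dictionary content.
shared_dict = b"".join(records[:20])


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    c = zlib.compressobj(zdict=zdict)
    return c.compress(data) + c.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    # The receiver needs the exact same dictionary to decode.
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()


per_record_dict = sum(len(compress_with_dict(r, shared_dict)) for r in records)

print(f"raw bytes:              {raw}")
print(f"per record:             {per_record}")
print(f"per record + dict:      {per_record_dict}")
print(f"whole batch, one pass:  {whole_batch}")
```

The catch with the dictionary approach is that sender and receiver both have to hold the same dictionary, which is an extra thing to version and distribute. Batching on the agent avoids that, at the cost of some added latency before data goes down the wire.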