Hacker News
by lmeyerov 1875 days ago
It varies. A lot of our users look at, say, 50 KB files for quick, small, targeted visual sessions, but when doing something like a log dump analysis we are working on TB-scale files, and 1-2 GB per streaming part is good. CPU Arrow people like to do, say, 10 KB-1 MB per record batch, but GPU land is a lot faster when you think in terms of bandwidth, so 500 MB-10 GB per contiguous part, depending on GPU memory and working set size. Likewise, it depends on how compressed the data is, as you ultimately care how much it uncompresses into for the downstream memory pressure. Hope that makes sense!
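To make the sizing arithmetic concrete, here is a rough sketch in plain Python. The helper name `plan_parts`, the `compression_ratio` parameter, and the example numbers are all made up for illustration; this is not an Arrow or RAPIDS API, just the back-of-the-envelope math of picking a part count from the uncompressed size.

```python
import math

KB, MB, GB, TB = 1024, 1024**2, 1024**3, 1024**4


def plan_parts(compressed_total_bytes, compression_ratio, target_uncompressed_part_bytes):
    """Hypothetical helper: split a compressed input into parts sized by
    what they expand to in memory, not by what they occupy on disk.

    compression_ratio is uncompressed size / compressed size (assumed known
    or estimated from a sample). Returns (num_parts, compressed_bytes_per_part).
    """
    if compression_ratio <= 0 or target_uncompressed_part_bytes <= 0:
        raise ValueError("ratio and target part size must be positive")
    # Memory pressure downstream is driven by the uncompressed size.
    uncompressed_total = compressed_total_bytes * compression_ratio
    num_parts = max(1, math.ceil(uncompressed_total / target_uncompressed_part_bytes))
    # How much compressed input to read for each part.
    compressed_per_part = math.ceil(compressed_total_bytes / num_parts)
    return num_parts, compressed_per_part


# A 1 TB log dump that expands 4x, streamed in 2 GB (uncompressed) parts.
big = plan_parts(1 * TB, 4, 2 * GB)

# A 50 KB file fits in a single part.
small = plan_parts(50 * KB, 1, 2 * GB)
```

With these assumed inputs, the 1 TB dump becomes 2048 parts of 512 MB compressed each, while the 50 KB file stays as one part. The target part size is the knob you would tune against GPU memory and working set size.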

You run TB files against a GPU? Hmm... that's something I've never thought of. Interesting, any idea where I can read up on it?
rapids.ai
Should have added: Graphistry talk @ https://pavilion.io/nvidia