Hacker News
by Alifatisk 930 days ago
> One warning to note is that gron burns RAM. I've killed 32GB servers working with 15MB JSON files.

That sounds seriously like there is something wrong with the tool

I think it's because Go's encoding/json package doesn't support incremental parsing.
It does - I patched that into my local fork of `gron` two years ago.
Just done a test with my 800MB stress test file.

`jq`: 1m26s 21G resident

`mygron -e --no-sort`: 18m14s 19M resident

`gron --no-sort`: 1m51s OOM killed at 54G resident

Can you try https://github.com/adamritter/fastgron as a comparison?
`fastgron`: 8.5s 2.2G resident

edit: Interestingly whilst doing this test, I piped the output into `fastgron -u` (39.5G resident) and `jq` rejected that. Will have to investigate further but it's a bit of a flaw if it can't rehydrate its own output into valid JSON.

I released fastgron v0.7.5, which contains fixes to string escaping. Could you please take another look?
Thanks, it's a clear bug. I created a new issue for it: https://github.com/adamritter/fastgron/issues/19
> `gron --no-sort`: 1m51s OOM killed at 54G resident

Oh dear

If I remember correctly, it took a 128GB AWS EC2 instance to parse that file without OOMing. Go is not that efficient with deeply nested data structures whose sizes and types are unknown in advance.
Thanks for the follow up. Is your fork public?
I think https://pkg.go.dev/encoding/json#Decoder does support streaming at least. Here is gojq's stream mode: https://github.com/itchyny/gojq/blob/main/cli/stream.go
Is there an issue on this?