by icholy 930 days ago
I think it's because Go's encoding/json package doesn't support incremental parsing.

It does - I patched that into my local fork of `gron` two years ago.
Just done a test with my 800MB stress test file.

`jq`: 1m26s 21G resident

`mygron -e --no-sort`: 18m14s 19M resident

`gron --no-sort`: 1m51s OOM killed at 54G resident

Can you try https://github.com/adamritter/fastgron as a comparison?
`fastgron`: 8.5s 2.2G resident

edit: Interestingly whilst doing this test, I piped the output into `fastgron -u` (39.5G resident) and `jq` rejected that. Will have to investigate further but it's a bit of a flaw if it can't rehydrate its own output into valid JSON.

I released fastgron v0.7.5, which contains fixes to string escaping. Could you please take another look?
Already commented on the github issue but will have another look, yep.
Thanks, it's a clear bug. I created a new issue for it: https://github.com/adamritter/fastgron/issues/19
> `gron --no-sort`: 1m51s OOM killed at 54G resident

Oh dear

If I remember correctly, it took a 128GB AWS EC2 to parse that file without OOMing. Go is not that efficient at deep multi-level size- and type-unknown data structures.
Thanks for the follow up. Is your fork public?
I think https://pkg.go.dev/encoding/json#Decoder does support streaming, at least. Here is gojq's stream mode: https://github.com/itchyny/gojq/blob/main/cli/stream.go
Is there an issue on this?