by chubot
2677 days ago
I agree that's a good strategy for big JSON. Do you know of any such "lazy" parsers? I think the problem is that to extract arbitrary keys, you really need to parse the whole thing, although you don't need to materialize nodes for the whole thing. But if you have big JSON with a given schema, you may be able to skip things lexically. You basically need to count {} and [], while taking into account " and \ within quoted strings. That doesn't seem too hard. I think a tiny bit of http://re2c.org/ could do a good job of it.
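For what it's worth, here is a rough sketch of that counting idea in plain Python rather than re2c. The names `skip_value` and `_skip_string` are made up for illustration and come from no library, and the sketch assumes the whole document is already in memory as a well-formed `str`. It finds the end of a value without building any nodes for it:

```python
def _skip_string(text: str, i: int) -> int:
    """text[i] is an opening quote; return the index just past the closing quote."""
    i += 1
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2          # skip the backslash and the character it escapes
        elif ch == '"':
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated string")


def skip_value(text: str, start: int) -> int:
    """Return the index just past the JSON value that begins at text[start]."""
    ch = text[start]
    if ch == '"':
        return _skip_string(text, start)
    if ch in "{[":
        depth = 0
        i = start
        while i < len(text):
            ch = text[i]
            if ch == '"':
                # Brackets inside strings must not be counted.
                i = _skip_string(text, i)
                continue
            if ch in "{[":
                depth += 1
            elif ch in "}]":
                depth -= 1
                if depth == 0:
                    return i + 1
            i += 1
        raise ValueError("unterminated object or array")
    # Scalar (number, true, false, null): run until a delimiter or whitespace.
    i = start
    while i < len(text) and text[i] not in ",}] \t\r\n":
        i += 1
    return i
```

With a known schema you would locate the key you do not care about, call `skip_value` on the position where its value begins, and resume scanning from the returned index. It does not validate anything (for example, it does not check that `{` is closed by `}` rather than `]`), which is the price of skipping lexically.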
|
https://gitlab.com/philbooth/bfj
The specific function of interest here is `bfj.match`, which takes a readable stream and a selector as arguments:
https://gitlab.com/philbooth/bfj#how-do-i-selectively-parse-...
It still walks the full tree like a regular parser, but avoids creating any data items unless the selector matches. There is an outstanding issue to support JSONPath in the selector; currently it only matches individual keys and values.