by abrodersen 2412 days ago
"The tar archive spec does not specify an upper limit on archive size, but my kernel keeps saying 'no space left on device'. It's obviously a bad kernel."

Well, in your example, there is a physical limit on how much space is available. In the case of deeply nested json, we are talking about structures that perfectly fit into memory, and could be decoded in a fraction of a second if only a different algorithm were used.
On one hand, yes. On the other hand, the standard Ruby "json" gem can be made to overflow on just 202 bytes of valid JSON input.
I mean, it'd be a pathological 202-byte JSON document, but yeah (I guess it could be a DoS attack of some kind, actually, hmm...)

Ruby is a good example, because the `oj` gem, on presumably the same system, is listed as "infinite." Obviously not truly "infinite", eventually it'll run out of RAM -- but this shows it is _not_ an issue of machine resources really.

As the OP notes, it's because if you implement using recursion, you'll run out of stack in most languages (now let's start talking about tail call optimization...), but this isn't the only implementation choice, you _can_ implement to not have this limitation, and probably should.

Anywhere you accept JSON from untrusted users, you should be catching parse errors anyway.
Tail call optimization wouldn't help with parsing JSON.
Usually the solution is to create an array on the heap and manage the stack yourself.
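A rough sketch of that idea in Ruby, assuming a deliberately tiny JSON subset (arrays and non-negative integers only, no whitespace, commas not validated). The function name `parse_nested_arrays` is hypothetical, and this is not how any particular gem does it; it only shows the open containers living in an ordinary heap-allocated Array instead of on the call stack.

```ruby
# Parses nested arrays of non-negative integers without recursion.
# `stack` holds the arrays that are still open, so nesting depth is limited
# by available memory rather than by the call stack.
def parse_nested_arrays(text)
  stack = []
  result = nil
  i = 0
  while i < text.length
    ch = text[i]
    case ch
    when "["
      arr = []
      stack.last << arr unless stack.empty?  # attach to the enclosing array
      stack.push(arr)
      i += 1
    when "]"
      raise ArgumentError, "unbalanced ']' at #{i}" if stack.empty?
      result = stack.pop                     # last array closed so far
      i += 1
    when ","
      i += 1                                 # separators are skipped, not checked
    when /[0-9]/
      raise ArgumentError, "number outside array at #{i}" if stack.empty?
      j = i
      j += 1 while j < text.length && text[j].match?(/[0-9]/)
      stack.last << text[i...j].to_i
      i = j
    else
      raise ArgumentError, "unexpected #{ch.inspect} at #{i}"
    end
  end
  raise ArgumentError, "unclosed '['" unless stack.empty?
  result
end

parse_nested_arrays("[1,[2,3],[]]")                 # => [1, [2, 3], []]
deep = parse_nested_arrays("[" * 100_000 + "]" * 100_000)
```

Note that getting the structure parsed is only half of it: anything that later walks `deep` recursively (comparing, printing, serializing) can still run out of stack.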
Nope, it stops itself at 100 levels with a JSON::NestingError.

If you run it as JSON.parse(ARGF.read, max_nesting: false) you get 65492 instead of 101.

> the standard Ruby "json" gem can be made to overflow on just 202 bytes of valid JSON input.

It is not overflowing; it is aborting with a proper error.

You can argue whether 100 is a reasonable default, but I think it is not too stupid to have a maximum depth here and bail out as soon as possible. Because what will happen if you accept arbitrary nesting? Then the next guy in the chain, who actually works with the parsed json tree, will also have to handle arbitrary nesting. And if you are not careful, you end up with an error deep in some quick and dirty user-written code (which might actually be exploitable), instead of not accepting the json in the first place.

I would say you can think of the 100 as some kind of input sanitization.

    max_nesting: The maximum depth of nesting allowed in the parsed data structures.
    Disable depth checking with :max_nesting => false. It defaults to 100.
Referring to it in bytes purely serves to spread FUD. It's >100 levels of nesting which is a bit silly.
I did that intentionally, and not because FUD. I hear ">100 levels of nesting", and think "huge document", which as demonstrated here isn't necessarily true. So I said "202 bytes" to emphasize "not huge document"; to emphasize that deep nesting doesn't really imply much about document size.
100+ levels of nesting implies nothing about the size of a JSON file, just that it's an extremely niche use case. OTOH almost all JSON files are 202+ bytes. Couple that with the title and it just leads to confusion.
>niche use case

Like an attack.

Why 202 bytes? Wouldn’t 101 [ characters suffice?
But then it wouldn't be valid JSON.
But it could still cause a naive parser to fall over rather than report the json as invalid.