Hacker News new | ask | show | jobs
by jedschmidt 3750 days ago
Sure, as it could with any part of your app, including wherever your JSON.parse code lives.
2 comments

But could JSON.parse() fed with malicious data fire off an XMLHttpRequest or delete all of your data?
No. That's the whole point of using JSON.parse() instead of eval(). JSON is defined as a non-executing subset of JavaScipt syntax, one that contains only literal expressions. JSON.parse() will only parse valid JSON.

This is why, for instance, there's no native Date format in JSON. Dates in JavaScript require running a constructor -- new Date() -- so they aren't in JSON.

That would concern me too potentially, though if there isn't one already it should be easy to implement a switch that would turn this part of the behaviour off.

It looks very interesting to me from the point of view of dealing with certain data types better (at all, in fact), and handling circular references.

> it should be easy to implement a switch that would turn this part of the behaviour off

I think it's one of those "easy on the surface, but surprisingly complex" problems.

When you start allowing arbitrary code execution, it's a lot of work to prevent certain "functions" (i mean that in the non-programming way).

I was thinking more along the lines of turning it off at the generation side: only including anything in the persisted state that is safe.

Trying to detect the unsafe parts so they could be turned off after that point would be impossible as you suspect (the task would be constrained by the halting problem, https://en.wikipedia.org/wiki/Halting_problem).

It seems like it theoretically shouldn't be too hard to add the ability to validate that the data is a valid lave output if you're concerned about that. That more or less leaves only the issue of anonymous functions in the data being replaced with malicious functions, but frankly the only reason to be serializing functions is if they're user input, otherwise you should instead be serializing function name/key strings or some other well-defined form of function references and/or arguments.
Any type of code-serialization tool will be vulnerable to injection. This is why use of pickle is often discouraged in Python, in favor of serialization formats which don't deserialize to code. Anything that marks "valid output of the tool" could just as easily be produced by an attacker who uses the tool to serialize their malicious code, and even signing/secret-token systems aren't a guarantee since it's so incredibly easy to build or use them the wrong way.
I meant that your code could parse the supposed lave code before running it to verify that it is limited to the known lave constructs (which does not include arbitrary code execution). It would quite slow but enough to make it somewhat safe against an attacker providing malicious lave code.
If lave generates a well defined sublanguage, I don't think parsing would need to be much slower than parsing JSON. It would just be an extended JSON parser that happens to parse executable JavaScript.
OK, so put a ring on it.

And by ring, I of course mean HMAC.