by ilcavero 4983 days ago
so, how do I protect myself against this?
4 comments

Assign a limited memory pool to the parser, it's in the article.
Most of the XML libraries already have the fix.
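For anyone who wants something concrete: a minimal sketch of one parser-level guard, assuming the attack being discussed is entity expansion. This is not the memory pool from the article; it is a blunter, swapped-in technique that refuses any document that declares an entity at all, using Python's standard `xml.parsers.expat` module. The names `EntitiesForbidden` and `parse_without_entities` are made up for illustration.

```python
import xml.parsers.expat


class EntitiesForbidden(ValueError):
    """Raised when a document declares an entity (hypothetical name)."""


def parse_without_entities(data: bytes) -> list[str]:
    """Parse XML and return the element names seen, refusing entity declarations."""
    names: list[str] = []
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # Abort as soon as the DTD declares any entity, before it can be expanded.
        raise EntitiesForbidden(f"entity declaration not allowed: {name}")

    parser.EntityDeclHandler = on_entity_decl
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)  # exceptions raised in handlers propagate from here
    return names
```

The trade-off is that legitimate documents with custom entities get rejected too, so this only fits if you never need them.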
Stop using XML.
Use a parser or data structure that doesn't duplicate identical objects.

Functional programming wins here.

You might, but the expansion happens at a level where that is really hard to implement if you want to keep XML semantics. E.g. in the DOM API, each element has its own identity: you can't collapse identical objects (or you could, but the objects generated this way aren't actually identical; they have different parent nodes, for example).
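To illustrate the identity point, a small sketch using Python's `xml.dom.minidom` (the document string is just an example): two elements with identical markup are still separate nodes, each with its own parent.

```python
from xml.dom.minidom import parseString

doc = parseString("<root><a><x/></a><b><x/></b></root>")
first, second = doc.getElementsByTagName("x")

same_markup = first.toxml() == second.toxml()  # True: both serialize the same
same_node = first is second                    # False: distinct node objects
parents = (first.parentNode.tagName, second.parentNode.tagName)  # ("a", "b")
```

Collapsing the two `x` nodes into one shared object would leave `parentNode` with no single correct answer.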
That's hardly a fool-proof solution, because you still end up with a document that has a very large logical size (even if you represent it in a very compact way in memory), and any non-trivial operation on the resulting string that needs to evaluate it character by character (e.g. a substring search) will still take "forever" to complete.
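A rough sketch of that gap, assuming ten-way nesting of a three-character leaf (the helper names `build`, `logical_length`, and `chars` are invented for this example). The shared structure stays tiny in memory, and its expanded length can be computed cheaply, but anything that walks the text character by character still pays for the full logical size.

```python
from typing import Iterator, Union

Node = Union[str, tuple]  # a leaf string, or a tuple of child nodes


def build(levels: int, fanout: int = 10, leaf: str = "lol") -> Node:
    """Build nested nodes where each level reuses ONE child object `fanout` times."""
    node: Node = leaf
    for _ in range(levels):
        node = (node,) * fanout  # `fanout` references to the same object
    return node


def logical_length(node: Node) -> int:
    """Length of the fully expanded text, computed without expanding it."""
    cache: dict[int, int] = {}

    def walk(n: Node) -> int:
        if isinstance(n, str):
            return len(n)
        key = id(n)
        if key not in cache:  # each shared tuple is measured only once
            cache[key] = sum(walk(child) for child in n)
        return cache[key]

    return walk(node)


def chars(node: Node) -> Iterator[str]:
    """Yield the expanded text one character at a time; cost grows with logical size."""
    if isinstance(node, str):
        yield from node
    else:
        for child in node:
            yield from chars(child)


big = build(9)                             # only 9 tuples of 10 references each
size = logical_length(big)                 # 3_000_000_000 characters, found instantly
small_count = sum(1 for _ in chars(build(2)))  # 300; same walk over `big` takes 3e9 steps
```

So sharing fixes the memory blow-up but not the time blow-up for a naive substring search over the expanded text.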