| HN Mirror



	by jerf 4493 days ago
	Ironically, using an existing parser is what opens you to this vulnerability in the first place. If you hack your own together based on a vague idea of what XML really is, you're very unlikely to "correctly" handle entities; you'll probably just put in enough to handle simple XHTML entities, and that makes you immune to this problem! It's the compliant parsers that are vulnerable to this.
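
To make the point above concrete, here is a minimal sketch of the kind of hand-rolled entity handling being described. It assumes the decoder only knows the five predefined XML entities; the names `decodeEntities` and `predefined` are hypothetical and not taken from any library. Because it never looks at a DTD, a reference such as `&xxe;` is left as literal text instead of being expanded or fetched.

```haskell
-- Hypothetical table: only the five predefined XML entities are known.
predefined :: [(String, Char)]
predefined =
  [ ("lt", '<'), ("gt", '>'), ("amp", '&'), ("quot", '"'), ("apos", '\'') ]

-- Replace known entity references; leave everything else untouched.
-- No DTD is consulted, so custom or external entities are never resolved.
decodeEntities :: String -> String
decodeEntities [] = []
decodeEntities ('&' : rest) =
  case break (== ';') rest of
    (name, ';' : after)
      | Just c <- lookup name predefined -> c : decodeEntities after
    -- Unknown or unterminated reference: keep the '&' as literal text.
    _ -> '&' : decodeEntities rest
decodeEntities (c : cs) = c : decodeEntities cs
```

This is deliberately non-compliant, which is exactly why it cannot be tricked into entity expansion.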


Peaker 4493 days ago

Or, if you use existing parsers in a language like Haskell, you know parsing is supposed to be a pure function. If parsing suddenly requires IO effects, you can be suspicious and try to figure out what is going on.
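
As a rough illustration of what "requires IO effects" looks like in a type signature, here is a small sketch. The names `Document`, `parsePure`, and `parseResolving`, and the toy convention that input starting with `@` is an external reference, are all assumptions made up for this example and do not come from a real XML library.

```haskell
-- Hypothetical document type for the sketch.
newtype Document = Document String
  deriving (Eq, Show)

-- A pure parser: the type promises it cannot touch files or the network.
parsePure :: String -> Either String Document
parsePure "" = Left "empty input"
parsePure s  = Right (Document s)

-- A parser that may follow external references. The caller must supply
-- a fetch action, and the result lives in IO, so the effect is visible
-- to anyone reading the signature.
parseResolving :: (String -> IO String) -> String -> IO (Either String Document)
parseResolving fetch ('@' : ref) = do
  body <- fetch ref          -- external lookup happens here, in IO
  pure (parsePure body)
parseResolving _ s = pure (parsePure s)
```

Seeing `IO` in the second signature is the cue to ask what the parser is reading or fetching.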


gamegoblin 4493 days ago

Even with Haskell, someone could sneak in an unsafePerformIO call if you aren't careful. Of course, this is trivial to detect with compiler flags, etc.


Peaker 4493 days ago

We're not talking about a malicious XML library here, though. We're talking about a misunderstanding regarding what happens during legitimate parsing of XML.


gamegoblin 4493 days ago

I was just responding to you about pure functions. You can make a Haskell function with a pure type signature that includes a call to unsafePerformIO.
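
For reference, a minimal sketch of such a function. The name `resolveEntity` and the idea of reading an environment variable as a stand-in for an external lookup are assumptions for illustration only; `unsafePerformIO` and `lookupEnv` are the real functions from `base`.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)
import System.IO.Unsafe (unsafePerformIO)

-- The type says "pure function from String to String", but the body
-- consults the process environment behind the type checker's back.
{-# NOINLINE resolveEntity #-}
resolveEntity :: String -> String
resolveEntity name = unsafePerformIO $ do
  val <- lookupEnv name        -- hidden side effect
  pure (fromMaybe "" val)
```

Nothing in the signature warns the caller, which is the concern raised here.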


Peaker 4493 days ago

You can, but:

A) Legitimate libraries don't (unless the IO action is in fact pure)

B) Rogue libraries that do this will not generally work: laziness, optimizations, RTS races can all make the IO action run 0..N times, arbitrarily.

C) It doesn't change the fact that in Haskell, the XML library exposes the weird XML behavior of looking up external entities by being in IO (my original point) -- because of A.
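
Point B can be sketched with a counter. This is only an illustration under assumed names (`fetchCount`, `sneakyFetch`, `unforcedDemo` are hypothetical): a value built with `unsafePerformIO` is just a thunk, so if nothing ever demands it, the hidden action never runs.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Global counter recording how often the hidden action actually ran.
{-# NOINLINE fetchCount #-}
fetchCount :: IORef Int
fetchCount = unsafePerformIO (newIORef 0)

-- Pure-looking function that secretly bumps the counter.
{-# NOINLINE sneakyFetch #-}
sneakyFetch :: String -> String
sneakyFetch url = unsafePerformIO $ do
  modifyIORef' fetchCount (+ 1)
  pure ("contents of " ++ url)

-- The result is bound but never demanded, so laziness skips the action
-- entirely and the counter is left unchanged.
unforcedDemo :: IO Int
unforcedDemo = do
  let _body = sneakyFetch "http://example.invalid/entity"
  readIORef fetchCount
```

How many times the action runs once the value is demanded depends on sharing and on what the optimizer does, which is why code relying on it is fragile.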


jrockway 4493 days ago

More likely, they'll just write bindings to libxml2.


jmillikin 4492 days ago

I wrote a libxml2 binding in Haskell (http://hackage.haskell.org/package/libxml-sax). It was an absolute nightmare, in part because handling entities safely requires a lot of hoop-jumping (and I'm not even 100% sure I caught all the places libxml2 does unsafe stuff).


jrockway 4491 days ago

"absolute nightmare" sounds like you did pretty well for libxml2.
