Hacker News
by Mikhail_Edoshin 2060 days ago
XPath has always been extensible, at least at the implementation level. E.g. in 'lxml' it's trivial to add XPath functions written in Python. Homegrown, of course, but still possible. Apart from extension elements, this is about the only way to hook XSLT into the rest of the system. How else is one supposed to read environment variables from XSLT? The only other way is to pass everything in as parameters on the command line.
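To make that concrete, here is a minimal sketch of an lxml extension function that reads an environment variable. The namespace URI `urn:example:env`, the function name `getenv`, and the helper `eval_with_env` are made up for this example. Only the `namespaces=` and `extensions=` arguments of `xpath()` are lxml's own.

```python
import os
from lxml import etree

# Hypothetical namespace URI, chosen only for this sketch.
ENV_NS = "urn:example:env"


def xpath_getenv(context, name):
    """XPath extension: return environment variable `name`, or '' if unset."""
    # lxml passes an evaluation context first, then the XPath arguments.
    return os.environ.get(name, "")


def eval_with_env(node, expr):
    """Evaluate `expr` against `node` with env:getenv() available."""
    return node.xpath(
        expr,
        namespaces={"env": ENV_NS},
        extensions={(ENV_NS, "getenv"): xpath_getenv},
    )


doc = etree.fromstring("<config><mode>debug</mode></config>")
os.environ["APP_MODE"] = "debug"

# Compare a value from the document with a value from the environment.
same = eval_with_env(doc, "string(mode) = env:getenv('APP_MODE')")
```

The function is passed per call here rather than registered globally, so it is only visible to expressions evaluated through `eval_with_env`.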

It's insecure to run untrusted XPath, but isn't it the same with untrusted anything? A good solution here could be a way to sandbox such XPath, i.e. to limit which functions can be called, the same way it's done with XML, where you can forbid the processor from using the network or accessing arbitrary files on a case-by-case basis.
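A rough sketch of what such a sandbox could look like with lxml, assuming extension functions are never registered globally and are only handed to the evaluator through an allow-list. The namespace `urn:example:ext`, the functions `upper` and `readfile`, and the helpers `parse_untrusted` and `evaluate` are all hypothetical. This only limits extension functions. It is not a complete security boundary.

```python
from lxml import etree

# Hypothetical namespace for the homegrown extension functions.
EXT_NS = "urn:example:ext"


def _upper(context, text):
    """Harmless helper: upper-case a string."""
    return text.upper()


def _readfile(context, path):
    """Powerful helper: read an arbitrary file. Trusted callers only."""
    with open(path) as fh:
        return fh.read()


# Everything that exists; what is exposed is decided per evaluation.
ALL_FUNCTIONS = {"upper": _upper, "readfile": _readfile}


def parse_untrusted(text):
    """Parse XML with network access and entity expansion switched off."""
    parser = etree.XMLParser(no_network=True, resolve_entities=False)
    return etree.fromstring(text, parser)


def evaluate(node, expr, allowed):
    """Evaluate `expr`, exposing only the extension functions named in `allowed`."""
    extensions = {
        (EXT_NS, name): func
        for name, func in ALL_FUNCTIONS.items()
        if name in allowed
    }
    return node.xpath(expr, namespaces={"x": EXT_NS}, extensions=extensions)


doc = parse_untrusted("<doc><name>abc</name></doc>")
shout = evaluate(doc, "x:upper(string(/doc/name))", allowed={"upper"})
```

A call to a function that is not on the allow-list, e.g. `x:readfile(...)` with `allowed={"upper"}`, should fail as an unregistered function with an `etree.XPathError` instead of running.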

1 comment

> How else is one supposed to read environment variables from XSLT?

Setting aside whether it’s even a good idea to allow XSLT to do that, XPath is only one part of XSLT, so you’re just changing the subject. The “path” in XPath should be a hint at what it’s supposed to be: a query language for selecting nodes by path in XML documents, not an alternative to Awk or Perl.

I'd say XPath is a way to get a node-set, or a value of another XPath type, out of something. The current date, for example, is not selected from a document. There will always be a need to get yet another thing as a node-set, e.g. to list a directory. And for boolean expressions there will always be a need to test yet another thing, such as an environment variable.

These things, of course, should come as extension functions rather than special syntax, but then there will be a need to provide a small standard library of such functions :)

So yes, I believe it's useful if we're going to use XPath in a trusted environment, e.g. as a typical command-line tool. You won't deny Bash or Python this and other powerful abilities, will you? But of course it would be very unwise to run an untrusted Bash script.