by spiralx 1745 days ago
The element(s) before an element: //h3/preceding-sibling::p[1]
Match something's parent: //title/..
Match all ancestors: //title[@id = 'abc']/ancestor::comment

Element with src or href attr: //*[@src or @href]
Multiple conditions: //article[@state = "approved" and not(comments/comment)]

Element with more than two children: //ul[count(li) > 2]
Element with matching descendants: //article[.//video]

Element text containing substring: //p[contains(text(), "Foo")]
Attribute ending with substring (XPath 2.0+): //a[ends-with(@href, ".jpg")]

Numerical attribute selection: //product[@price > round(2.5 * @discount)]
//product[sum(//*[starts-with(name(), 'price-')]/@price) > 0]

Attribute values: //a/@href
Text values with spaces normalised: //a/normalize-space(text())

Match all attributes or elements or text nodes: //user/@* or //user/node() or //user/text() or //user/comment()
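To make a few of the expressions above concrete, here is a rough Python sketch. It assumes only the standard library's xml.etree.ElementTree, which implements a small subset of XPath, so predicates like count(), contains() and ends-with() are emulated in plain Python rather than evaluated as XPath. The sample document and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """
<root>
  <article state="approved"><title id="abc">One</title><video/></article>
  <article state="approved"><comments><comment>hi</comment></comments></article>
  <ul id="short"><li/><li/></ul>
  <ul id="long"><li/><li/><li/></ul>
  <a href="cat.jpg">Cat</a>
  <a href="page.html">Foo page</a>
</root>
"""
root = ET.fromstring(DOC)

# //title/..  : ElementTree supports ".." directly.
parents = [el.tag for el in root.findall(".//title/..")]

# //ul[count(li) > 2]  : count() is outside ElementTree's subset, so emulate it.
long_lists = [ul.get("id") for ul in root.iter("ul") if len(ul.findall("li")) > 2]

# //article[@state = "approved" and not(comments/comment)]
approved_quiet = [
    a for a in root.findall(".//article[@state='approved']")
    if a.find("comments/comment") is None
]

# //a[ends-with(@href, ".jpg")]/@href
jpg_links = [a.get("href") for a in root.iter("a") if a.get("href", "").endswith(".jpg")]

# //a[contains(text(), "Foo")]/@href
foo_links = [a.get("href") for a in root.iter("a") if "Foo" in (a.text or "")]
```

With a full XPath engine each of these would be a single expression, which is rather the point: the filtering lives in the query language instead of in host-language loops.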

Basically, from any node in a document you can select its ancestors, children, descendants, siblings, attributes etc., and filtering has the same power as selecting. In CSS there's :not(), which can apply to selection or filtering, with :has() finally on the way and no :or(). CSS selectors match against HTML elements and they're great for that almost all of the time, but while you can filter by attribute value, including by prefix, suffix or substring, for text there's only :empty.

But for a query syntax you need to be able to select attributes and text content as well as elements. Either extend XPath to support #id and .class syntax

//#user-xyz//note/text()
//code.language-js/@name

or extend CSS to at least allow selecting attrs and text

#user-xyz note :text
code.language-js @name

The former is more powerful, the latter a quick hack (if they only appear at the end of the selector anyway) with instant payoff.
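As a rough illustration of the quick hack, here is a minimal Python sketch of a selector function that accepts a trailing @name or :text. Everything in it is an assumption for illustration: select() and matches() are hypothetical names, only descendant combinators and tag / #id / .class parts are handled, and :text returns all descendant text, which is looser than XPath's text().

```python
import re
import xml.etree.ElementTree as ET

# One part of a compound selector: "tag", "#id" or ".class".
PART = re.compile(r"([#.]?)([\w-]+)")

def matches(el, compound):
    """Check an element against a compound such as 'code.language-js'."""
    for kind, name in PART.findall(compound):
        if kind == "#" and el.get("id") != name:
            return False
        if kind == "." and name not in el.get("class", "").split():
            return False
        if kind == "" and el.tag != name:
            return False
    return True

def select(root, selector):
    """Hypothetical helper: descendant selectors plus a trailing @attr or :text."""
    *steps, last = selector.split()
    extractor = last if last.startswith("@") or last == ":text" else None
    if extractor is None:
        steps.append(last)

    current = None
    for step in steps:
        if current is None:
            pool = list(root.iter())  # first step may match the root itself
        else:
            pool = [d for ctx in current for d in ctx.iter() if d is not ctx]
        current = []
        for el in pool:
            if matches(el, step) and el not in current:
                current.append(el)
    if current is None:
        current = [root]

    if extractor is None:
        return current
    if extractor == ":text":
        return ["".join(el.itertext()) for el in current]
    name = extractor[1:]
    return [el.get(name) for el in current if el.get(name) is not None]

# Hypothetical sample document, invented for this illustration.
DOC = """
<root>
  <user id="user-xyz"><note>first</note><group><note>second</note></group></user>
  <user id="user-abc"><note>other</note></user>
  <code class="language-js hl" name="snippet-1"/>
  <code class="language-py" name="snippet-2"/>
</root>
"""
root = ET.fromstring(DOC)
notes = select(root, "#user-xyz note :text")     # ['first', 'second']
names = select(root, "code.language-js @name")   # ['snippet-1']
```

Because the extractor can only appear as the last token, the element-matching part stays ordinary CSS and the extension is just a final mapping step over the matched elements.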