Hacker News new | ask | show | jobs
by vbezhenar 129 days ago
An interesting thing is that most webpages are generated using text templates. There's some text processing like escaping special characters, but it's mostly text that happened to be (somewhat) valid HTML.

So extracting information from this text with regexps often makes perfect sense.