by alexsb92 5197 days ago
But in that case, wouldn't you just be trying to extract the essence of an HTML document, the useful plain text? If so, wouldn't parsing with regular expressions or something similar be better than NLP? I haven't really done any scraping or parsing of documents/text, so I'm not too sure.

It's possible, yeah, though I like the formatting, highlighting, borders, etc.; they group the different sections of the instructions together.

I see what you mean, though. It's not really full NLP either way. I just used that term in place of regular expressions because I learned about them in the NLP class (the first homework is a phone and email scraper). Probably my fault for using the wrong term.
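
For a sense of what a regex-only phone and email scraper can look like, here is a minimal Python sketch. The patterns, the function name scrape_contacts, and the sample text are all assumptions for illustration, not the actual homework spec or solution, and the patterns are deliberately simple (US-style 10-digit numbers, plain addresses with no obfuscation).

```python
import re

# Illustrative patterns only (assumed, not from the homework):
# a plain email address, and a US-style 10-digit phone number.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})\b")


def scrape_contacts(text):
    """Return (emails, phones) found in text; phones are normalized to NNN-NNN-NNNN."""
    emails = EMAIL_RE.findall(text)
    phones = ["-".join(m.groups()) for m in PHONE_RE.finditer(text)]
    return emails, phones


sample = "Email jane.doe@example.edu or call (650) 723-0293, fax 650.723.0294"
emails, phones = scrape_contacts(sample)
print(emails)
print(phones)
```

Nothing here understands the text; it only matches character patterns, which is why it's a stretch to call it NLP.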