Hacker News
by
sebazzz
2432 days ago
Wouldn't any decent HTML library, probably already used by the crawler, convert that back to plain text?
onion2k
2432 days ago
If you're crawling the web looking for email addresses, you're probably not bothering to parse the HTML. You don't need to: you can just grab the email from the raw response from the web server, along with any new links to follow.
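
To make the two positions concrete, here is a rough Python sketch. It assumes the obfuscation under discussion is HTML entity encoding (e.g. `&#64;` for `@`), which the thread does not spell out. The regexes, the function names `scrape_raw` and `scrape_decoded`, and the sample markup are all made up for illustration and are far simpler than anything a real crawler would use.

```python
import html
import re

# Deliberately naive patterns, for illustration only (assumptions, not a real crawler's rules).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HREF_RE = re.compile(r'href=["\']([^"\']+)["\']', re.IGNORECASE)


def scrape_raw(body):
    """Grab emails and links straight from the raw response text, with no HTML parsing."""
    return EMAIL_RE.findall(body), HREF_RE.findall(body)


def scrape_decoded(body):
    """Same regex scan, but decode HTML character references first."""
    return scrape_raw(html.unescape(body))


# Hypothetical page fragment: one plain address, one entity-encoded address.
SAMPLE = (
    '<a href="/contact">Contact</a> '
    '<p>plain@example.com</p> '
    '<p>hidden&#64;example.org</p>'
)

if __name__ == "__main__":
    print(scrape_raw(SAMPLE))      # the entity-encoded address is missed
    print(scrape_decoded(SAMPLE))  # both addresses are found
```

Under these assumptions, the raw scan is what onion2k describes, and it skips the encoded address. A single `html.unescape` call before the scan recovers it without building a full parse tree, which is roughly sebazzz's point.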