Personally I don't care about the difference between your two examples because a published URL is still a published URL regardless of whether it is a hyperlink or not.
However where I do care about the difference between an anchor and a paragraph block is with relative links or other URLs without the protocol prefix as those are harder to programmatically guess what is a web URL and what is an example file system path in a technical document (for example). In an ideal world the JavaScript, CSS and any web-APIs (eg JSON returns) would be executed locally to check any modern way of abstracting away URLs (page redirection et al). But that's not to say there isn't a place for a less sophisticated parser (though I would say that as I've also written a link checker similar to the one posted hehehe).
Like, extreme performance, or just parallelism? One example of parallelism: xargs -a urls.txt -n 5 -P 20 wget -nv --spider -T 10 -e robots=off. This will run up to 20 processes with 5 URLs each. It's not "efficient" but it's faster than nothing, and you get the whole feature set of Wget.
Here's an interesting read on crawling a quarter billion pages in 40 hours: http://www.michaelnielsen.org/ddi/how-to-crawl-a-quarter-bil... From my own experience crawling massive dynamic state-driven websites, even if you're trying to just grab a single page, you will eventually want the extra features.