One of my websites has links to nytimes.com. They work fine when clicked manually, but Bernard reports them as 403 (Forbidden). I wonder if NYT is classifying Bernard as a scraper?
The Internet is a wild place, and I reckon 90% of the complexity of a crawler is workarounds for quirky and non-compliant servers (cough www.apple.com cough).
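One rough way to test the scraper hypothesis is to request the same URL twice, once with a crawler-style User-Agent and once with a browser-style one, and compare the status codes. The sketch below uses only the Python standard library. The names `build_request`, `fetch_status`, `looks_like_bot_block` and `compare`, and both User-Agent strings, are made up for illustration; I don't know what Bernard actually sends, so substitute its real header value.

```python
import urllib.error
import urllib.request

# Hypothetical values: replace CRAWLER_UA with whatever Bernard really sends.
CRAWLER_UA = "ExampleCrawler/1.0"
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0"
)


def build_request(url, user_agent):
    """Build a GET request that carries the given User-Agent header."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="GET"
    )


def fetch_status(url, user_agent, timeout=10.0):
    """Return the HTTP status code for url, including 4xx/5xx responses."""
    req = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on error statuses; the code is still what we want.
        return err.code


def looks_like_bot_block(crawler_status, browser_status):
    """True when the crawler gets 403 but a browser-like request succeeds."""
    return crawler_status == 403 and 200 <= browser_status < 400


def compare(url):
    """Fetch url with both User-Agents and report whether they disagree."""
    crawler_status = fetch_status(url, CRAWLER_UA)
    browser_status = fetch_status(url, BROWSER_UA)
    return crawler_status, browser_status, looks_like_bot_block(
        crawler_status, browser_status
    )
```

Calling `compare("https://www.nytimes.com/")` returns both status codes plus a flag. If the flag is true, the User-Agent is a likely trigger. If both requests get 403, the site may be keying on something else (IP reputation, missing cookies, a JavaScript challenge), so this check can support the hypothesis but not rule it out.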