by annowiki 851 days ago
How do you get around 403s/401s from WSJ/Reuters/Axios? I've tried user agent manipulation, and it seems like I'd have to use Selenium with a headless browser to deal with them.
2 comments

I have noticed that sometimes you also need an "Accept: text/html" header.
If curl-impersonate works where plain curl gets blocked, it's probably TLS fingerprinting.
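
For what it's worth, here is a rough sketch of the header approach from the comments above, using only the Python standard library. The header values and the helper names (BROWSER_HEADERS, build_request, fetch_status) are made up for illustration, not taken from the thread. It only changes HTTP headers, so if the block really is TLS fingerprinting this won't help and you'd still need something like curl-impersonate or a real browser.

```python
import urllib.error
import urllib.request

# Illustrative browser-like headers (assumed values, adjust as needed).
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Return a Request carrying browser-like headers (no network access)."""
    return urllib.request.Request(url, headers=dict(headers))


def fetch_status(url, timeout=10.0):
    """Fetch the URL and return the HTTP status code, including 401/403."""
    req = build_request(url)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is what we want to see.
        return err.code
```

Calling fetch_status on a URL with and without the extra headers is a quick way to check whether headers alone make a difference. If both calls return 403, the headers probably aren't what the site is keying on.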