by franga2000
1655 days ago
A big part of this is that the destinations of outbound connections are constantly changing. A while ago I tried to firewall a web scraper that had to execute some untrusted JS as part of its operation. It sounded easy: it only ever connects to one site, so just let that pass through. But that site used shared hosting of some kind, and its IP address would change on a surprisingly regular basis. The scraper didn't care at all since it used DNS, but firewall rules that match on IP addresses can't follow a hostname like that. The solution we ended up implementing was to run the scraper through a local HTTP proxy, block all other connections, then use the proxy's config to whitelist the site by the Host header.
This, of course, meant doing SSL stripping on the proxy (for HTTPS, the Host header is only visible once TLS is terminated there), which was only acceptable because the proxy was ours. If a hosting provider suggested something like this, we'd laugh them out of the room.
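
For illustration, the Host-header check at the proxy could look roughly like this in Python. This is only a sketch, not the actual proxy config described above: `ALLOWED_HOSTS`, `example.com` and `host_allowed` are made-up names, and a real proxy would normally do this through its own ACL settings.

```python
# Hypothetical allowlist; the comment above does not name the real site.
ALLOWED_HOSTS = {"example.com"}


def host_allowed(host_header: str, allowed=ALLOWED_HOSTS) -> bool:
    """Return True if the Host header names an allowlisted site."""
    host = host_header.strip().lower()
    # Bracketed IPv6 literals are never allowlisted by name.
    if host.startswith("["):
        return False
    # Drop an optional ":port" suffix, e.g. "example.com:443".
    if ":" in host:
        host = host.rsplit(":", 1)[0]
    # A trailing dot marks a fully qualified name; treat it as the same host.
    host = host.rstrip(".")
    # Exact match only, so "example.com.evil.net" does not slip through.
    return host in allowed
```

The match is deliberately exact rather than a suffix or substring test, since the point is to let one known site through and refuse everything else.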