by bryanrasmussen
2066 days ago
There are really two sorts of web scraping:

Brute or generic scraping: you need to be able to scrape any site and get the data into your organization to serve to your customers, so you probably don't care about manipulating things at the string level, but you do care about having something that can handle a JS-based site. Here you don't make money from the individual scrapes but from being able to have everything for everyone, so you can't afford to spend much extra development effort on any one site, because scraping that site in itself probably isn't worth much money to you.

Bespoke scraping: here you care about being able to extract data at a very atomic level, and you need string manipulation and everything else. You probably make money on each individual site scraped, because the sites have been strategically chosen to enhance a product. For example, you have a product serving the legal needs of everyone in the EU but you want to expand into all EEA / EFTA countries; each legal info site you adapt your scraper for is worth lots of money, and you put developer effort into getting things at a granular data level that matches your data model of legal information.

on edit: changed minimal to atomic
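To make the contrast concrete, here is a rough Python sketch of the two approaches, using only the standard library. Everything in it is an assumption for illustration: the `LegalDocument` model, the "Case C-123/19: title" heading format, and the function names are hypothetical, and the generic path here does not render JavaScript (a real generic scraper would most likely sit behind a headless browser).

```python
import re
from dataclasses import dataclass
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Generic path: collect visible text from any page, with no site knowledge."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def generic_scrape(html: str) -> str:
    """Works on any site, but returns one undifferentiated blob of text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


@dataclass
class LegalDocument:
    """Hypothetical data model for the bespoke path."""
    case_number: str
    title: str


# Hypothetical patterns tuned to ONE site's markup and heading format.
H1 = re.compile(r"<h1[^>]*>(.*?)</h1>", re.S | re.I)
CASE_HEADING = re.compile(r"Case\s+([A-Z]-\d+/\d{2}):\s*(.+)")


def bespoke_scrape(html: str) -> LegalDocument:
    """Site-specific: string-level extraction into the product's data model."""
    h1 = H1.search(html)
    if h1 is None:
        raise ValueError("no <h1> found; site layout may have changed")
    heading = CASE_HEADING.fullmatch(h1.group(1).strip())
    if heading is None:
        raise ValueError("heading does not match the expected case format")
    case_number, title = heading.groups()
    return LegalDocument(case_number=case_number, title=title.strip())


SAMPLE = (
    "<html><head><style>h1{color:red}</style></head><body>"
    "<h1>Case C-123/19: Data retention</h1><p>Judgment text.</p>"
    "<script>var x=1;</script></body></html>"
)

if __name__ == "__main__":
    print(generic_scrape(SAMPLE))
    print(bespoke_scrape(SAMPLE))
```

The generic function costs nothing extra per site but gives you no fields to work with; the bespoke one breaks the moment the site changes its heading format, which is the per-site developer effort you are paying for.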