by Smerity
4418 days ago
For only one million web pages, the job would likely be quite cheap. The Common Crawl corpus is hundreds of millions of pages and, given the right setup, costs only $10 to $100 to process for relatively light work such as entity extraction. Heavier operations, such as parsing with NLP tools, will cost more.
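As a rough back-of-envelope sketch of that claim, the snippet below scales the $10 to $100 whole-corpus range down to one million pages. The 500 million page count is an assumption standing in for "hundreds of millions", the function name `scaled_cost` is made up for illustration, and linear scaling ignores fixed overhead such as cluster startup, so treat the result as an order-of-magnitude guess only.

```python
def scaled_cost(pages, corpus_pages, corpus_cost_low, corpus_cost_high):
    """Linearly scale a whole-corpus cost range down to a smaller page count."""
    fraction = pages / corpus_pages
    return corpus_cost_low * fraction, corpus_cost_high * fraction


# Assumption: 500 million pages stands in for "hundreds of millions".
ASSUMED_CORPUS_PAGES = 500_000_000

low, high = scaled_cost(1_000_000, ASSUMED_CORPUS_PAGES, 10.0, 100.0)
print(f"Estimated cost for one million pages: ${low:.2f} to ${high:.2f}")
```

Under these assumptions the light extraction job comes out at well under a dollar, which is why fixed costs are likely to dominate at this scale.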