by cushychicken
1667 days ago
Is that still practical even if you're storing the page text? The reason I don't do that is that I have a few functions that analyze the job descriptions for relevance but don't store the post text. I mostly did that to save space, since I'm just aggregating links to relevant roles, not hosting job posts. I figured saving ~1000 job descriptions would take up a needlessly large chunk of space, but truth be told I never did the math to check.

Edit: I understand Scrapy does something similar to what you're describing; I've considered using it as my scraper frontend but haven't gotten around to doing the work for it yet.
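
On the "never did the math" point, a rough back-of-envelope sketch is below. The ~1000 figure comes from the comment above; the 5 KiB average post size is purely an assumption for illustration, not a measurement, and the helper names are hypothetical. Swapping in a measured average from a handful of real posts would give a more trustworthy number.

```python
# Back-of-envelope storage estimate for saved job descriptions.
# AVG_POST_BYTES is an assumed value, not measured from any real scraper.
import zlib

NUM_POSTS = 1000            # "~1000 job descriptions" from the comment
AVG_POST_BYTES = 5 * 1024   # assumption: about 5 KiB of text per post


def estimate_raw_bytes(num_posts: int, avg_bytes: int) -> int:
    """Total uncompressed size if every post's text is stored."""
    return num_posts * avg_bytes


def compressed_size(text: str) -> int:
    """Size in bytes of one post's text after zlib compression."""
    return len(zlib.compress(text.encode("utf-8")))


raw = estimate_raw_bytes(NUM_POSTS, AVG_POST_BYTES)
print(f"raw: {raw / 1024 / 1024:.2f} MiB")  # prints "raw: 4.88 MiB"
```

Under that assumption the raw text comes to roughly 5 MiB, and `compressed_size` can be run on a real post to see how much compression would shave off.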