Has anyone tried this for careers pages? I'd be interested in how it performs on a random sample of ~50 Crunchbase NYC startups’ careers pages. I dunno how much time would have to be spent on training data...
We did :) It works on all kinds of pages. You just have to set it up on one page and it will work on all similar pages of the website. Did you have in mind training a model to recognise careers pages across websites?
Yeah, that would be really helpful. I want to monitor the careers pages of all local companies in the Crunchbase NYC geo, to help candidates search for local companies by keyword (e.g. C#). We already have an API (it syncs with Algolia) to receive the jobs, with a unique key on each job’s URI, and we wouldn’t want to scrape more than once per day.
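To make the setup above a bit more concrete, here is a rough Python sketch of the two constraints mentioned: jobs keyed uniquely by URI, and no page scraped more than once per day. Everything in it is an assumption for illustration only: the in-memory dict stands in for the real API/Algolia index, and the names `should_scrape`, `upsert_jobs`, `search_jobs` and the `uri`/`title` fields are hypothetical, not part of any existing service.

```python
from datetime import datetime, timedelta

# Assumption: one scrape per careers page per day, as described above.
SCRAPE_INTERVAL = timedelta(days=1)


def should_scrape(last_scraped, now):
    """True if the page was never scraped or the last scrape is at least a day old."""
    return last_scraped is None or now - last_scraped >= SCRAPE_INTERVAL


def upsert_jobs(store, jobs):
    """Insert or overwrite jobs keyed by URI; return how many URIs were new.

    `store` is a plain dict standing in for the real job index (hypothetical).
    """
    new_count = 0
    for job in jobs:
        uri = job["uri"]
        if uri not in store:
            new_count += 1
        store[uri] = job  # same URI overwrites, so re-scrapes never duplicate
    return new_count


def search_jobs(store, keyword):
    """Case-insensitive substring match on the job title (e.g. "C#")."""
    needle = keyword.lower()
    return [job for job in store.values() if needle in job["title"].lower()]
```

Because the URI is the key, scraping the same page again the next day only updates existing entries instead of creating duplicates, which is the same idea as using the URI as the unique record key in the search index.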