by fallingknife 410 days ago
Where is the cost coming from? Wouldn't a crawler mostly just be accessing cached static assets served by a CDN?

And what do you mean by your search infrastructure? Are you talking about Elasticsearch or some equivalent?

1 comment

No, in our case they were indexing job posts by sending search requests. I.e., instead of pulling down the JSON files of jobs, they would search for them by sending stuff like “New York City, New York software engineer” to our search. Those requests were generally not cached because the searches weren’t something humans would search for (they’d use the location dropdown).
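
A rough sketch of why that traffic pattern defeats a query cache, assuming an LRU cache keyed on the raw query string. Everything here (the `LRUCache` class, `backend_search`, and the query lists) is hypothetical and only illustrates the effect; it is not the actual system described above.

```python
from collections import OrderedDict
from itertools import product


class LRUCache:
    """Tiny LRU cache keyed on the raw query string (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        if key in self.store:
            self.store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = compute(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used
        return value


def backend_search(query):
    # Stand-in for an expensive query against the search cluster.
    return f"results for {query!r}"


def hit_rate(queries, capacity=100):
    cache = LRUCache(capacity)
    for q in queries:
        cache.get_or_compute(q, backend_search)
    return cache.hits / (cache.hits + cache.misses)


# Assumption: humans pick from a small dropdown, so queries repeat heavily.
dropdown = ["New York, NY", "Austin, TX", "Seattle, WA"]
human_queries = dropdown * 100  # 300 requests, 3 distinct queries

# Assumption: the crawler enumerates free-text combinations, each sent once.
cities = [f"City {i}" for i in range(30)]
titles = [f"job title {j}" for j in range(10)]
crawler_queries = [f"{c} {t}" for c, t in product(cities, titles)]  # 300 distinct

print(hit_rate(human_queries))    # 297 hits out of 300 requests
print(hit_rate(crawler_queries))  # no hits: every query is new
```

With the same number of requests, the repeated dropdown queries are served almost entirely from cache, while every enumerated crawler query falls through to the backend.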

I didn’t work on search, but yeah, something like Elasticsearch. Googlebot was a majority of our search traffic at times.