by jonshea
5127 days ago
We do the simplest thing we could imagine. Our sitemap of ~50,000,000 entries is written to static XML once a week as part of a batch job and pushed to S3. Is there any reason to believe it needs to be updated near real time? How often does Google read yours?
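
For anyone wanting to picture the batch job, here is a rough sketch of the file-writing half in Python. It is not our actual code: `chunked` and `render_sitemap` are made-up names, and the S3 upload is left out. The only hard fact it relies on is the sitemaps.org limit of 50,000 URLs per file, which puts ~50,000,000 entries at about 1,000 files.

```python
from itertools import islice
from xml.sax.saxutils import escape

# Per-file URL limit from the sitemaps.org protocol.
MAX_URLS_PER_FILE = 50_000


def chunked(urls, size=MAX_URLS_PER_FILE):
    """Yield successive lists of at most `size` URLs (hypothetical helper)."""
    it = iter(urls)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def render_sitemap(urls):
    """Render one <urlset> document for a batch of URLs (hypothetical helper)."""
    # escape() handles &, < and > so query strings stay valid XML.
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}</urlset>\n"
    )
```

Each rendered string would then be written out as one static file per batch, with whatever naming scheme and upload step the job already uses.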
But more important:
Also, with 50,000,000 URLs: if your site is crawled at about 500,000 pages a day (which is average) or 1M pages a day (which is good), it already takes 50 to 100 days before your whole site is indexed. So it makes sense to communicate only the changed sitemaps to Google, at the exact time they change. Since sitemaps get fetched quite quickly, you improve the chances that a new LP gets crawled and indexed sooner. Whether this makes sense for you depends on how fast your page turnaround is (new pages, updated pages, deleted pages).
(P.S.: in most cases, for most businesses, a (near) real-time sitemap is unnecessary overhead.)
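
To make the numbers and the "only the changed sitemaps" idea concrete, here is a minimal Python sketch. The function names are invented for illustration, and using a sitemap index with `<lastmod>` is just one assumed way of signalling which files changed; it is not a claim about how Google schedules its fetches.

```python
import math
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def days_to_crawl(total_urls, pages_per_day):
    """Days needed to fetch every URL once at a steady crawl rate."""
    return math.ceil(total_urls / pages_per_day)


def changed_since(sitemaps, last_run):
    """Keep only the (url, last_modified) pairs modified after `last_run`."""
    return [(url, mod) for url, mod in sitemaps if mod > last_run]


def render_sitemap_index(sitemaps):
    """Render a <sitemapindex> with a <lastmod> for each sitemap file."""
    entries = "".join(
        "  <sitemap>"
        f"<loc>{escape(url)}</loc>"
        f"<lastmod>{mod.isoformat()}</lastmod>"
        "</sitemap>\n"
        for url, mod in sitemaps
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}</sitemapindex>\n"
    )


print(days_to_crawl(50_000_000, 500_000))    # 100
print(days_to_crawl(50_000_000, 1_000_000))  # 50
```

The timestamps passed in should be timezone-aware (for example `datetime(2024, 1, 2, tzinfo=timezone.utc)`) so that `isoformat()` produces a W3C-style datetime with an offset. `changed_since` gives the short list of files worth announcing after each run.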