by fulmicoton
1871 days ago
A normal search experience (displaying a page of 20 hits) requires
num segments * (1 + num terms * 2) + 20 GET requests. We have 180 segments for our commoncrawl index,
so we can take 1000 requests as a generous upper bound. At that count, GET requests add about $0.0004 to each commoncrawl search.
Storage costs us $5 per day, so GET requests only start to cost more than storage above roughly 10k searches per day ($5 / $0.0004 = 12,500). Our search engine is meant for searching large datasets with a low number of queries: logs, SIEM, e-discovery, exotic big data datasets, etc.
These use cases typically have a low daily query rate. For a high request rate (1 query per second), as in e-commerce, entirely decoupling storage and compute is actually a bad idea.
For a low request rate (< 1000 per day), using S3 without caring about the GET request cost is perfectly fine.
And in the middle, you probably want to use another object storage service with a more favorable pricing model.
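
To make the arithmetic above easy to check, here is a rough sketch in Python. The function names are made up for illustration, the 2-term query is just an example, and the price of $0.0004 per 1,000 GET requests is an assumption chosen to match the figures quoted above; check your provider's current pricing before relying on it.

```python
# Assumption: $0.0004 per 1,000 GET requests, consistent with the numbers above.
GET_PRICE_PER_1000 = 0.0004
# From the comment: storage costs about $5 per day.
STORAGE_COST_PER_DAY = 5.0


def get_requests_per_search(num_segments: int, num_terms: int, num_hits: int = 20) -> int:
    """GET requests for one search: num segments * (1 + num terms * 2) + hits."""
    return num_segments * (1 + num_terms * 2) + num_hits


def get_cost_per_search(num_requests: int) -> float:
    """Dollar cost of the GET requests issued by a single search."""
    return num_requests / 1000 * GET_PRICE_PER_1000


def break_even_searches_per_day(cost_per_search: float) -> float:
    """Daily search count at which GET costs equal the daily storage cost."""
    return STORAGE_COST_PER_DAY / cost_per_search


if __name__ == "__main__":
    # Hypothetical 2-term query against the 180-segment index.
    requests = get_requests_per_search(num_segments=180, num_terms=2)
    print("GET requests for a 2-term query:", requests)

    # Use the generous upper bound of 1000 requests for the cost estimate.
    cost = get_cost_per_search(1000)
    print("cost per search at the upper bound:", cost)
    print("break-even searches per day:", break_even_searches_per_day(cost))
```

With the 1000-request upper bound, the break-even lands at about 12,500 searches per day, which is where the ">10k" figure comes from.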