by smoove 5459 days ago
>>Would it really? I'd like to see some hard data on that.

The process would be this:

-> User submits Regex

-> Google fetches all documents in its database (46 billion documents according to mryan). If we assume 1 KB of data per document (which is probably way too small), Google has just fetched 43869 GB of data

-> Now Google somehow iterates over said 43869 GB (we assume we have a lot of RAM, btw) and checks whether the regex matches any of them

-> Search results are delivered to user (days later?)
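
To make the arithmetic concrete, here is a rough sketch of the steps above. The document count and the 1 KB size come from this comment; the scan rate and machine count are purely hypothetical parameters you have to supply yourself, and the function names are made up for illustration.

```python
import re

DOCS = 46_000_000_000   # document count quoted above (from mryan)
BYTES_PER_DOC = 1024    # the 1 KB-per-document assumption from above


def corpus_gib(docs=DOCS, bytes_per_doc=BYTES_PER_DOC):
    """Total corpus size in GiB (2**30 bytes)."""
    return docs * bytes_per_doc / 2**30


def scan_days(scan_rate_gib_per_s, machines=1):
    """Days for one full pass over the corpus.

    scan_rate_gib_per_s is an assumed per-machine regex throughput,
    not a measured figure.
    """
    seconds = corpus_gib() / (scan_rate_gib_per_s * machines)
    return seconds / 86_400


def regex_search(pattern, documents):
    """Brute-force search: every document is tested, none can be skipped."""
    rx = re.compile(pattern)
    return [i for i, text in enumerate(documents) if rx.search(text)]


if __name__ == "__main__":
    print(int(corpus_gib()))  # 43869, matching the figure above
```

Whatever rate you plug into scan_days, the cost is paid again for every single query, because nothing is precomputed.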

I cannot give you any "hard facts", but the problem is that if you cannot build an index, you have to look at each individual document. And in Google's case the number of documents is just way too high.
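
A toy illustration of the difference, assuming a simple word-level inverted index (this is only a sketch, not a claim about how Google's index actually works; all names here are hypothetical):

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def keyword_lookup(index, word):
    """Answer a keyword query from the index without touching the documents."""
    return sorted(index.get(word.lower(), set()))


def regex_scan(pattern, documents):
    """An arbitrary regex has no entry in this index, so every document is tested."""
    rx = re.compile(pattern)
    return [doc_id for doc_id, text in enumerate(documents) if rx.search(text)]


docs = ["the quick brown fox", "lazy dog", "quick silver"]
index = build_index(docs)
print(keyword_lookup(index, "quick"))    # [0, 2]
print(regex_scan(r"qu.ck\s+s", docs))    # [2]
```

The keyword lookup is a single dictionary access no matter how many documents there are; the regex scan grows linearly with the corpus.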