by rb2k_
5654 days ago
The crawler currently runs on a single large EC2 instance.
I could see myself switching to a bunch of EC2 micro instances instead and using Riak + Riak Search. I actually tried putting a dump of the data into Riak and it seemed to hold up pretty well on my MacBook. Another problem was that Riak didn't allow me to do server-side increments on the "incoming links" counter, which MySQL, MongoDB, and Redis all allow. However, I think this could be solved by using Redis as a caching layer. I have to admit that I would love to use Riak for something just because it seems to be a really slick piece of software, so it's hard to stay objective :)
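A rough sketch of what that Redis caching layer might look like, not what the crawler actually does. Everything here is hypothetical: the `IncomingLinkCounter` class and its `incr`/`flush` methods are made up for illustration, a `Counter` stands in for Redis (where each increment would be a single atomic INCRBY), and a plain dict stands in for Riak. The idea is to absorb the increments in the fast layer and write the totals back in batches.

```python
from collections import Counter


class IncomingLinkCounter:
    """Buffers increments in a fast counter and flushes them in batches.

    `buffer` stands in for Redis (atomic INCRBY per key); `store` stands in
    for a key-value store without server-side increments (Riak, in this case).
    Both are in-memory stand-ins so the sketch runs without any servers.
    """

    def __init__(self, store):
        self.store = store        # plain dict as a stand-in for the KV store
        self.buffer = Counter()   # stand-in for Redis counters

    def incr(self, url, amount=1):
        # With Redis this would be one atomic INCRBY on the key for `url`.
        self.buffer[url] += amount

    def flush(self):
        # One read-modify-write per URL against the store,
        # instead of one per discovered link.
        for url, delta in self.buffer.items():
            self.store[url] = self.store.get(url, 0) + delta
        self.buffer.clear()


store = {"example.com": 10}
counter = IncomingLinkCounter(store)
for url in ["example.com", "example.org", "example.com"]:
    counter.incr(url)
counter.flush()
```

The flush is still a read-modify-write, so this only works cleanly if a single process does the flushing; with several writers the store would need some form of conflict resolution.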