by joeblau
3277 days ago
This was the key to our data analytics URL de-duping platform back in 2011. We were pulling in 50k social media messages an hour, and there were lots of duplicate links running through our pipeline. We had a 100GB Bloom filter backed by Redis to keep track of every link that came through our system, and it worked beautifully.
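
For anyone who wants to see the shape of the idea, here is a minimal Python sketch. It is not the original system: the `BloomFilter` class, the `dedupe` helper, the double-hashing scheme, and the sizes are all illustrative assumptions, and a plain in-memory `bytearray` stands in for Redis. A Redis-backed version would presumably set and test the same bit positions with the `SETBIT` and `GETBIT` commands instead.

```python
import hashlib


class BloomFilter:
    """In-memory Bloom filter; the bytearray is a stand-in for a Redis bitmap."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing (an assumption, not the original scheme):
        # derive k bit positions from two 64-bit slices of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> bool:
        """Set the item's bits; return True if it was possibly seen before."""
        seen = True
        for pos in self._positions(item):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def dedupe(urls, bloom):
    """Yield only URLs the filter has not seen yet."""
    for url in urls:
        if not bloom.add(url):
            yield url


if __name__ == "__main__":
    bloom = BloomFilter(num_bits=1 << 20, num_hashes=7)
    stream = ["http://a.example/1", "http://a.example/2", "http://a.example/1"]
    print(list(dedupe(stream, bloom)))
```

The trade-off is the usual one for Bloom filters: there are no false negatives, so a repeated link is always caught, but an occasional new link can be mistaken for a duplicate and dropped. Sizing the bit array and the number of hashes controls how often that happens.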