What representation are you using for each site (e.g., a sparse vector or the full text)? And how do you compare newly visited sites to old ones?