by alangibson
1567 days ago
> I guess blogs that are linked-to in non-killed HN comments should probably be crawled a bit

They are, but there are relatively few of them because my only page content source is the Common Crawl. The hit rate against the total set of URLs I'm interested in is not great. I expect to fix this soon. I'm also not indexing entire sites, only specific upvoted URLs. This will change as well.

> Have you considered using social user karma (this could be a 1-10 score uniquely calculated for users of each of HN, Twitter, Reddit as long as it's built in a modular way) as a weight in a PageRank style schema?

Definitely. I've already started in on calculating a rank coefficient for submitters, but it's not completely clear yet how best to use it.

> Here's how I am going to evaluate your search engine

Feel free to dump more of these. Some solid test cases would be very helpful.
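For anyone wondering what "karma as a weight in a PageRank style schema" might look like in practice, here is a rough sketch of one option: use the submitter score to bias the teleport (restart) distribution, as in personalized PageRank. This is not how the engine described above works. The function name `karma_weighted_pagerank`, the `links` and `karma` dictionaries, and the default score of 1 for URLs with no known submitter are all assumptions made for illustration.

```python
def karma_weighted_pagerank(links, karma, damping=0.85, iterations=50):
    """Power-iteration PageRank with a karma-biased teleport vector.

    links: dict mapping a URL to the list of URLs it links to (hypothetical input).
    karma: dict mapping a URL to its submitter's score, e.g. 1-10 (hypothetical input).
    Returns a dict mapping each URL to a rank; the ranks sum to 1.
    """
    # Collect every URL that appears as a source or a target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})

    # Teleport probability is proportional to submitter karma.
    # URLs without a known score get 1.0 (an assumed default).
    weights = {n: float(karma.get(n, 1.0)) for n in nodes}
    total = sum(weights.values())
    teleport = {n: w / total for n, w in weights.items()}

    rank = dict(teleport)
    for _ in range(iterations):
        # Every node receives its share of the random-jump mass.
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        dangling = 0.0
        for n in nodes:
            outs = links.get(n, [])
            if outs:
                share = damping * rank[n] / len(outs)
                for target in outs:
                    new_rank[target] += share
            else:
                # Pages with no outgoing links hand their mass back
                # to the teleport distribution.
                dangling += damping * rank[n]
        for n in nodes:
            new_rank[n] += dangling * teleport[n]
        rank = new_rank
    return rank


if __name__ == "__main__":
    example_links = {"a": ["b"], "b": ["a"]}
    example_karma = {"a": 10, "b": 1}
    print(karma_weighted_pagerank(example_links, example_karma))
```

With this design the link graph still does most of the work, and karma only tilts where the random surfer restarts. Because each site's score is reduced to a single number per URL before it enters the function, the per-site calculation (HN, Twitter, Reddit) can stay modular.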