One that is based on analyzing the content of a page rather than on its PageRank.
It goes without saying that it has to be open source.
Apache Solr would be a good starting point.
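
To give a rough idea of what ranking by page content (as opposed to link popularity) can look like, here is a minimal TF-IDF sketch in Python. It is only an illustration and not how Solr scores documents internally; the function names `tokenize` and `rank` and the sample pages are made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by a simple TF-IDF score, best first.

    `docs` is a hypothetical mapping of doc_id -> page text.
    """
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # term not on this page, contributes nothing
            # Smoothed IDF: rare terms weigh more, common terms still count a little.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            score += (tf[term] / len(tokens)) * idf
        scores[doc_id] = score

    # Only the text of each page influences the order; no link data is used.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    pages = {
        "a": "open source search engine built on Lucene",
        "b": "cooking recipes for pasta",
    }
    for doc_id, score in rank("open source search", pages):
        print(doc_id, round(score, 3))
```

A real engine adds a lot on top of this (stemming, field weights, length normalization, an inverted index), which is exactly the part a project like Solr already provides.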