	by mskar 471 days ago
	We measured PaperQA2 (https://github.com/Future-House/paper-qa) against the science portion of the RAG-Arena benchmark (https://arxiv.org/abs/2407.13998). This is the first time we've compared PaperQA2 against other systems based on Cohere or Contextual.ai. PaperQA2 scores 12.4% higher than Contextual.ai on the same dataset (1,404 questions and 1.7M documents). We're thrilled about this because PaperQA2 is open source and getting better every day. Check out the code to reproduce this result in our cookbook here: https://futurehouse.gitbook.io/futurehouse-cookbook/paperqa/....