Great repo, glad y'all are looking into this. So am I reading correctly that Intel has a 7B model that doesn't remarkably well with not hallucinating??
Some people are surprised by smaller models having the ability to outperform bigger models, but it's something we've been able to exploit: if you fine tune a small model for a specific task (e.g. reduced hallucinations on a summarization task) as Intel has done, you can achieve great performance economically.
Some people are surprised by smaller models having the ability to outperform bigger models, but it's something we've been able to exploit: if you fine tune a small model for a specific task (e.g. reduced hallucinations on a summarization task) as Intel has done, you can achieve great performance economically.