Hacker News
by
esperent
848 days ago
GPT3.5 and GPT4 are not the only options though, right? I don't follow that closely, but there must be other models with longer context lengths that are roughly GPT3.5 quality by now, and they probably even use the same API.
2 comments
wkat4242
847 days ago
I don't really know. The benefit of ChatGPT is that it's so big, there are so many nice APIs for it :)
I'm not so deep into it all.
ajcp
848 days ago
Mixtral 8x7b can handle a context of ~32,000 tokens pretty comfortably, and it benchmarks at or above GPT3.5.
ComputerGuru
847 days ago
Is that the sliding context window size? Because I didn't have good results with sliding context windows in the regular Mistral models.
ajcp
847 days ago
Yeah, I think they fine-tune without a specific window size target and then keep expanding the context until it starts falling over.