by esperent 848 days ago
GPT3.5 and GPT4 are not the only options though, right? I don't follow this that closely, but by now there must be other models with longer context lengths that are roughly GPT3.5 quality, and they probably even use the same API.

I don't really know. The benefit of ChatGPT is that it's so big, there are so many nice APIs for it :)

I'm not so deep into it all.

Mistral's Mixtral 8x7B can handle a context of ~32,000 tokens pretty comfortably, and it benchmarks at or above GPT3.5.
Is that the sliding context window size? Because I didn't have good results with sliding context windows in the regular Mistral models.
Yeah, I think they fine-tune without a specific target window size and then keep expanding the context until it starts falling over.
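
For anyone unsure what a "sliding context window" means in this thread, here is a toy sketch of a causal sliding-window attention mask. The function name and the tiny sizes are made up for illustration, and this is not a claim about how any particular Mistral model is configured: each token may attend only to itself and the few tokens just before it, rather than to the whole sequence.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True when token i may attend to token j.

    Hypothetical helper for illustration: token i sees itself and the
    (window - 1) tokens immediately before it, and nothing after it.
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Illustrative sizes only; real models use windows of thousands of tokens.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Anything further back than the window is reachable only indirectly through stacked layers, which may be part of why long-range recall can feel weaker than with a full context window.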