by simonw 877 days ago
Which Yi 34B finetunes are you using that have a 75,000 token length?

All of the Yi 200K finetunes should support it, but you have to be careful because some degrade the base model's quite excellent long context performance more than others. The very strong Bagel 34B DPO model, for instance, basically doesn't work at long context.

Nous Capybara is a popular one. I personally use my own merge of many models, and you can look through the constituent models to see if any interest you: https://huggingface.co/brucethemoose/Yi-34B-200K-DARE-megame...

You can't really use llama.cpp for super long context btw, it's just too slow and VRAM-inefficient at the moment.
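
For a rough sense of why VRAM gets tight at these lengths in any backend, here is a back-of-the-envelope estimate of the KV cache alone at 75,000 tokens. It is a sketch, not a measurement of llama.cpp: the layer count, KV head count, and head dimension are what I recall from Yi-34B's published config and should be treated as assumptions (check the model's config.json), and `kv_cache_bytes` is just an illustrative helper name.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Size of the attention KV cache for one sequence, in bytes."""
    # Each layer stores two tensors (K and V), each holding
    # n_kv_heads * head_dim elements per token.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens


# Assumed Yi-34B-style values; verify against the model's config.json.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 60, 8, 128
N_TOKENS = 75_000

fp16 = kv_cache_bytes(N_TOKENS, N_LAYERS, N_KV_HEADS, HEAD_DIM, bytes_per_elem=2)
int8 = kv_cache_bytes(N_TOKENS, N_LAYERS, N_KV_HEADS, HEAD_DIM, bytes_per_elem=1)

print(f"16-bit cache: {fp16 / 2**30:.1f} GiB")
print(f"8-bit cache:  {int8 / 2**30:.1f} GiB")
```

Under those assumptions the 16-bit cache comes to about 17 GiB on top of the weights, which is why a backend that can store the cache at lower precision matters so much at this length.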