Hacker News
by speedgoose 980 days ago
Llama 2 models are the standard choice these days, but that can change fast.

Mistral is a strong recent contender, but so far they have only released a "small" large language model.

Depending on your tasks, some BERT models can give better results too.

GPT-4 is supposedly very large, but we don't know much about it. It isn't open, and it's only available as a hosted service through OpenAI or Microsoft Azure.

If you don't plan to do research on the LLMs themselves, and instead want to do applied research with LLMs, GPT-4 through the OpenAI API may be a good start. And perhaps don't buy costly lab equipment yet: you can experiment and gain experience with the small quantised models first.
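
To give a rough sense of why small quantised models fit on modest hardware, here is a back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight), so treat the numbers as a lower bound: activations, the KV cache, and the per-block scale factors that real quantisation formats store are all ignored. The function names are made up for illustration, and 7B/13B/70B are the nominal Llama 2 sizes.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# Nominal parameter counts for the Llama 2 family (assumed round numbers).
MODEL_SIZES = {"7B": 7e9, "13B": 13e9, "70B": 70e9}


def footprint_table(bit_widths=(16, 8, 4)):
    """Map each model size to its estimated weight memory per bit width."""
    return {
        name: {bits: round(weight_memory_gib(n, bits), 1) for bits in bit_widths}
        for name, n in MODEL_SIZES.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        print(name, row)
```

By this estimate a 7B model at 4 bits needs a little over 3 GiB for its weights, which is laptop territory, while a 70B model at 16 bits needs well over 100 GiB.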