by speedgoose 983 days ago
Hi, it depends on your lab.

Since you can afford a few A100/H800s, it sounds like you have a decent budget. Do the maths first, though: it may be cheaper to rent.
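To make "do the maths" concrete, here is a rough break-even sketch. Every number in it is a made-up placeholder, not a real quote, and the function name `break_even_hours` is just for illustration. Plug in your own purchase price, rental rate, and running costs (power, cooling, hosting).

```python
def break_even_hours(purchase_cost, rental_rate_per_hour, running_cost_per_hour=0.0):
    """Hours of GPU use after which buying becomes cheaper than renting.

    All inputs share one currency. running_cost_per_hour is what an owned
    GPU costs you per hour of use (power, cooling, hosting).
    """
    saving_per_hour = rental_rate_per_hour - running_cost_per_hour
    if saving_per_hour <= 0:
        # Owning costs at least as much per hour as renting: never breaks even.
        return float("inf")
    return purchase_cost / saving_per_hour


# Placeholder figures only; replace them with real quotes.
hours = break_even_hours(
    purchase_cost=20_000,        # hypothetical price of one GPU
    rental_rate_per_hour=2.5,    # hypothetical cloud rate
    running_cost_per_hour=0.5,   # hypothetical power and hosting
)
utilisation = 0.5  # fraction of the time the GPU is actually busy
years = hours / (utilisation * 24 * 365)
print(f"Break-even after {hours:.0f} GPU-hours, about {years:.1f} years at {utilisation:.0%} utilisation")
```

The utilisation figure matters most: a GPU that sits idle most of the week takes far longer to pay for itself.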

You may also have access to a large-scale cluster through your lab. If you are located in Europe and work on European research projects, you may be able to use one of the EuroHPC supercomputers, for example.

https://eurohpc-ju.europa.eu/supercomputers/our-supercompute...


Thanks so much for the info!

Another thing is that I'm a little confused about all the different kinds of "LLMs". I looked into Hugging Face, but there are so many models, and they vary in size and structure. Which model would be a good start? I guess GPT-4 is too large to handle in a lab environment (sorry, I don't really know much about LLMs :).

Llama 2 models are the standard models to use these days. But that can change fast.

Mistral is a recent strong contender, but they have only released a "small" large language model so far.

Depending on your tasks, some BERT models can give better results too.

GPT-4 is supposedly very large, but we don't know much about it. It's not open, and it's only available as a hosted service from OpenAI or Microsoft Azure.

If you don't plan to do research on the LLMs themselves, but rather applied research with LLMs, GPT-4 through the OpenAI API may be a good start. And perhaps don't buy costly lab equipment yet; you can experiment and gain experience with small quantised models first.
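As a rough guide to what "small quantised" buys you, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes memory is simply parameter count times bits per weight, and it ignores activations, the KV cache, and framework overhead, so treat the results as lower bounds. The function name `weight_memory_gb` is made up for this example.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Llama 2 was released in 7B, 13B and 70B parameter sizes.
for n_params in (7e9, 13e9, 70e9):
    fp16 = weight_memory_gb(n_params, 16)  # half-precision weights
    int4 = weight_memory_gb(n_params, 4)   # 4-bit quantised weights
    print(f"{n_params / 1e9:.0f}B params: ~{fp16:.1f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

By this estimate a 4-bit 7B model needs only a few GB for its weights, which is why it is a cheap way to get started before committing to datacentre GPUs.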