by bhelkey, 512 days ago
Have you tried Ollama [1]? You should be able to run an 8B model in RAM and a 1B model in VRAM.
[1] https://news.ycombinator.com/item?id=42069453
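For a rough sanity check on the 8B-in-RAM and 1B-in-VRAM suggestion, here is a back-of-the-envelope sketch of how much memory the weights alone might take. The function name and the bit widths (16-bit and 4-bit) are my own assumptions, not something from the comment. The estimate ignores the KV cache, context length, and runtime overhead, so real usage will be higher.

```python
def estimate_weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Hypothetical helper for illustration: it counts only the weights and
    ignores the KV cache, activations, and any runtime overhead.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    # Parameter counts are nominal ("8B" = 8e9, "1B" = 1e9); real models differ slightly.
    for label, num_params in [("8B", 8e9), ("1B", 1e9)]:
        # 16 bits = unquantized half precision, 4 bits = an assumed 4-bit quantization.
        for bits in (16, 4):
            gib = estimate_weight_memory_gib(num_params, bits)
            print(f"{label} model at {bits}-bit: ~{gib:.1f} GiB for weights")
```

Under these assumptions, a 4-bit 8B model needs a bit under 4 GiB for weights and a 4-bit 1B model roughly half a GiB, which is why the first plausibly fits in system RAM and the second in a small amount of VRAM.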