by ghoomketu
906 days ago
I recently downloaded Ollama on my Linux machine, and even with a 3060 12 GB GPU and 24 GB of RAM I'm unable to run Mistral or Dolphin; I always get an out-of-memory error. So it's amazing that these companies are able to scale these models so well, handling thousands of requests per minute. I wish they would do a behind-the-scenes on how much money, time, and optimisation goes into making this all work. Also, big fan of Anyscale. Their pricing is just phenomenal for running models like Mixtral. Not sure how they are so affordable.