Hacker News
by hanselot 1135 days ago
1 - Yes, I agree on this, but even so, most developers are already investing in SOTA GPUs for other reasons (so it is not as much of a barrier as purported).

2 - Is scaling not a problem in other industries too? If you want to scale your food truck, you will need more food trucks. This doesn't really do anything for your point.

GGML and GPTQ have already revolutionised the situation, and there are now tiny models with insane quality that can run on a conventional CPU.

I don't think you have any idea what is happening around you, and this is not me being nasty. Just go and take a look at how exponential this development is and you will realise that you need to get in on it before it's too late.

1 comments

You seem to be in a very particular bubble if you think most developers can trivially afford high-end GPUs and are already investing in SOTA GPUs. I know a lot of devs from a wide spectrum of industries and regions, and I can think of only one person who might be in your suggested demographic.

Perhaps I should clarify that when I say SOTA GPU, I mean an RTX 3060 (midrange), which has 12 GB of VRAM and is a good starting point for getting into the LLM market. I have been playing with LLMs for months now, and for large periods of that time I had no access to a GPU due to daily scheduled rolling blackouts in our country.

Even so, I am able to produce insane results locally with open source efforts on my RTX 3060, and I am starting to feel confident enough to take this to the next level by using either the cloud (computerender.com for images) or something like vast.ai to run my inference (or even training, if I spend more time learning). If that goes well, I will feel confident going to the next step, which is getting an actual SOTA GPU. But that will only happen once I am sufficiently confident that the investment will be worthwhile. Regardless, apologies for suggesting the RTX 3060 is SOTA, but to me, in a third-world country, being able to run vicuna13b entirely on my 3060 with reasonable inference rates is revolutionary.
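
For anyone wondering why a 13B model on a 12 GB card is plausible at all, here is a rough back-of-the-envelope sketch. It only counts the weights, assumes a nominal 13 billion parameters, and ignores the KV cache, activations, and the per-group scale data that quantized formats carry, so real usage will be somewhat higher. The function name and the list of precisions are just illustrative choices.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


VRAM_GB = 12      # RTX 3060
N_PARAMS = 13e9   # nominal parameter count of a "13B" model (assumption)

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(N_PARAMS, bits)
    verdict = "fits" if gb < VRAM_GB else "does not fit"
    print(f"{label}: ~{gb:.1f} GB of weights, {verdict} in {VRAM_GB} GB")
```

By this estimate, only the 4-bit case leaves headroom on a 12 GB card, which is roughly why quantization matters so much here.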