Hacker News
by SimianSci 5 hours ago
There is an understandable gap between the capabilities of closed models and those of open models. The difference currently shows up mainly in the cost of the hardware needed to run a comparable model. A single higher-end graphics card in an average gaming computer can run small to medium open models that compare well with their lab-born counterparts in the same size range. But the heavyweight models are still out of reach for all but the most well-funded individuals.

However, I would strongly suggest that more people experiment with these smaller models. They are incredibly capable in ways that many people don't realize.

The perceived capabilities of the larger models are also much less a result of the models having more parameters or training cycles than of their being run through well-made harnesses, something the open-source community is rapidly approaching with near-peer solutions of its own.

In short, much of the gap between open-weight models and the larger proprietary models is more an issue of perception than of capability. There is a fundamental gap economically, but not so much in capability. The open-source community is rapidly closing the gap with the larger labs, especially thanks to the amazing research being openly released by well-funded Chinese labs.