You (1) are a company who (2) understands the business domain and has an appropriate business plan.
Sadly the reality of funding today makes it unlikely that these two will both be simultaneously satisfied. The problem is that history will look back on the necessary business plan and deem it a failure even if it generates a company that does a billion dollars plus in annual revenue.
This is actually not unique to large language models; it applies to most innovation around computers. The basic problem is that if you build a force-multiplier (spreadsheets, personal computing, large language models all come to mind) then what will make it succeed is its versatility: people want a hammer that can be used for smashing all manner of things, not just your company's particular brand of matching nails. And most people will only pick up that hammer once per week or once per month. Only like 1% of the economy, if that, will be totally revolutionized, "we use this force-multiplier every day, it is now indispensable, we can't imagine life without it," and it's never predictable what that sector will be -- it's going to be like "oh, who ever dreamed that the killer application for LLMs would be them replacing AutoCAD at mechanical contractors" or some shit.
In those strange eons, to wildly succeed, one must give up on anticipating all usages of the software, one must cease controlling it and set it free. "Well where's the profit in that?" -- it is that this company was one of the first players in the overall market, they got an early chance to stake out as much territory as possible. But the market exploded way larger than they could handle and then everybody looks back on them and says "wow, what a failure, they only captured 1% of that market, they could have been so much more successful." Yeah, they captured 1% of a $100B market, some failure, right?
But what actually happens is that companies see the potential, investors get dollar signs in their eyes, and everyone starts to lock down and control these: "you may use large language models but only in the ways that we say, through the interfaces which we provide." Then the only thing that you can use it for is to get generic conversational advice about your hemorrhoids, so after 5-10 years the bubble of excitement fizzles out. Nobody ever dreams of applying it to AutoCAD or whatever, and the world remains unchanged.
History is littered with great software that died because no-one used it because the business model was terrible. Capturing $1B of value is better than 0, and everyone understands this. And who cares what history thinks anyway?
OpenAI has spent a lot of money to get their result.
It's safe to assume it will take a lot of money to get a similar result, and then to share it (although I assume BitTorrent will be good enough). Once people are running their models, they can innovate to their heart's content. It's not clear how or why they'd give money back to the enabling technology. So how does money flow back to the innovators in proportion to the value produced, if not a SaaS?
If those are all that's required, why don't you start a company with a business plan written so it satisfies your criteria? Then you can lead the way with OSS LLMs.
Yes a rugged individual would have to be incredibly wealthy to do it!
But maybe governments will make one and maintain it with taxes as an infrastructure service, like roads, giving everyone expanded powers of cognition, memory, and expertise, and raising the consciousness of humanity to new heights. Probably in the USA it wouldn't happen if we judge ourselves only in zero-sum relation to others - helping everyone would be a wash and only waste our money!
The US spends more on its citizens than almost any other country, and more on helping other countries than any other country.
The problem with making something nationalised or a utility is you'd better have made sure there's no innovation needed or risk required. Once that's all settled, then maybe consider it.
Model training is much harder though, because it requires a HUGE amount of high bandwidth data exchange between the machines doing the training - way more than is feasible to send over anything other than a local network connection.
This is the type of task where if you want to pool resources, it would be more efficient to pool dollars and buy compute power rather than pool compute power - I'd assume that even if you treat the decentralized hardware as free, just the extra electricity cost of using it is more expensive than renting a centralized server which can do it efficiently.
My own experience with this was a distributed ray tracer where the server sent the full model to the machines and then each machine would ask for one scan line to do, report back, and then ask for another scan line and repeated.
There was no interaction between the machines - what was on one scan line didn't need any coordination with what was on another scan line.
Likewise, with SETI@home, the server could give you a chunk of data and you could analyze that chunk - the contents of another chunk of data didn't change the analysis being done on this one.
Furthermore, these can be done asynchronously and then assembled when everything is done. Only the very final product / analysis / artifact needs all of the data and nothing other than the end process is waiting on any sub process.
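To make that concrete, here's a minimal sketch of the "ask for a scan line, report back, ask for another" pattern. It's not the original ray tracer: `render_scanline`, the image size, and the pixel formula are all made-up stand-ins, and a local thread pool plays the role of the remote machines.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 8, 4  # hypothetical tiny image, just for illustration


def render_scanline(y):
    # Stand-in for real ray tracing: each pixel depends only on (x, y),
    # so no row ever needs data from any other row.
    return [(x * y) % 256 for x in range(WIDTH)]


def render_image(workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker takes the next unrendered row as soon as it is free.
        # pool.map returns results in row order, whichever finishes first.
        return list(pool.map(render_scanline, range(HEIGHT)))


image = render_image()
```

The only point where everything has to come together is the final `list(...)`. A slow worker delays its own rows and nothing else.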
> We are now ready for the second stage. In this stage, we broadcast the next column (mod n) of A across the processes and shift-up (mod n) the B values.
Note that use of "broadcast" - the matrix multiplication is limited by the speed of the slowest node, and every stage needs to send data from the previous calculation to all the nodes, making it difficult to use across a network that experiences latency.
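Here's a rough single-machine simulation of that broadcast-and-shift scheme (the one-element-per-process version of what is usually called Fox's algorithm), mostly to show how much talking happens. The function name and the `messages` counter are my own; a real implementation would use something like MPI rather than nested loops.

```python
def fox_multiply(A, B):
    """Simulate an n x n process grid where process (i, j) holds one
    element of A, B and C. Returns the product and a message count."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    b = [row[:] for row in B]  # b[i][j]: B value currently held by process (i, j)
    messages = 0
    for stage in range(n):
        for i in range(n):
            k = (i + stage) % n
            a = A[i][k]          # process (i, k) broadcasts its A value along row i
            messages += n - 1    # every other process in the row must receive it
            for j in range(n):
                C[i][j] += a * b[i][j]
        # Shift B up (mod n): process (i, j) receives from process (i + 1, j).
        b = [b[(i + 1) % n] for i in range(n)]
        messages += n * n
    return C, messages


C, messages = fox_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Unlike the scan-line case, no process can start a stage until the broadcast and the shift from the previous stage have arrived, so one laggy link stalls the whole grid.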
When doing ML training, they use on the order of TB/sec of bandwidth... and the high-end extremes are in PB/sec ( https://www.cerebras.net/product-chip/ ) ... and I'm sitting here watching Steam download.
The inefficiencies of the network, slow computers, and the amount of data transfer needed to perform the next calculation make network-distributed machine learning "not a good choice" at this time.
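A back-of-envelope sketch of the gap, with loudly made-up numbers: the 10 GB per-step payload and the 100 Mbit/s home link are assumptions for illustration only, not measurements of any real training run. The 1 TB/sec figure is just the order of magnitude mentioned above.

```python
def transfer_seconds(payload_bytes, bytes_per_second):
    # Pure bandwidth time; ignores latency, protocol overhead and retries.
    return payload_bytes / bytes_per_second


PAYLOAD_BYTES = 10e9       # hypothetical: 10 GB exchanged per training step
HOME_LINK = 100e6 / 8      # hypothetical: 100 Mbit/s home connection, in bytes/sec
DATACENTER_LINK = 1e12     # 1 TB/sec, the order of magnitude quoted above

home = transfer_seconds(PAYLOAD_BYTES, HOME_LINK)
datacenter = transfer_seconds(PAYLOAD_BYTES, DATACENTER_LINK)
slowdown = home / datacenter

print(f"home: {home:.0f} s, datacenter: {datacenter:.2f} s, ratio: {slowdown:.0f}x")
```

Under those assumptions a single exchange takes minutes at home versus a hundredth of a second in the datacenter, and that's before the slowest node gets a say.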