| (not at all)OpenAI has a very short time window to monetize and/or lock in their product users. Currently the biggest model one can feasibly run on a desktop PC with, let's say, a previous-gen GPU, 32 GB of RAM, and 2x fast NVMe drives is approximately 7B parameters. Models comparable in performance to ChatGPT are 45B and bigger. In theory it could be possible to run a model like this, but one would wait 5~10 minutes for every answer. Now, consider that those models are going to get optimised and hardware will get better. In a few years' time you'll be able to run such a model on your PC, and in a few more on your smartphone. What is (not at all)OpenAI going to do? They have to beat the AI safety drum as much as possible, hoping they manage to curtail the democratisation of access to big AIs via legal means. At the same time, due to a lack of proper software, Nvidia is the only game in town for anyone wanting to do inference at home, and they're already applying monopoly-level profit margins (50%?) to their products. When was the last time you saw a Google TPU for sale? I've got my hands on their "edge" TPU. It's nice for things like CCTV object recognition and similar small tasks. I've managed to build a nice 1U CCTV server using it that consumes 30 W on average. But I'd like the big version now. I bet that the moment alternative frameworks with good optimisations for both Nvidia and non-Nvidia hardware start to gain ground, it will suddenly become a lot more difficult for normal people to purchase Nvidia cards. They openly say at every keynote that they want to "rent you everything". This is the biggest battle (except actual physical wars against autocracies) that we have to win in the next 50 years to retain the freedom we realistically have in democratic countries. If we allow intelligence (AI) to centralise and be subject to centralised control, it'll be game over. The entire global society will be steered as a whole by one "prompt engineer". |
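For a rough sense of why 7B fits on a desktop and 45B does not, here is a back-of-the-envelope sketch in Python. It counts only the memory needed to hold the weights, and the precisions (16, 8, and 4 bits per parameter) are my assumption, not something the comment states. Activations, the KV cache, and runtime overhead are ignored, so treat the numbers as a lower bound.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Memory needed just to store the weights, in GiB.

    Ignores activations, KV cache, and framework overhead.
    """
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / 2**30


if __name__ == "__main__":
    # Model sizes taken from the comment above; precisions are illustrative.
    for name, n_params in [("7B", 7e9), ("45B", 45e9)]:
        for bits in (16, 8, 4):
            gib = weight_memory_gib(n_params, bits)
            print(f"{name} at {bits}-bit: {gib:.1f} GiB")
```

At 16-bit, the 45B model's weights alone are well past 32 GB of RAM, which is consistent with having to stream from NVMe and wait minutes per answer.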
The limiter right now is compute.
If we get to what you’re saying, OpenAI will be developing multimodal models with increasingly large d_model and n_ctx to do a better job of analyzing images, sounds, and videos.
People think ChatGPT is magic now, wait until you can input a picture, sound, and/or video as a prompt…
Don’t fall into the “Information Superhighway” trap of thinking the year 2 form is the final form.
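To put the d_model and n_ctx remark in rough numbers, here is a small Python sketch. It uses the common rule of thumb that a decoder-only transformer has about 12 * n_layer * d_model^2 non-embedding parameters, and that the attention score matrix has n_ctx^2 entries per head per layer. The function names are made up for illustration, and the example configuration (96 layers, d_model of 12288) is the published GPT-3 one, not anything known about ChatGPT.

```python
def approx_params(n_layer: int, d_model: int) -> int:
    """Rule-of-thumb non-embedding parameter count: 12 * n_layer * d_model^2."""
    return 12 * n_layer * d_model**2


def attention_score_entries(n_ctx: int) -> int:
    """Entries in one attention score matrix (per head, per layer)."""
    return n_ctx**2


if __name__ == "__main__":
    # GPT-3-like configuration, used here only as an illustration.
    print(f"params: {approx_params(96, 12288) / 1e9:.0f}B")

    # Doubling d_model roughly quadruples the parameter count.
    print(approx_params(96, 2 * 12288) / approx_params(96, 12288))

    # Doubling n_ctx quadruples the attention score matrix.
    print(attention_score_entries(4096) / attention_score_entries(2048))
```

Both knobs scale quadratically, which is why compute is the limiter.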