You can create anything you need, as long as what you need is a disposable script, a scotch-taped-together single-page app, or a complex problem and you have thousands of dollars to throw at tokens.
I've been playing with local models for some time, and I've been pleasantly surprised of late. A meager RTX 5080 with 16 GB can give pretty good results now. The ecosystem is also improving pretty quickly.
I have a feeling at some point we will have a "Windows 95" moment (when computing really became personal for the masses) in AI, and things will significantly change shape again.
The answer to "which AI model?" in mid-2026 is always Qwen. Depending on your RAM, it’s qwen3.5-9b, qwen3.6-35b-a3 in a 3- or 4-bit quant, or qwen3.6-27b. I’m told a bigger model quantized is better than a smaller model unquantized. In 16 GB of VRAM on 10-year-old hardware I can run a 3-bit quant of qwen3.6-35b-a3 at ~30 tokens/sec, and it can do a lot.
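For anyone wondering why the 3-bit quant is the one that fits, here is a rough back-of-envelope sketch. It assumes weight size is roughly parameters × bits per weight / 8, uses the nominal parameter counts from the model names, and reserves a made-up `headroom_gib` of 2 GiB for KV cache and runtime overhead. Real quant formats mix bit widths and real overhead varies with context length, so treat the numbers as ballpark only.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB (ignores KV cache and overhead)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, headroom_gib: float = 2.0) -> bool:
    """True if the weights plus an assumed headroom fit in the given VRAM.

    headroom_gib is a guess for KV cache and runtime buffers, not a measured value.
    """
    return weight_gib(params_billion, bits_per_weight) + headroom_gib <= vram_gib


if __name__ == "__main__":
    # (label, nominal params in billions, bits per weight)
    candidates = [
        ("9b unquantized (16-bit)", 9, 16),
        ("27b at 4-bit", 27, 4),
        ("35b at 4-bit", 35, 4),
        ("35b at 3-bit", 35, 3),
    ]
    for label, params, bits in candidates:
        size = weight_gib(params, bits)
        verdict = "fits" if fits(params, bits, vram_gib=16) else "does not fit"
        print(f"{label}: ~{size:.1f} GiB of weights, {verdict} in 16 GiB")
```

Under these assumptions the 35b model at 3 bits squeezes into 16 GiB while the 9b model at full 16-bit precision does not, which is the practical side of the "bigger model quantized" advice.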