by zozbot234
65 days ago
On-device agentic use is orders of magnitude harder than simple chatting (which is itself still slow with SOTA models): it burns a huge amount of context and tokens on reading code and reasoning through it. It's sort of viable if you just set it to work overnight on some completely vibe-coded stuff, but the results are very middling. Giving feedback to the model interactively is completely out of the question. Where open models can make a difference for agentic use is with third-party inference at scale, which can actually be fast enough for reasonable workflows.