It doesn’t really come as a surprise to me that these companies are struggling to reliably fix issues with software which relies on a central component which is nondeterministic.
I've noticed a lack of product cohesion in general and it does make me wonder if it's a result of dogfooding AI.
For example, chat, cowork and code have no overlap - projects created in one of the modes are not available in another and can't be shared.
As another example, using Claude with one of their hosted environments has a nice integration with GitHub on the desktop, but some of it also requires 'gh' to be installed and authenticated, and you don't have that available without configuring a workaround and sharing a PAT. It doesn't use the GH connector for everything. Switch to remote-control (ideal on Windows/WSL) or local and that deep integration is gone and you're back to prompting the model to commit and push and the UI isn't integrated the same.
Cowork will absolutely blow through your quota for one task but chat and code will give you much more breathing room.
Projects in Code are based on repos whereas in Chat and Cowork they are stateful entities. You can't attach a repo to a cowork project or attach external knowledge to a code project (and maybe you want that because creating a design doc or doing research isn't a programming task or whatever)
Use Claude Code on the CLI and you can't provide inline comments on a plan. There is a technical limitation there I suppose.
The desktop app is very nice and evolving but it's not a single coherent offering even within the same mode of operation. And I think that's something that is easy to do if you're getting AI to build shit in a silo.
Even a distributed or silo'd org chart has some affinity across the hierarchy in order to keep things in overall alignment. You wouldn't expect to use a product suite that is, holistically, not fully compatible with its own ecosystem, even down to not having a single concept of a project. Or requiring a CLI tool in an ephemeral environment that you cannot easily configure.
That's clearly a trade-off that Anthropic have accepted but it makes for a disappointing UX. Which is a shame because Claude Desktop could easily become a hands-off IDE if it nailed things down better.
And the multiple concepts of subscriptions for products, and the idea of MCPs/connectors that arent shared between the different modalities, and the idea of api key vs subscription, and two different inbound websites (claude.ai and claude.com)...
Agreed. I use the Claude desktop app almost every day, and have used Code and Cowork since their respective launch dates, and even I still have a really hard time grokking what each is for. It becomes even more confusing when you enable the (Anthropic-provided) filesystem extension for Chat mode. Anthropic really needs to streamline this.
YES! I thought it was just me being a bit scattered. But uploading an important file to a project only to have it not there because....<garbled answer from Claude> is distracting to say the least. I don't know what I've enabled offhand but I hate having to stop and try to work out why Claude can't reference a file uploaded to the project in a chat within that project. I think they should pause on all the wild aspirations and devote some time to fundamentals.
Add to that that notion mcp works for the chat but not code. now my workflow has docs I comment with others in notion, while the actual work and source of truth is in GitHub.
Need to fall back to codex to keep things in sync, but that's a great opportunity to also make sure I can compare how things run - and it catches a lot of issues with Claude Code and is great at fixing small/medium issues.
Absolutely its dogfooding AI and vibing huge features on the house of cards. Its a fucking mess, and the product design is simultaneously confusing and infuriating. But the product is useful and Im more productive with it than without it now.
Well, the fun part is that the algorithms themselves are deterministic. They are just so afraid of model distillation that they force some randomness on top (and now hide thinking). Arguably for coding, you'd probably want temperature=0, and any variation would be dependent on token input alone.
Meh. Temp 0 means throwing away huge swathes of the information painstakingly acquired through training for minimal benefit, if any. Nondeterminism is a red-herring, the model is still going to be an inscrutable black box with mostly unknowable nonlinear transition boundaries w.r.t. inputs, even if you make it perfectly repeatable. It doesn't protect you from tiny changes in inputs having large changes in outputs _with no explanation as to why_. And in the process you've made the model significantly stupider.
As for distillation... sampling from the temp 1 distribution makes it easier.
Bringing up computational determinism in the early days of AI was absolutely career-limiting. But now, even if the model itself is deterministic for batch size 1, load balancing for MOE routing can make things non-deterministic any larger batch size. Good luck with that guys!
For example, chat, cowork and code have no overlap - projects created in one of the modes are not available in another and can't be shared.
As another example, using Claude with one of their hosted environments has a nice integration with GitHub on the desktop, but some of it also requires 'gh' to be installed and authenticated, and you don't have that available without configuring a workaround and sharing a PAT. It doesn't use the GH connector for everything. Switch to remote-control (ideal on Windows/WSL) or local and that deep integration is gone and you're back to prompting the model to commit and push and the UI isn't integrated the same.
Cowork will absolutely blow through your quota for one task but chat and code will give you much more breathing room.
Projects in Code are based on repos whereas in Chat and Cowork they are stateful entities. You can't attach a repo to a cowork project or attach external knowledge to a code project (and maybe you want that because creating a design doc or doing research isn't a programming task or whatever)
Use Claude Code on the CLI and you can't provide inline comments on a plan. There is a technical limitation there I suppose.
The desktop app is very nice and evolving but it's not a single coherent offering even within the same mode of operation. And I think that's something that is easy to do if you're getting AI to build shit in a silo.