| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by beaker52 58 days ago
	It doesn’t really come as a surprise to me that these companies are struggling to reliably fix issues with software which relies on a central component which is nondeterministic. But they made their own bed with that one.

3 comments

ljm 58 days ago

I've noticed a lack of product cohesion in general and it does make me wonder if it's a result of dogfooding AI.

For example, chat, cowork and code have no overlap - projects created in one of the modes are not available in another and can't be shared.

As another example, using Claude with one of their hosted environments has a nice integration with GitHub on the desktop, but some of it also requires 'gh' to be installed and authenticated, and you don't have that available without configuring a workaround and sharing a PAT. It doesn't use the GH connector for everything. Switch to remote-control (ideal on Windows/WSL) or local and that deep integration is gone and you're back to prompting the model to commit and push and the UI isn't integrated the same.

Cowork will absolutely blow through your quota for one task but chat and code will give you much more breathing room.

Projects in Code are based on repos whereas in Chat and Cowork they are stateful entities. You can't attach a repo to a cowork project or attach external knowledge to a code project (and maybe you want that because creating a design doc or doing research isn't a programming task or whatever)

Use Claude Code on the CLI and you can't provide inline comments on a plan. There is a technical limitation there I suppose.

The desktop app is very nice and evolving but it's not a single coherent offering even within the same mode of operation. And I think that's something that is easy to do if you're getting AI to build shit in a silo.

link

randall 58 days ago

this is "you ship your org chart" not ai.

https://en.wikipedia.org/wiki/Conway%27s_law

link

ljm 58 days ago

Even a distributed or silo'd org chart has some affinity across the hierarchy in order to keep things in overall alignment. You wouldn't expect to use a product suite that is, holistically, not fully compatible with its own ecosystem, even down to not having a single concept of a project. Or requiring a CLI tool in an ephemeral environment that you cannot easily configure.

That's clearly a trade-off that Anthropic have accepted but it makes for a disappointing UX. Which is a shame because Claude Desktop could easily become a hands-off IDE if it nailed things down better.

link

JamesSwift 58 days ago

And the multiple concepts of subscriptions for products, and the idea of MCPs/connectors that arent shared between the different modalities, and the idea of api key vs subscription, and two different inbound websites (claude.ai and claude.com)...

link

lilytweed 58 days ago

Agreed. I use the Claude desktop app almost every day, and have used Code and Cowork since their respective launch dates, and even I still have a really hard time grokking what each is for. It becomes even more confusing when you enable the (Anthropic-provided) filesystem extension for Chat mode. Anthropic really needs to streamline this.

link

notsydonia 58 days ago

YES! I thought it was just me being a bit scattered. But uploading an important file to a project only to have it not there because....<garbled answer from Claude> is distracting to say the least. I don't know what I've enabled offhand but I hate having to stop and try to work out why Claude can't reference a file uploaded to the project in a chat within that project. I think they should pause on all the wild aspirations and devote some time to fundamentals.

link

harha 58 days ago

Add to that that notion mcp works for the chat but not code. now my workflow has docs I comment with others in notion, while the actual work and source of truth is in GitHub.

Need to fall back to codex to keep things in sync, but that's a great opportunity to also make sure I can compare how things run - and it catches a lot of issues with Claude Code and is great at fixing small/medium issues.

link

JamesSwift 58 days ago

Absolutely its dogfooding AI and vibing huge features on the house of cards. Its a fucking mess, and the product design is simultaneously confusing and infuriating. But the product is useful and Im more productive with it than without it now.

link

thaanpaa 58 days ago

Well, the fun part is that the algorithms themselves are deterministic. They are just so afraid of model distillation that they force some randomness on top (and now hide thinking). Arguably for coding, you'd probably want temperature=0, and any variation would be dependent on token input alone.

link

hexaga 58 days ago

Meh. Temp 0 means throwing away huge swathes of the information painstakingly acquired through training for minimal benefit, if any. Nondeterminism is a red-herring, the model is still going to be an inscrutable black box with mostly unknowable nonlinear transition boundaries w.r.t. inputs, even if you make it perfectly repeatable. It doesn't protect you from tiny changes in inputs having large changes in outputs _with no explanation as to why_. And in the process you've made the model significantly stupider.

As for distillation... sampling from the temp 1 distribution makes it easier.

link

LogicFailsMe 58 days ago

Bringing up computational determinism in the early days of AI was absolutely career-limiting. But now, even if the model itself is deterministic for batch size 1, load balancing for MOE routing can make things non-deterministic any larger batch size. Good luck with that guys!

link