by yturijea
3 hours ago
I use perhaps 15% of my usage allowance on Claude with just the normal subscription. I do full-time software engineering and would say I lean on AI quite a lot for thoughts, designs and code drafts. So how these companies and people manage to burn through such absurd amounts of tokens is a mystery to me.
It feels like they are just feeding huge amounts of unvetted data to the LLMs and/or running loops against the LLMs that produce only fractional results, if not wasted ones, at insane cost. So really it is the equivalent of burning money, or heating your house in the winter with all your windows open.
|
But try running Claude Opus at API prices through a 'clever' RAG-based intermediate system 'managing' a 2024-era context window, completely unaligned with 2026 frontier-model tool-use expectations, which results in a 100% cache miss rate and destroyed content coherency on every single interaction. There's your typical 'Enterprise Agreement' GenAI setup.
I only really discovered this when trying to find out why my Enterprise friends' AI experiences were so completely opposite to my own successes. I could not believe how poor their results were, even though on the surface it looked like we were using the same model, and I know they aren't 'bad' software engineers and developers.
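To put rough numbers on the cache-miss point, here is a minimal sketch. Everything in it is an assumption for illustration: the function name `input_cost`, the token counts, and the per-million-token prices are made up and are not any vendor's real pricing. It also ignores output tokens and any cache-write surcharge. It only shows how re-billing the whole conversation at full price on every turn compares with paying a discounted rate for the part that was already sent.

```python
def input_cost(turns, base_tokens, tokens_per_turn,
               price_per_mtok, cached_price_per_mtok, cache_hits):
    """Estimate input-token cost of a multi-turn session.

    Each turn resends the whole conversation so far. With cache_hits=True,
    tokens already sent in earlier turns are billed at the cheaper cached
    rate; with cache_hits=False (e.g. a middle layer rewrites the context
    every time), everything is billed at the full rate.
    All prices are hypothetical, in currency units per million tokens.
    """
    total = 0.0
    seen = 0  # tokens already sent in earlier turns
    for turn in range(turns):
        # The first turn carries the base context plus the first message.
        fresh = tokens_per_turn + (base_tokens if turn == 0 else 0)
        if cache_hits:
            total += seen * cached_price_per_mtok + fresh * price_per_mtok
        else:
            total += (seen + fresh) * price_per_mtok
        seen += fresh
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical session: 50 turns, 40k tokens of base context,
    # 2k new tokens per turn, made-up prices of 10 and 1 per million tokens.
    args = dict(turns=50, base_tokens=40_000, tokens_per_turn=2_000,
                price_per_mtok=10.0, cached_price_per_mtok=1.0)
    print("with cache hits :", input_cost(cache_hits=True, **args))
    print("100% cache miss :", input_cost(cache_hits=False, **args))
```

The gap widens with every turn, because the uncached variant pays full price for an ever-growing history, which would be one plausible way to end up with wildly different bills for what looks like the same model.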