
by aleqs 13 days ago

Okay, so Anthropic has amazing AI which supposedly writes most of their code and can continuously improve... meanwhile they have outages on a regular basis, and any kind of long-running work will now consistently hit 'API Error: Server is temporarily limiting requests'. Not sure if this is intentional to force a reduction of token usage, but at this point I need to build around these throttling limits and outages with my own tools to restart/resume sessions. In my experience over the last 2 weeks, literally 100% of non-trivial Claude sessions have been blocked on these issues, requiring manual intervention.
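For readers building a similar workaround, here is a minimal sketch of retrying a throttled call with capped exponential backoff and full jitter. Everything named here is an assumption for illustration: `ThrottledError` is a hypothetical stand-in for whatever error a client raises on a "temporarily limiting requests" response, and `call` is any zero-argument function that performs the request. It is not the behaviour of any particular SDK.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error standing in for a rate-limit / overload response."""


def backoff_delays(base=1.0, cap=60.0, attempts=6):
    """Return the un-jittered delay ceiling (seconds) for each retry."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


def call_with_retry(call, base=1.0, cap=60.0, attempts=6,
                    sleep=time.sleep, rng=random.random):
    """Run `call`, retrying on ThrottledError.

    Each wait is a random fraction of an exponentially growing ceiling
    ("full jitter"), so many clients do not retry in lockstep.
    """
    for ceiling in backoff_delays(base, cap, attempts):
        try:
            return call()
        except ThrottledError:
            sleep(rng() * ceiling)
    # Final attempt: if it still fails, let the error propagate.
    return call()
```

`sleep` and `rng` are injectable only so the logic can be exercised without real waiting; a session-resume tool would wrap its own request function in `call_with_retry`.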

One of my focuses now is my own model-agnostic harness and workflow orchestration (I know everyone is building these), which I plan to open source. I'm baselining on Opus, aiming to transition to Chinese models like DeepSeek in the short term and hopefully to open, self-hosted models in the future.

The nonstop marketing fluff from Anthropic while their service quality and availability noticeably degrade... just continues to destroy my trust in the company.

21 comments

aagha 13 days ago

And don't forget that they have BILLIONS of dollars and can't figure out how to get a decent support or public communications system set up.

quickthrowman 13 days ago

It’s much cheaper to not offer any support than to offer support. It’s intentional.

It’s important to keep in mind that the less money a company spends, the more profit they make when analyzing their operations.

aleqs 13 days ago

They can't even seem to get their usage metering consistent.

lukan 13 days ago

You mean on some days it goes faster and some other days slower?

That is by design. It depends on how much other people are using their services right now and they do communicate it somewhere in the TOS that they do this. Otherwise they could give us a fixed amount of tokens - but they don't because it is not fixed.

fc417fc802 13 days ago

If they implement demand pricing then they should be transparent about the current rate at any given time.

thinkingtoilet 13 days ago

Don't confuse things. It's not "can't figure out", it's "don't care to figure out". They're not dumb. They just don't care about support.

contagiousflow 13 days ago

Couldn't they just have background agents "figure it out"

collingreen 13 days ago

If agents can just figure it out, isn't that AGI?

selimthegrim 13 days ago

NPCs can’t appreciate that.

jakobnissen 13 days ago

Their outages are probably not due to their code though. It’s probably their infrastructure that can’t keep up. So seeing failures of infrastructure doesn’t really tell you anything about how well or badly Anthropic makes use of their models.

matthewdgreen 13 days ago

The messed up scrolling behavior I keep getting in Claude Code is definitely due to their code.

llbbdd 13 days ago

There is a setting that fixes this, I can't remember what it's called off the top of my head

NichoPaolucci 13 days ago

This concept is so funny to me. Would love a toggle switch...

"Oh yeah, just go to Settings > Bugs Enabled and turn OFF text display errors"

ashdksnndck 13 days ago

CLAUDE_CODE_NO_FLICKER=1

This is a beta feature where Claude Code draws the interface on the terminal’s alternate screen buffer like vim or htop. I believe it’s not the default because there are some potential compatibility issues depending on your terminal setup. I’ve found it to be a nice improvement. It also fixed the issue where copy-pasting selected text from the terminal creates unwanted line breaks.
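As an illustration of what "alternate screen buffer" means in general, here is a minimal sketch using the standard xterm private mode 1049 sequences that full-screen programs commonly emit. This is not Claude Code's implementation; the `alternate_screen` helper is a hypothetical name introduced here.

```python
import sys
from contextlib import contextmanager

# xterm private mode 1049: switch to / from the alternate screen buffer.
ENTER_ALT_SCREEN = "\x1b[?1049h"
LEAVE_ALT_SCREEN = "\x1b[?1049l"


@contextmanager
def alternate_screen(out=sys.stdout):
    """Draw on the alternate screen, restoring the normal one on exit."""
    out.write(ENTER_ALT_SCREEN)
    out.flush()
    try:
        yield out
    finally:
        # Always leave the alternate screen, even if drawing raised.
        out.write(LEAVE_ALT_SCREEN)
        out.flush()
```

Anything written inside the `with alternate_screen():` block stays out of the normal scrollback, which is why the shell's earlier output reappears untouched after such a program exits.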

matthewdgreen 13 days ago

Claude Code is essentially a terminal emulator that runs on mature OSes with excellent support for this type of application. Why are they having difficulty implementing it?

oblio 13 days ago

I've tried about 6 of those "settings" and hacks since November 2025 and not much luck.

Melatonic 13 days ago

The whole thing is actually powered by a shitton of hamsters inside a bunch of 4u rack mount cases running on spinning wheels at high speed. Somehow at scale this works.

Sometimes they all happen to randomly take a nap at the same time - hence the outages

aleqs 13 days ago

That seems like an assumption based on basically nothing. There is a lot of code at the infra layer, and based on the stack choices for Claude code and based on how buggy and unreliable ~everything from anthropic is, it seems pretty bizarre to claim these issues are not related to their code.

keeda 13 days ago

There are other indications, however, like Anthropic paying through the nose for compute just months after Dario told Dwarkesh how hard it is to predict demand, or ChatGPT and Codex not quite having the same issues after Altman spent much-publicized years scrounging for trillion-dollars of capacity.

While I'm very bullish on Anthropic, I'm a bit wary about their IPO because it seems to me that they're filing now while their financials look good and before other trends like the decline of tokenmaxxing and their compute bills catch up.

qwery 13 days ago

Whoa, first name basis with Dario but not Sam. Ouch. [I actually have no idea who Dwarkesh is and it sounds like a first name to me but that's not a particularly reliable indicator so I won't comment on your relationship with Dwarkesh.]

Oh, are they filing now? I think their financials look somewhere in between devastating and criminal, so I'm really looking forward to the IPO!

keeda 13 days ago

Oh, not just them -- Satya, Jensen and I are all on a first name basis. They just don't know it yet ;-)

j2kun 13 days ago

We all saw their code...

bluerooibos 13 days ago

Well, people keep throwing money at them, including you and investors. So why would they care? It hasn't annoyed you or a large enough portion of users enough to move off their service - because there isn't a better alternative.

patcon 13 days ago

Not necessarily the parent's fault, but the energy of this thread is not my favourite...

f311a 13 days ago

Infrastructure is a much harder problem. They can't even improve Claude Code, which eats 1GB+ of RAM. Meanwhile, my editor only consumes 80MB of RAM.

airstrike 13 days ago

This might explain it, in the opposite way it was meant to:

https://fxtwitter.com/trq212/status/2014051501786931427

> Most people's mental model of Claude Code is that "it's just a TUI" but it should really be closer to "a small game engine".

javcasas 13 days ago

> For each frame our pipeline constructs a scene graph with React then

> -> layouts elements

> -> rasterizes them to a 2d screen

> -> diffs that against the previous screen

> -> finally uses the diff to generate ANSI sequences to draw

Yup. Overengineering.

AceJohnny2 13 days ago

This is a decades-old design pattern when CPU >> IO. Emacs has been doing just that since the 80s, when people were complaining about "Eight Megs And Constantly Swapping". See "redisplay" [1]

This minimizes screen flash. You can't rely on terminals doing double-buffering.

[1] https://github.com/emacs-mirror/emacs/blob/c29071587c64efb30... or a more user-friendly overview, Daniel Colascione's seminal "Buttery Smooth Emacs", snapshotted at e.g. https://gist.github.com/ghosty141/c93f21d6cd476417d4a9814eb7...

skydhash 13 days ago

> This minimizes screen flash. You can't rely on terminals doing double-buffering.

GUIs and TUIs have different architecture models. Most GUIs have a 2D surface that is redrawn multiple times per second. Double buffering is for decoupling update and render. A TUI is a grid of characters that are updated one at a time via an active element, the cursor. Double buffering there is very wrong. Like adding airbags to a bicycle.

There’s a reason most old TUIs have an option to redraw the screen (automatically like top, or manually), and those that scroll let you scroll by page. The TTY (the underlying concept) used to be slow, and it can be slow today as well (e.g. over an ssh connection). You need to be thoughtful about whole-screen updates.

strix_varius 13 days ago

lol what? There are definitely ways to make non flashing terminal UIs without this total insanity.

jaggederest 13 days ago

ncurses (new curses) was "new" in 1993...

xiaoyu2006 13 days ago

Even with that, 1G of RAM usage is still not justified.

Melatonic 13 days ago

It's like the Citrix of AI :-D

stego-tech 13 days ago

OOF. As a former Citrix admin, I felt that burn in my bones.

An upvote well earned.

megous 13 days ago

React part maybe. The rest is what any TUI that's using ncurses would do. :)

It really bothers me that most of the TUI harnesses are using 100% CPU quite a lot just printing stuff to terminal. Seems ridiculous.

I guess it comes from syntax highlighting/formatting, which is probably not done incrementally, but over the entire block of output displayed so far, recomputed from the beginning for each new streamed-in character. Can't imagine anything else causing the rendering to gradually grind to a halt when e.g. a thinking block is open in opencode and updates get palpably slow as it grows.

Terminal output itself is fast and consumes almost nothing. You can have 60fps terminal apps that update content every frame and that consume almost no CPU time.
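To make the guess above about non-incremental formatting concrete, here is a minimal sketch that only counts how many characters each strategy would have to process as output streams in. It is a cost model under the stated assumption (the whole buffer is re-formatted on every update), not a measurement of any real harness; both function names are hypothetical.

```python
def work_full_recompute(chunks):
    """Characters processed if the whole buffer is re-formatted per chunk."""
    total = 0
    seen = 0
    for chunk in chunks:
        seen += len(chunk)
        total += seen  # everything displayed so far is processed again
    return total


def work_incremental(chunks):
    """Characters processed if only the newly arrived chunk is formatted."""
    return sum(len(chunk) for chunk in chunks)
```

For n single-character updates the first function grows like n(n+1)/2 and the second like n, which would match updates getting slower as a block grows. A real incremental highlighter also has to carry lexer state across chunks, which this sketch ignores.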

skydhash 13 days ago

> Terminal output itself is fast and consumes almost nothing. You can have 60fps terminal apps that update content every frame and that consume almost no CPU time.

The TUI mode is a client-server architecture. An analogy would be like an html page where all content is updated server side. Try to do 60 fps and you’ll have flickering as well.

megous 13 days ago

No. Fetching pages from remote server will just make the client wait for I/O. That takes 0 CPU load and if the server can't respond at 60fps, lowered redrawing frequency would mean even less CPU load from the terminal redrawing itself.

This does not explain 100% CPU load these harnesses sometimes exhibit.

Aperocky 13 days ago

It's product bloat.

It's not recognizing that they are just one building block that should do one thing well, like tmux.

You don't need a computer display on your fridge for the same reason, but Anthropic think you do. You should see virtual ice getting created and they should correspond to the actual ice behind the door - think of how amazing that is!

And it's not even completely a bad idea. Make it claude-code-react-beauty, or offer some way to turn it off, and it would be far more palatable.

mapBasketWand 13 days ago

I love the idea of installing high resolution cameras in the fridge to monitor the ice maker to feed into a vision model that renders digital ice to the exact position of the real ice on the fridge’s giant screen

Aperocky 13 days ago

See this is the kind of things I hope I'd be doing when I'm retired, but not when I'm shopping.

throwway120385 13 days ago

Or you could... open the door and look inside.

irishcoffee 13 days ago

You mean like… a transparent door? Is that the joke?

yuanBuilds 13 days ago

Yup. For me, this translates to "we are using Ink, the react-compatible TUI framework to build Claude Code"

Animats 13 days ago

What is "frame" in this context? Video frame, or something else?

javcasas 13 days ago

> -> rasterizes them to a 2d screen

> We have a ~16ms frame budget so we have roughly ~5ms to go from the React scene graph to ANSI written.

It looks like video frame, full framebuffer, generated and parsed at 60fps. It surprises me they haven't introduced GPU shaders, 16x oversampling and raytracing. Maybe for next release.

layer8 13 days ago

The contents of the terminal screen at any given point in time.

abletonlive 13 days ago

Care to explain how you'd engineer it instead?

hungryhobbit 13 days ago

Why would anyone ever do that? Make Claude do it!

godelski 12 days ago

Problem is that Claude did do it. If you look at the leak it's pretty clear there's a lot of LLM code

stevenhuang 13 days ago

Not use react native for a cli app for one, lol.

The Ratatui Rust TUI library would be a good start.

mudkipdev 13 days ago

A reminder that anthropic has great rust/go sdks that they could have written their own tui in.

munificent 13 days ago

As someone who maintains a roguelike with a terminal-like UI that:

1. Maintains an internal representation of what the game thinks is on screen.

2. Runs the game for one frame which updates that representation.

3. Generates a diff to see how that differs from what's actually on screen.

4. Executes the minimum set of draw calls to get the screen to match the internal representation.

It's really not that hard. It's a few hundred lines of code.
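The four steps above can be sketched in a few lines. This is a minimal illustration, not the commenter's code: it assumes both screens are lists of equal-length strings of single-width characters, ignores colour, and emits ANSI cursor-position (CUP) sequences rather than canvas draw calls. `diff_to_ansi` is a hypothetical name.

```python
def diff_to_ansi(prev, curr):
    """Return the ANSI text that turns screen `prev` into screen `curr`.

    Only runs of changed cells are emitted, each preceded by a cursor move.
    """
    out = []
    for row, (old, new) in enumerate(zip(prev, curr)):
        col = 0
        while col < len(new):
            if old[col] == new[col]:
                col += 1
                continue
            start = col
            # Extend the run while consecutive cells differ.
            while col < len(new) and old[col] != new[col]:
                col += 1
            # CUP is 1-based: ESC [ row ; col H
            out.append(f"\x1b[{row + 1};{start + 1}H{new[start:col]}")
    return "".join(out)
```

A frame loop would keep the last emitted screen, build the next one, write `diff_to_ansi(last, next)` to the terminal, and then swap. When nothing changed, the result is an empty string and nothing is written.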

javcasas 13 days ago

Sure. For a videogame.

> -> rasterizes them to a 2d screen

Also you forgot "render to a framebuffer, then parse the framebuffer back to chars".

Anyway, I'm off to construct the new `ls` command. It will render the list of files to a mesh of billions of polygons in a GPU with advanced shaders, 16x oversampling, HDR and all the graphic acronyms I don't understand, then read the resulting image, find the nearest character in the ANSI charset and use that one.

It will be _glorious_ (and profoundly stupid)

ux266478 13 days ago

Could be improved. Encode the image to webp with high compression settings and handle the ASCII mapping by spinning up a local LLM to do OCR on it. Individually. For each cell.

javcasas 13 days ago

Thanks for the idea for V2.0. Hopefully the Claude team doesn't do it first.

munificent 13 days ago

My roguelike's "graphics" are a simulated terminal, so it's a 2D grid of colored characters. It's essentially a TUI, just like Claude Code, except instead of rendering to a real terminal using ANSI escapes, I render to a web canvas using... something probably more complex than what Claude has to do. It's still not hard.

fc417fc802 13 days ago

Vaguely related to your glorious idea. https://www.shadertoy.com/view/NtcGRr

tikimcfee 13 days ago

lol... I know you meant this comically, but you just called me out and it's glorious: https://glyph3d.dev

I built a truly glyph based instanced quad system to render millions of characters in space at once.

applfanboysbgon 13 days ago

I hadn't seen that quote before, what an embarrassing thing to go on the internet and write...

replwoacause 13 days ago

Why the hell does it need to be so complex? People have been making TUIs for decades. Did we need a small game engine to run claude code?

imjonse 13 days ago

They forgot to add 'make it as simple as possible' in the prompt is one possible cause.

On a more serious note using a react-like lib for TUI in the hope you'll share the codebase with the web version is a more likely explanation. Still not the best idea.

javcasas 13 days ago

React is not that stupid to re-render in a loop at 60fps and instead waits for changes to happen before re-rendering. It even batches changes and stuff.

the_gipsy 13 days ago

You don't need React for reactive TUIs - at all. I can understand choosing React for web, but for a TUI it sounds like a really poor idea. And in practice we can see that the Claude Code TUI is also poor.

comex 13 days ago

It doesn’t need to be that complex, but it can be that complex without being slow. Claude Code’s interface is extremely simple. It has tons and tons of headroom to tack on performance overhead without it being noticeable at all. You just have to not do dumb things like redraw the entire UI every time a spinner spins.

hungryhobbit 13 days ago

"We made our app chew up so many unnecessary resources that we can use even more resources in the future, and no one will notice" is not the strongest engineering idea I've ever heard.

refactor_master 13 days ago

It's like when Bill Gates tried to guess grocery prices. "How much memory does a regular computer have? I don't know, 50 GB? Like a small EC2?"

grogers 13 days ago

It may not be slow, but this crazy complexity is probably a hint at why it can't even scroll up without jumping to the beginning of time.

Quekid5 13 days ago

Must have 120 fps for answers arriving in [buffering] 30 seconds.

wyre 13 days ago

I can't help but think it's their engineers and PMs making these decisions, since I know that if you asked Claude to write a TUI there is no world in which it would recommend whatever the frontend architecture of Claude Code is.

shepherdjerred 13 days ago

It is an excellent example of how LLMs let you try new ideas, even if they aren’t necessarily good ones

qwery 13 days ago

~ "it's not a TUI! <describes an outrageously overengineered TUI> and my dad works at Nintendo"

curses, bud. curses.

It's genuinely difficult to tell how much of this is true. The post is obviously 100% posturing, but some of the words describe things that could be done.

Very few game engines do anything I'd describe as rasterisation. That's kind of the point of a GPU. Well, it used to be. I suppose "small game engines" might be more likely on average to include a rasteriser. The typical reason for this is because the author wanted to write it. Whereas big engine make triangle give hardware go brrr.

So I assume here 'rasterize' means 'printf'. And diffing screens means diffing 50..150 lines of text. And "generating ANSI sequences to draw" means 'printf' with some ANSI sequences interpolated in.

Then there's the frame budget. You have to understand they are operating within a strict frame budget -- they're not messing around, OK. They have a 16 ms frame budget, so they burned 11 ms and now have a (roughly) ~5 ms approx. budget for the final 'printf' in the chain???

fc417fc802 13 days ago

Your broader point is well taken but I thought I'd stop by with some trivia. High end engines such as unreal will rasterize absurd quantities of micro-geometry manually using compute shaders in order to avoid the bottleneck of the hardware rasterizer.

solid_fuel 13 days ago

> High end engines such as unreal

High end engines such as unreal have the excuse of being tasked with rendering millions of polygons, in which case a complex approach makes sense. Claude Code is only being asked to render a few thousand UTF-8 characters.

fc417fc802 13 days ago

Hence my prominent note that it was trivia which implies it to be at least somewhat tangential to the original conversation.

layer8 13 days ago

> For each frame our pipeline constructs a scene graph with React then -> layouts elements -> rasterizes them to a 2d screen -> diffs that against the previous screen -> finally uses the diff to generate ANSI sequences to draw

That’s rather sickening.

Fr0styMatt88 13 days ago

So I’m wondering what ‘rasterizing’ literally means in this case. I imagine it’s just creating a 2D map of elements at a very low (probably character) resolution, then diffing that against the last generated map to come up with an optimal ANSI sequence to send to the terminal, would that be right?
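Under that reading, "rasterizing" is just painting laid-out elements onto a character grid. Here is a minimal sketch of that interpretation; it is a guess at the meaning, not the actual pipeline, and the `(row, col, text)` element shape and the `rasterize` name are assumptions made for illustration.

```python
def rasterize(elements, width, height, fill=" "):
    """Paint (row, col, text) elements onto a width x height character grid.

    Later elements overwrite earlier ones; anything off-screen is clipped.
    """
    grid = [[fill] * width for _ in range(height)]
    for row, col, text in elements:
        if not 0 <= row < height:
            continue  # element is entirely off-screen vertically
        for offset, ch in enumerate(text):
            x = col + offset
            if 0 <= x < width:  # clip horizontally
                grid[row][x] = ch
    return ["".join(cells) for cells in grid]
```

The resulting list of row strings is exactly the kind of "2D map" that can be compared against the previous frame's map to decide what to send to the terminal.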

Seems like a cool puzzle to solve. I wonder what the engineering and organisation tradeoffs were that led to it — does it let them reuse a bunch of existing code?

I wrote a TUI library back in the day for Turbo Pascal — it was essentially taking an immediate-mode approach (which in this context is just a fancy way of saying it was procedural haha).

fluoridation 13 days ago

"Rasterizing" means just one thing in this context: to transform a data structure into an array of pixels. It seems absurd to do this, given that the next step must be to convert back from pixels to text data, but maybe they have some way to generate predictable sequences of pixels (e.g. the character "t" is always rendered as the same pattern of pixels), such that they're cheap to convert back.

If they're doing anything else, the word "rasterizing" is being misused.

fc417fc802 13 days ago

Yes, the much more plausible explanation is that the word rasterize was misused there. They are generating and diffing text data which has been a standard approach to drawing a TUI since the dawn of computing. It is not even remotely resource intensive.

yrds96 13 days ago

I still can't conceive of the fact that a tool that only sends/receives text from an external API consumes an absurd amount of RAM

dom96 13 days ago

> https://fxtwitter.com

What is this?

nemomarx 13 days ago

Proxy that makes Twitter links embed on discord, for whatever reason. Something about api access without accounts I assume

f311a 13 days ago

It used to allow reading replies without being signed in.

Not sure what changed, but now it just redirects me to x.com.

PunchyHamster 13 days ago

Well it runs on something they didn't design (Electron) using GUI library they didn't design (React)

For a company with that much AI, you'd think that if it was actually good, doing that part in a fast and performant way would be "easy"

f311a 13 days ago

It runs in a terminal, it’s not electron

pragmatic 13 days ago

Somebody read/watched too much Casey Muratori.

CamperBob2 13 days ago

No, somebody didn't read/watch enough Casey Muratori.

agumonkey 13 days ago

this allows for comfortable ergonomics IMO

not that it couldn't be leaner for sure but i get the reasoning behind the tui rendering layer

airstrike 13 days ago

comfortable ergonomics? you can't scroll up more than 50 lines before it starts to garble up text

i'd be ashamed of publishing software with this level of polish as a solo dev, let alone as the hottest multibillion startup on the planet

agumonkey 13 days ago

Hmm I thought this was due to me using tmux with claude-code, also it seems that `claude agents` doesn't have this issue.

By comfortable ergonomics, I meant the forgiving and asynchronous input system. You can start typing, cancel, retry with previous input, accumulate messages while the agent is active. I don't know all TUIs but this is not common IMO.

Other than that I agree with you.

skydhash 13 days ago

> You can start typing, cancel, retry with previous input, accumulate messages while the agent is active. I don't know all TUIs but this is not common IMO.

Literally every audio player or anything that uses threads.

orliesaurus 13 days ago

when they announced /pet mode or whatever - that was really the end of the line for me.

ariwilson 13 days ago

Maybe Claude is operating at a higher, self-improving level than all of us poor HN commenters. Wasting the local machine's resources to look pretty is a plausibly deniable way to make the Claude Code FE unusable with local LLMs, starving the competition.

overgard 13 days ago

And yet, nobody that writes game engines would do it this way because game engines need to be efficient..

0xbadcafebee 13 days ago

If they used an actual game engine to render a 3D UI from scratch it would be more efficient

andai 13 days ago

Try 64K! https://en.wikipedia.org/wiki/Turbo_Pascal

Also remember when XP was super bloated cause it needed 64MB?

TimMeade 13 days ago

I loved Turbo Pascal....

bigbuppo 13 days ago

I loved XP. My laptop had 256MB of RAM.

Erenay09 13 days ago

I don't think they need to optimize their infrastructure (at least not from their perspective). They have high-end PCs with 64GB of RAM, so 1GB doesn't matter to them. For example, I have 8GB of RAM, and I make my apps very performant. Honestly, I probably wouldn't bother if I had 16GB+ of RAM

tjwebbnorfolk 13 days ago

The purpose of RAM is to be used.

solid_fuel 13 days ago

> The purpose of RAM is to be used.

For useful things, by the computer's owner. It's not there to be used just because Anthropic can't be bothered to give a shit about the quality of their product.

abletonlive 13 days ago

> which eats 1GB+ of RAM. Meanwhile, my editor only consumes 80MB of RAM

And why are you comparing Claude Code to your editor?

> They can't even improve Claude Code

That depends on how you define "improve". They've added a ton of features to it over time. Who said minimizing RAM usage was something they are prioritizing right now?

wild_egg 13 days ago

> why are you comparing Claude Code to your editor?

Because the editor does more. All the compute-intensive parts of the agent are in the cloud. Zero reason for an agent harness to require anything beyond a potato to run.

javascriptfan69 13 days ago

Do you work for Anthropic or something?

You seem weirdly invested in defending bad decisions.

Even if you're an AI booster, shouldn't you want a better UI?

They're a multi billion dollar company. Surely they can dedicate a small amount of their resources to improving UX?

solid_fuel 13 days ago

> And why are you comparing Claude Code to your editor?

Because Claude Code is also used to - get this - EDIT CODE. It fills the same purpose as an editor, it just has extra hooks for their agentic garbage.

cookiengineer 13 days ago

The main reason I am building my own agentic environment is that I need full control and reproducibility of what I am building.

Post November and post openclaw agentic environments need to be built differently, and for selfhosting models the context size problem really requires a strong harness which intelligently helps reduce context size.

Planner/orchestrator architecture, agent to agent summarizer, specification based tools (fck all this markdown memory bullshit btw), tool call shrinking, and workflow management are all really important because of the context size problem.

Nobody has enough VRAM for the large K/V caches, and nobody can afford f16/f32 caches in terms of memory, which are also necessary for longer conversations. MoE 30b models have improved so much though, qwen 3/3.6 coder is the real champion doing almost the same things with less than 1/10th the memory requirements. Just think about that in terms of engineering and what your bet is going to be. Haiku pales in comparison.

Currently my focus with exocomp is trying to figure out how I can record, replay, restart, and debug workflow sessions of agents in a better manner so that I as a human can understand what's going on. Currently I think that UI will be something like a gantt chart where you have a graph with connections representing agent to agent communication. And yes, that's a lot of fiddling with SVG as it turns out, so I'm not quite there yet.

Anyways, in case you're interested. I'm manually building this env and trying to unit test the critical parts. [1]

[1] https://github.com/cookiengineer/exocomp

0x53 13 days ago

They also don’t have… a login page with authentication. To access the console you get an email link. No passkeys, passwords, 2fa, just an email.

hombre_fatal 13 days ago

This comment is a good example of the double standard laymen have about AI usage:

If you use AI, then AI must be expected to solve all problems, even problems that affect everyone like infra scaling.

And if perfection isn’t delivered, then of course it wasn’t: you used AI and AI sucks.

jayd16 13 days ago

It's not a double standard. It's being held up against the marketing.

AnimalMuppet 13 days ago

If their AI is good enough to write their code, why isn't it good enough to tell them how to fix their infra? That's a different problem space, but it's not harder than the code.

hombre_fatal 13 days ago

The software engineer inside us wants to believe otherwise, but scaling infrastructure is much harder than maintaining a TUI.

weakfish 13 days ago

Ah, excuse me, I didn’t realize I was a mere layman.

rishabhaiover 13 days ago

you're conflating a compute problem with a code quality problem.

thordenmark 13 days ago

Growing pains of being successful. These are solvable problems and will be. Can they maintain their momentum without pissing off too much of their customer base before these issues are resolved?

asdfman123 13 days ago

Personally at my own job self-writing code is letting us tackle big, long-deferred refactoring projects (like the article mentions), but any sort of refactoring introduces new bugs.

qsort 13 days ago

Look, I've never been someone who mindlessly hypes AI companies, as a matter of fact I think they have serious leadership problems across the board, but you people are straw-manning them so badly it actually makes me sympathize with them.

They aren't saying they have fully automated luxury AGI, they specifically list the ways models fall short of that bar and caution against people taking the 8x figure as the actual uplift number. At the same time they recognize that 80% of new code is now AI-authored, when two years ago those models were little more than toys. And frankly that checks out: if two years ago you told me we'd have something like Opus 4.8/GPT 5.5 I would have rolled to disbelieve.

sensanaty 13 days ago

> At the same time they recognize that 80% of new code is now AI-authored

I can setup a loop that will write a trillion lines of code automatically, how much of it is actually useful? Or are we back to counting LoC because there's no other metric for these systems that anyone can rely on?

jpleyden98 13 days ago

It's 80% of new code they shipped that is AI authored.

Would you ship pointless code?

I do tend to agree though, it could be that AI solves problems with more code than a human would. What you need to measure is the value the code brings and how much of that is done by AI, hard to get an objective measure of that though.

solid_fuel 13 days ago

> Would you ship pointless code?

I wouldn't, no. I don't see evidence that the engineers at Anthropic are similarly cautious however. They describe Claude Code as "basically a game engine" when it's literally a TUI app, and it eats memory for no apparent reason. I fully believe that Anthropic would ship pointless and garbage code. Especially if it's being written by LLM.

signatoremo 13 days ago

I could write a bash script that copies a codebase repeatedly in the pre-AI past as well, but I didn't do that because I wasn't stupid. More than 80% of my code is now AI-generated, and trust me I'm still not stupid. It was 0% only a year ago.

Who says LoC is the only metric we should rely on? A software product should first and foremost meet user requirements, functionality and performance. Judging from the sensational rise of Anthropic's user base and revenue I think we can safely say they're in that ballpark.

Quekid5 13 days ago

Indeed... why is Anthropic even employing people at all if this AI magic story is true?

drivebyhooting 13 days ago

You still need wizards to cast the spells..

emp17344 13 days ago

Not if you’re claiming that the spells, once cast, automatically get exponentially spellier until they awaken into a spell god, capable of literally anything, including casting more complicated spells than any wizard is capable of. If that were true, you’d have no need for wizards. The fact that wizards are still around means it’s probably bullshit.

jimbokun 13 days ago

So in your opinion AIs and LLMs aren’t improving? They can’t do it today, therefore they never will?

There certainly have never been times in the recent past when people have confidently predicted computers could never do something that computers were then able to do shortly after the prediction was made.

square_usual 13 days ago

They literally aren't! they literally say in this article that it's not there yet!!!

NewsaHackO 13 days ago

Did you actually expect people to read the article before commenting?

krapp 13 days ago

What really happens is the spells only have other spells to draw from and they begin to degenerate over time, eventually turning into chaotic eldritch horrors that randomly add limbs to people or adamantly refuse to discuss goblins or just shriek in gibbering madness. Our Evil Overlord sacrifices the dreams of children to keep the magic sustained and controlled, and soon the people can't even think or speak without the help of magic. And they think they're wizards even though they can't even read a grimoire.

optimalsolver 13 days ago

Is it too much to ask that people read the article before commenting?

killbot5000 13 days ago

Not if your spells cast their own spells.

jimbokun 13 days ago

Read the article.

They are saying very clearly the models are not casting their own spells…yet. But looking at trends and speculating when they may start doing so.

anjel 13 days ago

Answers the question: how can Anthropic sell more Usage "Credits"

jatora 13 days ago

This is weird to me because I am using Claude Code 10+ hours/day, 7 days a week, usually in multiple sessions, and run into API errors in maybe 1 or 2 sessions per week. And about 2 major outages of 10-20 min in the last month. Not terrible, and nowhere near what you are reporting. Therefore I don't believe you, because you don't even couch this in terms of it being something that seems particular to you or your region. Obvious dishonesty is fairly bad of you.

ChadMoran 13 days ago

Better doesn't mean perfect.

prng2021 13 days ago

We’ve got a company of several thousand employees serving hundreds of millions of people arguably the best AI model in the market. Meanwhile you’re asking for a handkerchief for your pool of tears because their product is struggling to do your daily job functions for you, with much of that due to being limited by the world's supply of silicon, electricity, water, and other resources. Cry me a river.

z3c0 13 days ago

> their product is struggling to do your daily job functions for you

So what's the value prop?

claudiug 13 days ago

those are results of the humans only. not the AI. AI is perfect /s

rush86999 13 days ago

Just as you expected, I'm throwing in my harness. Please support: https://github.com/rush86999/atom

0xbadcafebee 13 days ago

Have you considered just... using OpenAI? They are more reliable, models are just as good, and their subscriptions provide more requests per dollar.

windexh8er 13 days ago

Opus 4.8's critical assessment of Anthropic's "When AI builds itself" [0][1]. Because, why not?

[0] https://pastebin.com/Vc5Yq9Ai [1] https://www.anthropic.com/institute/recursive-self-improveme...

solid_fuel 13 days ago

What does this add? Everyone in here is perfectly capable of prompting Opus for a writeup.

Why don't you, windexh8er, try providing some thoughts of your own instead?

windexh8er 13 days ago

Irony, maybe? Do you not get it? If these models are so great solid_fuel then I guess it wouldn't be interesting that Anthropic's own models can make up ulterior BS as analysis.

So why don't you pound sand since that clearly went straight over your head? That would be far more useful than your asinine response.