by ollin 28 days ago
Specifically it looks like he's planning to extend the ideas from https://github.com/karpathy/autoresearch into a larger effort towards recursive training improvement [1]:

> Excited to welcome Andrej to the Pretraining team! He'll be building a team focused on using Claude to accelerate pretraining research itself. I can’t think of anyone better suited to do it — looking forward to what we build together!

[1] https://x.com/nickevanjoseph/status/2056760504949842219

4 comments

So he's working on the singularity
Am I the only one who wasn’t particularly impressed by AutoResearch? If you looked at what the agent was actually doing, it was just tuning parameters mostly, not really trying different novel approaches.

I couldn’t help but consider this mostly a very inefficient variant of hyperparameter optimization, but someone correct me if I’m wrong; I may be looking at this too pessimistically.

I didn't dig into what the actual repository was doing, but personally, I took some inspiration from the idea after reading about it and realizing that I might have been underestimating the ability of LLMs. I put a bit more work into a performance harness I was using locally and just set some agents to brainstorming and they did seem to find some great stuff. So I don't really have a stance one way or another on this specific repo, but the general idea seems like a really good one.
Could you elaborate specifically on how you had been underestimating models? Do you mean just using tighter harnessing to make them work in a structured agentic way, or something else?
The specific code I was working on, I had a general idea of the sort of performance improvement that would be possible. I just thought that it would be too hard for the models to figure out without a lot of hand-holding.

But it ended up being not "too hard ever" but more like this: in 1 out of every 5 tries, the model did in fact manage to get a large refactoring to the point where it improved performance. So I set it up to try something, run the perf test, see if it worked, and if not, throw it away and repeat. Then it started, slowly, finding some useful things.
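A minimal sketch of that try / perf test / throw away / repeat loop, for anyone who wants to picture it. This is not the actual harness; `try_measure_keep`, `propose`, and `measure` are hypothetical names, and `propose` stands in for whatever the agent does to produce a candidate change.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def try_measure_keep(
    current: T,
    propose: Callable[[T], T],
    measure: Callable[[T], float],
    tries: int,
) -> Tuple[T, float, int]:
    """Keep a candidate only when the perf test says it is faster (lower is better)."""
    best_score = measure(current)
    kept = 0
    for _ in range(tries):
        candidate = propose(current)   # hypothetical: an agent's attempted refactoring
        score = measure(candidate)     # hypothetical: the perf test, e.g. seconds per run
        if score < best_score:
            # Improvement: this becomes the new baseline for later attempts.
            current, best_score = candidate, score
            kept += 1
        # Otherwise the attempt is simply discarded and the loop tries again.
    return current, best_score, kept


# Toy run: each "candidate" is just its measured runtime in seconds.
attempts = iter([12.0, 9.0, 11.0, 7.0, 10.0])
result = try_measure_keep(
    current=10.0,
    propose=lambda _current: next(attempts),
    measure=lambda seconds: seconds,
    tries=5,
)
print(result)  # (7.0, 7.0, 2): two of the five attempts were kept
```

Most attempts get thrown away here, which matches the point above: the loop only needs an occasional success, as long as the perf test is trustworthy.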

Just remember that they will do clever but useless things to improve the metric. Like changing the random seed as per autoresearch's hero image. lol! imo, out of the box thinking is needed.
Ever since AlphaEvolve, the idea has been that if you build a harness which can evaluate solutions and give LLMs a database where they can keep storing their work and then sample from it, they do find non-trivial solutions over time, learning from their own past ideas.

It is the ultimate manifestation of test-time scaling. I think karpathy just popularised it.
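A rough sketch of the evaluate / store / sample shape described above. This is not AlphaEvolve's actual algorithm; `evolve`, `evaluate`, and `mutate` are hypothetical names, and `mutate` is a random perturbation standing in for an LLM proposing a variation of a past idea.

```python
import random
from typing import Callable, List, Tuple


def evolve(
    seed_solution: float,
    evaluate: Callable[[float], float],
    mutate: Callable[[float, random.Random], float],
    rounds: int,
    rng: random.Random,
) -> Tuple[float, float]:
    """Return (best_solution, best_score); higher scores are better."""
    # The "database": every scored attempt is stored as (score, solution).
    database: List[Tuple[float, float]] = [(evaluate(seed_solution), seed_solution)]
    for _ in range(rounds):
        # Sample a parent from the better half of everything tried so far.
        ranked = sorted(database, reverse=True)
        top = ranked[: max(1, len(ranked) // 2)]
        _, parent = rng.choice(top)
        child = mutate(parent, rng)  # stand-in for the LLM's new attempt
        database.append((evaluate(child), child))
    best_score, best = max(database)
    return best, best_score


# Toy problem: find x close to 3 by maximizing -(x - 3)^2.
best, score = evolve(
    seed_solution=0.0,
    evaluate=lambda x: -(x - 3.0) ** 2,
    mutate=lambda x, r: x + r.uniform(-1.0, 1.0),
    rounds=50,
    rng=random.Random(0),
)
print(best, score)
```

The part doing the work is the evaluator: the stored score is the only signal steering which past ideas get sampled again, so a gameable metric gets gamed.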

Karpathy embedded within an organization is way more impressive than him out on his own with hot takes and little projects. I hope he does great things for Anthropic.
Absolutely, I wasn’t saying that him being at Anthropic wasn’t going to be effective, I just think his little projects wouldn’t be very interesting if his name wasn’t attached to them.
I was impressed that I was able to take the same basic idea and apply it to anything that Claude could construct a metric for. It's nice being able to just run /autoresearch and speed up your test suites, shave time off your builds, etc.

It's a decent tool to have in the toolbox.

I was trying to look at options outside the box (everything else is more context or RAG) and have been using this approach for about a month with good results. https://github.com/VDP89/fscars

> Am I the only one who wasn’t particularly impressed by AutoResearch?

Isn't it just a nerfed AlphaEvolve? https://arxiv.org/abs/2506.13131
Inefficient variants with $100m+ worth of compute will still probably outperform the best team of researchers
That's not the question. The question is how much you need to give the best team of researchers to beat $100m+ worth of compute. $1m of compute? $10m? Clearly giving the best team $100m is going to beat out giving an inefficient group $100m. It does in fact matter who you throw your money at...
I guess we must expect it at this point. But it's funny that it has model-written tokens like ’ instead of '
More like he'll blog and tweet about using Claude and get gullible software engineers to buy Claude subscriptions and work on their own obsolescence while paying for it.

Many people are still deluded and think he is the same person who wrote the informal AI tutorials in plain html. He isn't, he is selling stuff now.

I'm as jaded as can be but I think Anthropic is now beyond the point where they'd place much value on farming Karpathy's name recognition. I'm sure they considered it an extra plus in his hiring package but they wouldn't do the level of comp package he'd want if they didn't believe the odds were decent that he'll contribute serious value.

Sure, it can always not work out but that's no more a risk with him than any high-profile hire who doesn't really need the money and will always have other options.

What is he selling? How is this time different compared to when he was at OpenAI or at Tesla? You could say he was shilling those products too. I don't see any shift. He's still posted free, in-depth YouTube videos recently.
FYI, Karpathy has 2.5M followers on twitter, Anthropic has 1.3M (OpenAI has 4.8M, for comparison). I'm sure Karpathy will be doing mostly research and will make real contributions, but I also think it would be naive to ignore the weight of his voice. It's not negligible, nor is it the only thing he brings.
> What is he selling?

Is that a serious question? He already promoted vibe coding and AI hype. Now he is literally there to promote Anthropic and its IPO price.

When he was at OpenAI it wasn't overtly commercial yet. At Tesla he had a way lower profile. Now he is the vibe coding Jesus for deluded software engineers. The impact is much larger.

> At Tesla he had a way lower profile.

?

He was literally rolled out in front of the camera as Tesla's AI prodigy at multiple streamed events designed to appeal to techy consumers and dev recruitment. He's definitely been one of AI's public personas for a long time now, and his employers have regularly aided/directed/utilized him accordingly.

I think he's just genuinely excited about the capabilities.

(I do understand that for Anthropic it's a brand boost as well, just like signing other prominent researchers, as it was with LeCun and Meta etc).