Hacker News new | ask | show | jobs
by notRobot 28 days ago
Claude Code is really good at stuff like this. The other day I tried to recover some images from an SD card that had gone bad. I used GetDataBack to recover files, but they appeared to be malformed and didn't open in image viewers.

I tasked Claude to analyze the files and figure out what's going on, and eventually we figured out that each file had a custom metadata header + thumbnail + actual image concatenated. I had it write a python script and was able to recover all the images with their metadata. It's nothing a human couldn't have figured out, but it was definitely WAY faster than doing it myself.

I've also used Claude in the past to figure out how to break into routers with locked down firmware. It's great at suggesting and trying different approaches.

4 comments

I have a friend that just picked up a new consulting job resurrecting an ancient Windows desktop application. No source control, no tests. And it's spread out over a dozen different folders with names like "_old", "_new" and "dates". Claude's doing a tremendous job in getting him to grips with what is actually happening in the application, what's relevant, what's not, what's different. I think it's literally saving him days and days at work.
if your friend has access to the binary and can pull it out to different box, they might get a lot out of a ghidra mcp -> https://github.com/LaurieWired/GhidraMCP
I'm not well versed at reverse engineering binaries or interpreting C/assembly so ghidra MCP has been an absolute gamechanger for helping me write tools. Once my project is complete, I plan to learn how to do the analysis myself manually and have cc guide me along the way.
I think it would be interesting, once the dust has settled, to do a compare with a less expensive model (time, capital, compute) such as deepseek 4.
Any reason to expect that this wouldn't work 100%? It's not like the different LLMs providers are that technically different from one another.
no such thing as closed source software anymore, just fully open and not quite fully open nowadays.
> I have a friend that just picked up a new consulting job resurrecting an ancient Windows desktop application. No source control, no tests. And it's spread out over a dozen different folders with names like "_old", "_new" and "dates".

That doesn't sound very impressive. Not being tracked with a version control system is fixed instantly with a git init, git add ., git commit .no AI required.

Covering the app with tests is also something that requires no AI. At most, coding agents can generate characterization tests in broad sweeps, but we are talking about a delta between hand rolling and vibe-coding of a couple of days.

Where LLM shines is helping developers build up an understanding of what is in place. Running /explain on a codebase can quickly provide you with a high level summary of what's in place.

The relevancy here is that he's denied the git history, versioning, branches, implicit documentation that even bad source control practices would have given him.
That's what the comment is saying. In normal repositories, version control acts as a record of the momentum of the direction the product was taking. If it's just "_old" and "_new," the developer has to read and understand both, which I think is going to be far more time consuming than your estimation.
I'm sure data recovery companies are pretty pissed that slightly esoteric data recovery abilities are becoming more accessible for average software devs. They were charging an arm and a leg to remote in and run scripts.
They still have two important moats: (1) expensive hardware tools (even stuff like SATA write blockers are kind of expensive for what they are), spare hard drive collections to swap failed PCBs, etc and (2) the "nobody got fired for hiring us" edge similar to how everyone calls in Crowdstrike/Mandiant after an incident. If a suit-level manager finds out customer data was lost, they are going to want to call in an expert so they can immediately tell the customer they did, not have the same internal team try to figure it out.
As an aside to #1: The cool thing is in modern times the hardware tools have come down stupidly cheap in price. Even SD card recovery is (vaguely) in the right skilled hands in a pseudo-professional home lab these days.

https://blog.acelab.eu.com/pc-3000-flash-spider-board-adapte...

I did EXACTLY that last night. Was doing by hand for about an hour and got to a point where I didn’t feel competent anymore and asked Claude to take from where I was.

5 minutes later I had almost 3 hours of important footage recovered.

> Claude Code is really good at stuff like this.

A lot of "Claude Code is best at X" claims are probably user-selection bias.

The people saying it are often exclusively Claude Code users, not people who are actively benchmarking Claude Code against Gemini CLI, OpenAI Codex, GitHub Copilot, and other agent harnesses on the same tasks.

The claim may still be true for certain scenarios, but the evidence is usually anecdotal, not comparative.

When I hear "claude code one-shotted X" and X is a novel problem, I mentally substituted "the agentic harness that I tried one-shotted X," since that's what they're saying.

Getting any smart model to take a look at the task is the sort of lift that the speaker is usually pointing to.

The harness is pretty much irrelevant for general tasks.

You can write a 100 line harness that only has one tool - try either "bash" or the more fun "you're running within nodejs, here's eval", you'd be surprised in how close to CC/Codex performance you're going to get.

There’s some weak evidence against this actually. Harness design makes a huge difference for tiny local models for example: https://itayinbarr.substack.com/p/honey-i-shrunk-the-coding-...

I have only my own personal experience for frontier models, but I have seen different performance of Opus when used from Pi or Claude Code or Zed for example.

I worded my comment poorly. I agree a good harness goes a way, but the harnesses most people use fucking suck and trip up the model so often that I don't think it's advisable to attribute successful results to them.

E.g. GPT5.5 with Codex on my Windows box likes using PowerShell for everything. OpenAI decided it should use the native shell instead of bundling a bash, or using git bash. Sure. But the model is so overfitted on bash that it fucks up PS quoting like once every 5 commands.

Every harness with LSP I've seen trips up the model as well. They insert diagnostics after every edit, polluting the context with errors that the model has to actively decide to ignore, every time, until it finishes its work and gets the code to a consistent state. Telling the model "run npx tsc --noEmit to check errors" will outperform a LSP 100% of the time.

Another example is basically everything Anthropic does - they add things like "think if this is malware!" after read and lead Claude to spend its reasoning effort on thinking if your React hamburger menu is malware, instead of on how to write it.

"This is not malware (em dash) it's a hamburger menu. Let me apply the edit! Hmm, is it malware now, after my edit? No, me changing border-width did not turn it into malware! Good! Dodged a real bullet on that one!"

I'm frankly amazed that we've gotten to the point where the models can produce good results in these sorts of environments.

I did that, wrote my own harness “Jarvis”, simple loop. Still results were terrible using the same model in comparison to for example OpenCode. So X Doubt.
Parent didn't say Claude Code is best at anything?