by DeathArrow 594 days ago
>My best example is test cases. I can write a method in 3 minutes, but Sonnet will write the 8 best test cases in 4 seconds

For me it doesn't work. The generated tests either don't run at all, or they run and fail.

I work in large C# codebases and in each file I have lots of injected dependencies. I have one public method which can call lots of private methods in the same class.

The AI either doesn't properly mock the dependencies or ignores what happens in the private methods.

If I spend a lot of time guiding it where to look, it can generate unit tests that pass. But that takes longer than writing the unit tests myself.
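To make the scenario concrete, here is a minimal sketch of the kind of test being described, written in Python rather than C# to keep it short. Everything in it is hypothetical: the `OrderService` class, its two injected dependencies, and the private helper `_apply_discount` are illustrative names, not code from any real project. It only shows the two things the generated tests are said to get wrong: the mocks stand in for the injected dependencies, and the expected value accounts for what the private method does.

```python
from unittest.mock import Mock


class OrderService:
    """Hypothetical class with two injected dependencies."""

    def __init__(self, price_repo, notifier):
        self._price_repo = price_repo
        self._notifier = notifier

    def place_order(self, item_id, quantity):
        # Public entry point: delegates part of the work to a private helper.
        unit_price = self._price_repo.get_price(item_id)
        total = self._apply_discount(unit_price * quantity, quantity)
        self._notifier.send(f"order placed: {item_id} x{quantity} = {total}")
        return total

    def _apply_discount(self, amount, quantity):
        # Private logic the test has to account for: 10% off for 10 or more.
        if quantity >= 10:
            return amount * 9 // 10
        return amount


def test_place_order_applies_bulk_discount():
    # Both injected dependencies are replaced by mocks.
    price_repo = Mock()
    price_repo.get_price.return_value = 200  # price in cents
    notifier = Mock()

    service = OrderService(price_repo, notifier)
    total = service.place_order("widget", 10)

    # 200 * 10 = 2000, minus the 10% applied inside the private helper.
    assert total == 1800
    price_repo.get_price.assert_called_once_with("widget")
    notifier.send.assert_called_once_with("order placed: widget x10 = 1800")
```

The private method is never called directly; it is exercised only through `place_order`, which is why a test generator that has not read `_apply_discount` would likely assert the wrong total.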


For me it's the same. It's usually just some hallucinated garbage. All of these LLMs don't have the full picture of my project.

When I can give them isolated tasks like "convert X to Y" or "create a foo that does bar", they're excellent. But for unit testing? I'm not even going to try anymore. I can write 5 working unit tests manually in the time it takes me to write 5 prompts that give me useless stuff I then have to redo manually.

Why can't we have an LLM cache for a project, just like I have a build cache? Analyze one particular commit on the main branch very expensively, then only calculate the differences from that point. Pretty much like git works, just for your model.
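As a rough sketch of that idea, assuming the expensive analysis can be done per file and keyed by a hash of the file's contents, something like the following would re-analyze only what changed. The `analyze` callback is a hypothetical stand-in for whatever the costly step would be (for example an LLM summarization call); no real model or tool API is used here.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a file's contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def refresh_cache(files, cache, analyze):
    """Bring `cache` up to date with `files`, re-analyzing only what changed.

    files:   dict mapping path -> current file contents
    cache:   dict mapping path -> (content hash, analysis result)
    analyze: the expensive step (hypothetical stand-in)
    Returns the list of paths that had to be re-analyzed.
    """
    recomputed = []
    for path, text in files.items():
        digest = content_hash(text)
        cached = cache.get(path)
        if cached is None or cached[0] != digest:
            cache[path] = (digest, analyze(text))
            recomputed.append(path)
    # Drop entries for files that no longer exist.
    for path in list(cache):
        if path not in files:
            del cache[path]
    return recomputed


# Usage with a fake "analysis" that just counts lines.
calls = []


def fake_analyze(text):
    calls.append(text)
    return len(text.splitlines())


cache = {}
snapshot = {"a.py": "x = 1\n", "b.py": "y = 2\nz = 3\n"}
first = refresh_cache(snapshot, cache, fake_analyze)   # analyzes both files

snapshot["b.py"] = "y = 2\nz = 4\n"
second = refresh_cache(snapshot, cache, fake_analyze)  # analyzes only b.py
```

This deliberately ignores the hard part: a change in one file can invalidate what was concluded about another file that depends on it, so a per-file cache like this is only a starting point.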

> It's usually just some hallucinated garbage. All of these LLMs don't have the full picture of my project.

Cursor can have the whole project in the context, or you can specify the files that you want.

> Cursor can have the whole project in the context

Depends on the size of the project. You can't shove all of Google's monorepo into an LLM's context (yet).

I’m looking at 150000 lines of Swift divided over some local packages and the main app, excluding external dependencies.

Do you have 150000 lines of Swift in YOUR context window?

I know how to find the context I need, being aided by the IDE and compiler. So yes, my context window contains all of the code in my project, even if it's not instantaneous.

It's not that hard to have an idea of what code is defined where in a project, since compilers have been doing that for over half a century. If I'm injecting protocols and mocks into a unit test, it shouldn't be really hard for a computer to figure out their definitions, unless they don't exist yet and I was not clear they should have been created, which would mean that I'm giving the AI the wrong prompt and the error is on my side.
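To illustrate how cheap the "what is defined where" part can be, here is a small sketch that uses Python's standard `ast` module to map top-level class and function names to the file and line that define them. The file names and sources are made up for the example, and this is Python, not Swift; for a Swift project the compiler's own index or a language server would play this role.

```python
import ast


def index_definitions(sources):
    """Map each top-level class/function name to (file path, line number).

    sources: dict mapping file path -> source text
    """
    index = {}
    for path, text in sources.items():
        tree = ast.parse(text)
        for node in tree.body:
            if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                index[node.name] = (path, node.lineno)
    return index


# Hypothetical project contents, inlined so the example is self-contained.
sources = {
    "repo.py": "class PriceRepo:\n    def get_price(self, item_id):\n        return 0\n",
    "notify.py": "def send(message):\n    pass\n",
}
index = index_definitions(sources)
```

With an index like this, a tool that sees `PriceRepo` injected into a test could look up `index["PriceRepo"]` and pull in just that definition, instead of needing the whole project in context.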

> Why can't we have an LLM cache for a project, just like I have a build cache? Analyze one particular commit on the main branch very expensively

It's not just very expensive - it's prohibitively expensive, I think.

With Cursor you can specify which files it reads before starting. I usually have to attach one or two to get an ideal one-shot result.

But yeah, I use it for unit testing, not integration testing.

Ask Cursor to write usage and mocking documentation for the most important injected dependencies, then include that documentation in your context. I’ve got a large tree of such documentation in my docs folder specifically for guiding AI. Cursor’s Notebook feature can bundle together contexts.

I use Cursor to work on a Rust Qt app that uses the main branch of cxx-qt so it’s definitely not in the training data, but Claude figures out how to write correct Rust code based on the included documentation no problem, including the dependency injection I do through QmlEngine.

Sounds interesting, what are you working on?

(Fellow Qt developer)

Same thing: https://news.ycombinator.com/item?id=40740017 :)

Just saw you published your block editor blog post. Look forward to reading it!

Haha, hi again!

Awesome! Would love to hear your thoughts. Any progress on your AI client? I'm intrigued by how many bindings to Qt there are. Recently, I got excited about a Mojo binding[1].

[1] https://github.com/rectalogic/mojo-qt