| HN Mirror


by gavinray 54 days ago

It's really cool to see that other people run into the same issues and arrive at the same conclusions/solution.

At $DAYJOB, we have an LLM-based tool, and this issue of "how do we avoid burning tokens solving the same problems over again" was an early obstacle.

We wound up building a very similar thing to what you call "tools" (we named them "Saved Programs").

There's a wiki the LLM searches before solving a problem, that links saved programs for past actions to their content entry.

If it finds one, it'll re-use it; otherwise it'll generate a program and offer to save it if you think it'll be common enough.
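A rough sketch of that search-then-reuse-or-save flow, for illustration only. Every name here (`SavedProgram`, `ProgramWiki`, `solve`, `generate`, `confirm_save`) is hypothetical, and the keyword-overlap search is just a stand-in for whatever lookup the real wiki does:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SavedProgram:
    name: str
    description: str  # hypothetical wiki entry text that gets searched
    run: Callable[[str], str]


@dataclass
class ProgramWiki:
    programs: list = field(default_factory=list)

    def search(self, task: str) -> Optional[SavedProgram]:
        # Naive keyword overlap: any shared word counts as a match.
        # A real system would need something far less trigger-happy.
        words = set(task.lower().split())
        best, best_score = None, 0
        for prog in self.programs:
            score = len(words & set(prog.description.lower().split()))
            if score > best_score:
                best, best_score = prog, score
        return best

    def save(self, program: SavedProgram) -> None:
        self.programs.append(program)


def solve(task, wiki, generate, confirm_save):
    """Reuse a saved program if the wiki has one, else generate and offer to save."""
    existing = wiki.search(task)
    if existing is not None:
        return existing.run(task), "reused"
    program = generate(task)      # assumed to be the expensive LLM call
    result = program.run(task)
    if confirm_save(program):     # the user decides if it's common enough
        wiki.save(program)
    return result, "generated"
```

The token savings in this sketch come entirely from skipping `generate` on a hit, so the quality of `search` decides whether a reused program actually fits the new task.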

3 comments

afshinmeh 53 days ago

> how do we avoid burning tokens solving the same problems over again

Letting the LLM write half baked tools is the recipe for burning more tokens.

> There's a wiki the LLM searches before solving a problem, that links saved programs for past actions to their content entry.

What's the criteria for marking an LLM written tool as useful/correct before publishing it?


gavinray 53 days ago

> Letting the LLM write half baked tools is the recipe for burning more tokens.

It sure is, if the tools are half-baked and your user scale is N=1 rather than N=100 or N=1,000.

> What's the criteria for marking an LLM written tool as useful/correct before publishing it?

It solves the problem the originating user asked it to


afshinmeh 53 days ago

> It solves the problem the originating user asked it to

Interesting. And is there a mechanism to go back and "fix" the tools after they are published? What happens if the tool decided to use the "id" attribute to click on buttons and now you have a new website that follows a different pattern to find the right target?

I agree that the "correctness" of a tool can mean different things depending on the context of the problem, though (e.g. would you consider an OOM a correctness bug even if the tool addresses the user's ask?).


sifar 53 days ago

The problem here is that N different users will ask for N different variants of the same tool, so each time you'll find an existing tool that's similar but not quite right. Is that tool updated to support the new functionality, or is a new tool created, so that you end up with N variants of a tool?


Edmond 53 days ago

it's called workflow automation: https://blog.codesolvent.com/2025/12/workflow-automation-let...

Everyone is just taking a roundabout way to get there. The workflow/program as "tools" approach is the right one. Agent skills are more or less in that same direction.


dominotw 53 days ago

there are hundreds or thousands of 'memory' things ppl have been inventing. i have yet to see any proof that these are actually useful or have saved any tokens.
