| HN Mirror



	by Rudybega 247 days ago
	I wonder if you could dramatically improve these results with some relatively simple scaffolding and tool access. If a ton of these mistakes are genuinely simple calculation errors, it seems like giving the models access to a calculator tool would help a fair bit.
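To make the calculator-tool idea concrete, here is a minimal sketch of what such a tool could look like on the scaffolding side. The function name `calculate` and the choice of exact decimal arithmetic are assumptions for illustration, not anything described in the thread; a real agent framework would wrap something like this in its own tool-calling interface.

```python
import ast
import operator
from decimal import Decimal

# Operators the hypothetical calculator tool accepts; everything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculate(expression: str) -> Decimal:
    """Evaluate a plain arithmetic expression with Decimal, refusing anything else."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            # Go through str() so a literal like 0.22 becomes Decimal("0.22").
            return Decimal(str(node.value))
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))


if __name__ == "__main__":
    # Made-up figures, purely to show the call shape.
    print(calculate("48000 * 0.22 + 1500"))
```

The model would emit the expression as a string and get the exact result back instead of doing the arithmetic itself. Note that this only addresses the arithmetic step; it does nothing to check whether the expression is the right one to evaluate.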

3 comments

michaelrbock 233 days ago

We agree; that's the thesis behind our tax development coding agent: https://www.columntax.com/blog/introducing-iris-our-ai-tax-d...


Lionga 247 days ago

The problem is that they do not understand what to calculate or how to calculate it, not the actual act of adding or multiplying. I tried asking ChatGPT to calculate some taxes for three countries, in two of which I already file taxes. For those two, ChatGPT gave wildly wrong numbers (not even in the right ballpark), so I knew I could not trust the numbers for the third, which was the one I was mostly interested in.


sails 247 days ago

I feel like we are already there. I would imagine that if you set Claude Code or Codex this task, running in the CLI, you would see a huge improvement, and that is before you start creating task-specific guardrails.

I’m surprised they haven’t tried this. I’m running my own setup this way, in parallel with my accountant, to compare the results.
