by antiloper
243 days ago
> For example, in the prompt for this experiment, the model is bootstrapped with the correct Form 1040 lines and short instructions as part of its context. Given that only short instructions are in context, I would not have expected even a frontier model to score well on this benchmark. For better results, I'd think that giving the model access to the entire tax code is required (which likely requires RAG due to its sheer size).

That all being said, we agree, and that is what we've built with our internal tax coding agent, Iris: https://www.columntax.com/blog/introducing-iris-our-ai-tax-d... (it can pull in just the right tax form context on a per-line basis to turn the tax law into code).