by vlade11115
326 days ago
I love the site design.

> There's an obvious question looming here — if the models got so confused, how did they consistently pass the reconciliation checks we described above? It may seem like the ability to make forward progress is a good proxy for task understanding and skill, but this isn't necessarily the case. There are ways to hack the validation check – inventing false transactions or pulling in unrelated ones to make the numbers add up.

This is hilarious. I wonder if someone is unintentionally committing fraud by blindly trusting LLMs with accounting.
Or even worse, I bet that some governments are already trying to use LLMs to make accounting validators. My government sure wants to shove LLMs into digital government services.