|
|
|
|
|
by gkk
865 days ago
|
|
Hi hansonkd, I'm working on Hotseat - a legal Q&A service where we put regulations in a hot seat and let people ask sophisticated questions. My experience aligns with your comment that vanilla GPT often performs poorly when answering questions about documents. However, if you combine focused effort on squeezing GPT's performance with product design, you can go pretty far. I wonder if you have written about specific failure modes you've seen in answering qs from documents? I'd love to check whether Hotseat is handling them well. If you'r curious, I've written about some of the design choices we've made on our way to creating a compelling product experience: https://gkk.dev/posts/the-anatomy-of-hotseats-ai/ |
|
Specific failure modes can be something as simple as extraction of beneficiary information from a Trust document. Sometimes it works, but a lot of times it doesn't even with startups with AI products specific to extracting information from documents. For example it will have an incomplete list of beneficiaries, or if there are contingent beneficiaries, it won't know what to do. Not even a hard question about the contingency. Just making a simple list with percentages of if no-one dies what is the distribution.
Further trying to get an AI to describe the contingency is a crap shoot.
While I expect these options to get better and better, I have fun trying them out and seeing what basic thing will break. :)