by skygazer
83 days ago
Out of curiosity, if you asked for the same text extraction multiple times, each in a fresh context, would it likely fabricate different quotes each time? If so, a) might that be a procedure we could train humans to use to better understand LLM unreliability, and b) could we instrumentalize the behavior to measure answer overlap with non-LLM statistical tools? Also, quote-presence testing/linking against the source would seem to be a trivial layer to build on a chat interface, no LLM required: just highlight and link the longest common substrings.
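To make that concrete, here is a rough sketch of what I have in mind, using only Python's standard library. The function names (`longest_shared_span`, `quote_coverage`, `jaccard`) are made up for illustration, and it assumes exact character matching with no whitespace or case normalization, which a real tool would probably want.

```python
from difflib import SequenceMatcher


def longest_shared_span(quote: str, source: str) -> tuple[int, int]:
    """Return (start offset in source, length) of the longest substring
    that the quote and the source have in common."""
    # autojunk=False disables difflib's "popular character" heuristic,
    # which can otherwise skip frequent characters in long sources.
    matcher = SequenceMatcher(None, quote, source, autojunk=False)
    match = matcher.find_longest_match(0, len(quote), 0, len(source))
    return match.b, match.size


def quote_coverage(quote: str, source: str) -> float:
    """Fraction of the quote covered by its longest verbatim match.
    1.0 means the whole quote appears in the source as written."""
    if not quote:
        return 0.0
    _, size = longest_shared_span(quote, source)
    return size / len(quote)


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between the sets of quotes returned by two separate runs."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    source = "The quick brown fox jumps over the lazy dog."
    for quote in ("quick brown fox", "quick red fox"):
        start, size = longest_shared_span(quote, source)
        print(quote, "->", quote_coverage(quote, source), source[start:start + size])
```

The offset and length from `longest_shared_span` are what a chat UI would need to highlight and link the matched text, and `jaccard` over the quote sets from repeated fresh-context runs gives a crude, LLM-free overlap number.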