by astrange
791 days ago
> demonstrate that the LLM will deny that they know fact X (or be flaky about it, randomly denying and divulging the fact)

No, the sampling algorithm you used to query the LLM does that. Not the model itself. e.g. https://arxiv.org/pdf/2306.03341.pdf

> B. They know they know fact X, but avoid acknowledging for ... reasons?

That reason being that the sampling algorithm didn't successfully sample the answer.
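
To make the distinction concrete, here is a toy sketch rather than anything from the linked paper. It assumes a fixed "model" that is nothing more than a table of logits over three candidate answers; the answer labels, the logit values, and the helper names (`softmax`, `greedy`, `sample`, `tally`) are all made up for illustration. The distribution never changes between queries, yet temperature sampling still yields a mix of "states" and "denies" answers, while greedy decoding returns the same answer every time.

```python
import math
import random

# Hypothetical logits from a fixed model over three candidate answers.
# The labels and numbers are invented for illustration only.
LOGITS = {
    "states fact X": 2.0,
    "denies knowing X": 1.0,
    "changes the subject": 0.0,
}


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities at a given temperature (> 0)."""
    scaled = {k: v / temperature for k, v in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - peak) for k, v in scaled.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


def greedy(logits):
    """Deterministic decoding: always pick the highest-logit answer."""
    return max(logits, key=logits.get)


def sample(logits, temperature, rng):
    """Stochastic decoding: draw one answer from the softmax distribution."""
    probs = softmax(logits, temperature)
    answers = list(probs)
    weights = [probs[a] for a in answers]
    return rng.choices(answers, weights=weights, k=1)[0]


def tally(n=1000, temperature=1.0, seed=0):
    """Query the same fixed distribution n times and count each answer."""
    rng = random.Random(seed)
    counts = {k: 0 for k in LOGITS}
    for _ in range(n):
        counts[sample(LOGITS, temperature, rng)] += 1
    return counts


if __name__ == "__main__":
    print("greedy :", greedy(LOGITS))
    print("sampled:", tally())
```

In this sketch the "flakiness" lives entirely in `sample`: the logits assign most of the probability mass to the fact, but a nonzero share of draws lands on the denial anyway. Whether a real LLM behaves this simply is a separate question; the point is only that inconsistent outputs do not by themselves show the model's internal state is inconsistent.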