by thomasahle
1038 days ago
Exactly. This is always a pitfall when benchmarking LLM-based techniques. The enwik8 dataset they use, for example, is for sure in the training data. To know how the method performs on novel data, the authors would have to come up with entirely new datasets, since anything that already exists has to be assumed contaminated.