| HN Mirror



	by nihit-desai 1100 days ago
	I mean, sure. For ground truth, we are using the labels that are part of the original datasets:

	* https://huggingface.co/datasets/banking77
	* https://huggingface.co/datasets/lex_glue/viewer/ledgar/train
	* https://huggingface.co/datasets/squad_v2
	* ...

	(The exhaustive set of links is at the end of the report.) Is there some noise in these labels? Sure! But the relative performance with respect to these labels is still a valid evaluation.

1 comment

Agreed, thanks for highlighting these links!