by fishmaster 1776 days ago
There's the HANS dataset (Heuristic Analysis for NLI Systems), which is used to check whether NLI models have picked up surface heuristics instead of actually doing inference: https://huggingface.co/datasets/hans https://arxiv.org/abs/1902.01007
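To make "surface heuristic" concrete, here is a rough sketch of one such shortcut, lexical overlap: predict entailment whenever every hypothesis word also appears in the premise. The function names and the two sentence pairs are made up for illustration (written in the style of this kind of evaluation, not loaded from the dataset), and this is not the paper's evaluation code.

```python
import re

def tokens(sentence):
    # Lowercase and keep only word-like chunks; a deliberately crude tokenizer.
    return re.findall(r"[a-z']+", sentence.lower())

def lexical_overlap_heuristic(premise, hypothesis):
    # Hypothetical shortcut: say "entailment" if every hypothesis word
    # occurs somewhere in the premise, ignoring word order and syntax.
    premise_words = set(tokens(premise))
    return all(word in premise_words for word in tokens(hypothesis))

# Illustrative pairs (assumed, not taken from the dataset files):
# (premise, hypothesis, gold label where True means entailment)
examples = [
    ("The doctor was paid by the actor.", "The doctor paid the actor.", False),
    ("The actor and the doctor slept.", "The doctor slept.", True),
]

def heuristic_accuracy(pairs):
    # Fraction of pairs where the shortcut agrees with the gold label.
    correct = sum(lexical_overlap_heuristic(p, h) == gold for p, h, gold in pairs)
    return correct / len(pairs)

if __name__ == "__main__":
    for premise, hypothesis, gold in examples:
        guess = lexical_overlap_heuristic(premise, hypothesis)
        print(f"{premise!r} -> {hypothesis!r}: heuristic={guess}, gold={gold}")
    print("accuracy:", heuristic_accuracy(examples))
```

The shortcut gets the second pair right and the first one wrong, since the passive sentence reuses the same words with the roles swapped. A model that has learned this kind of shortcut would be expected to fail on pairs like the first one.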