| HN Mirror

	by dazzaji 954 days ago
	Does anybody know whether the 2008-2009 SAT is in the training set for these models? If it is, I’d be especially interested in head-to-head evals on this type of non-code benchmark using problem sets that aren’t already in the training data, to see how the models perform on fresh problems.