| HN Mirror

	by YeGoblynQueenne 3297 days ago
	OK, but this still has the same problem as BLEU: it relies on comparisons to human scores, which are entirely subjective. I'm not saying they're not the best we've got, but it's a big problem for machine translation that the only way to evaluate results is, essentially, to compare them to eyeballing.