| HN Mirror

	by syllogism 1006 days ago
	The gold standard they're comparing against was done by humans, though. And a task-specific model trained on that data will be better at that task than GPT-4. What's definitely true is that getting decent data often takes some care, especially in how you define the task. And Mechanical Turk is often especially tricky to use well.