| HN Mirror



	by x1000 454 days ago
	If they had experimented using a newer model (gemma 3, deepseek-1 7b, etc.) and reported better results, would that be because their newer baseline model was better than the llama 2 model used in the previous methods' experiments? A more comprehensive study would include results for as many baseline models as possible. But there are likely other researchers in the lab all waiting to use those expensive GPUs for their experiments as well.

1 comment

josephg 453 days ago

Sure. But papers take a really long time to write and go through peer review. I think my paper on collaborative editing took about 4 months from the point where we were done writing to the point at which it appeared on arXiv.

This research was almost certainly done well before Gemma 3 and Deepseek were released.
