by x1000
454 days ago
If they had experimented using a newer model (Gemma 3, deepseek-1 7b, etc.) and reported better results, would that be because their newer baseline model was better than the Llama 2 model used in the previous methods' experiments? A more comprehensive study would include results for as many baseline models as possible. But there are likely other researchers in the lab all waiting to use those expensive GPUs for their experiments as well.

This research was almost certainly done well before Gemma 3 and DeepSeek were released.