by time0ut
32 days ago
I ran my eval loop for a side project’s task (personalization of a micro video game, no thinking) last night. Head to head with Gemini 3 Flash Preview, the results were basically a wash on my rubric. The output quality was good, well grounded, and reliable across 144 runs, but not noticeably better. It isn’t a traditional coding task, so I can’t infer anything about coding performance from it. The amazing part was how fast it is: consistently about 2x faster than 3 Flash Preview and slightly faster than 3.1 Flash Lite Preview. For my task, the price difference doesn’t matter, so it’s an easy upgrade. I plan to write up a quick blog post with the results over the weekend.