by Der_Einzige
6 days ago
This exact mentality is cancer for peer review and the industry. We all know who you are if you are using 1000+ TPUs, and yes, you do get a "buff" to your peer review scores because people know where you work. Fuck your scaling curves. More research labs need to #yolo and try stuff that doesn't have proven scaling behavior yet. State space models have taken forever to proliferate despite being objectively good, because only the god dang Chinese understand that you actually need to #yolo sometimes, like making some of your layers state space layers in Hunyuan-T1.