by thot_experiment
28 days ago
Tried this extensively in OpenCode and haven't used it once since Gemma 4 came out. It got into thought loops and made stupid edits I didn't ask for more often than the local 31b model does. One of the worst "frontier" models I've ever tried.
|
Training on ~1B tokens on 8xB300: the first checkpoint, halfway in, learned really well. Tencent might be struggling with agentic work, but the base knowledge is there.