|
|
|
|
|
by pona-a
488 days ago
|
|
I tested it on basic long addition problems. It frequently misplaced the decimal signs, used unnecessary reasoning tokens (like restating previously done steps) and overall seemed only marginally more reliable than the base DeepSeek 1.5B. On my own pet eval, writing a fast Fibonacci algorithm in Scheme, it actually performed much worse. It took a much longer tangent before arriving at fast doubling algorithm, but then completely forgot how to even write S-expressions, proceeding to instead imagine Scheme uses a Python-like syntax while babbling about tail recursion. |
|
This model was trained on math problems datasets only, it seems. It makes sense that it's not any better at programming.