by iLoveOncall
49 days ago
This is just a hallucination benchmark on a subset of outputs; I'm not sure it adds any value over general hallucination benchmarks.

> Our goal is to be the best general model for deterministic tasks

I'm sorry, but this simply doesn't make sense. If you want deterministic output, don't use an LLM.
I am hopeful that deterministic output will return, though: DeepSeek v4 claims to have implemented "bitwise batch-invariant and deterministic kernels," but I haven't tested it myself.
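
For context on why "bitwise" determinism is hard at all: floating-point addition is not associative, so the order in which a reduction adds its terms can change the last bits of the result. The toy sketch below only illustrates that general point in plain Python; it is not DeepSeek's code, and the helper name `reduce_in_order` is made up for this example. My assumption is that this is the kind of effect a batch-invariant kernel has to pin down by fixing the reduction order.

```python
from functools import reduce
import operator


def reduce_in_order(values):
    """Add values strictly left to right with plain float addition.

    functools.reduce is used instead of the built-in sum(), because sum()
    may apply compensated summation to floats on newer Python versions,
    which would hide the effect shown here.
    """
    return reduce(operator.add, values, 0.0)


values = [0.1, 0.2, 0.3]

# Same numbers, two different reduction orders.
forward = reduce_in_order(values)         # (0.1 + 0.2) + 0.3
backward = reduce_in_order(values[::-1])  # (0.3 + 0.2) + 0.1

print(forward == backward)            # False: the results differ in the last bit
print(forward.hex(), backward.hex())  # exact bit patterns of both results

# With the order fixed, repeating the reduction gives identical bits.
print(reduce_in_order(values).hex() == forward.hex())  # True
```

If a kernel picks its reduction order based on batch size or on how the work happens to be split up, the same input could come back with slightly different bits from run to run, which would be enough to eventually flip a sampled token.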