by leedrake5 974 days ago
Downloading this now to test it out. If anyone has prompts or tests they’d like to see, let me know. I plan to run it with llama.cpp.

Hey, author here.

You can check out our evaluation library (https://github.com/wellecks/lm-evaluation-harness) for the exact benchmark implementations we used, including prompting.

In particular, the prompt that starts at line 27 in this file (https://github.com/wellecks/lm-evaluation-harness/blob/maste...) is quite good for high school/olympiad problems. We took this prompt from Google's Minerva paper.

Thanks! First, thank you for all the work to develop a math LLM. Those prompts are helpful. I converted the weights from f32 down to f16 so they fit on my system, using the procedure at ggerganov/llama.cpp, then tried some of those prompts. From line 34:

> Problem:
> If $\det \mathbf{A} = 2$ and $\det \mathbf{B} = 12,$ then find $\det (\mathbf{A \mathbf{B}).$

> Solution:
> We know that for a matrix \mathbf{M}, the determinant of its inverse is given by $\frac{1}{\det \mathbf{M}}.$ We also know that the determinant of a product of matrices is equal to the product of their respective determinants. Therefore, we have: \begin{align} \det (\mathbf{A}^{-1}\mathbf{B}) &= \frac{\det(\mathbf{AB})}{\det\mathbf{A}}\\ &= \frac{\det\mathbf{A}}{\det\mathbf{B}} \end{align}
> Therefore, we have:
> $\det(\mathbf{AB}) = 24

The f32 answer:

> We have that $\det (\mathbf{A} \mathbf{B}) = (\det \mathbf{A})(\det \mathbf{B}) = (2)(12) = \boxed{24}.$
> Final Answer: The final answer is $24$. I hope it is correct.
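For anyone who wants to sanity-check the rule the f32 answer relies on, $\det(\mathbf{A}\mathbf{B}) = (\det \mathbf{A})(\det \mathbf{B})$, here is a minimal Python sketch. The matrices `A` and `B` are hypothetical examples picked only so that their determinants are 2 and 12; they do not come from the benchmark.

```python
def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def det2(m):
    """Determinant of a 2x2 matrix given as a nested list."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Hypothetical matrices, chosen so that det A = 2 and det B = 12.
A = [[1, 1], [0, 2]]
B = [[3, 0], [1, 4]]

AB = matmul2(A, B)
print(det2(A), det2(B), det2(AB))
```

With integer entries the arithmetic is exact, so the product of the two determinants and the determinant of the product can be compared directly.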

The final answer is the same, which is encouraging: quantization could expand the hardware options.
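To give a rough feel for how much precision an f32 to f16 cast gives up, here is a small sketch using only Python's standard `struct` module. It is not the llama.cpp conversion procedure, just an illustration of the rounding involved; the values in `weights` are made up for the example.

```python
import struct


def to_f32(x):
    """Round a Python float (f64) to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f16(x):
    """Round a Python float (f64) to IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Made-up example values standing in for model weights.
weights = [0.1, -0.337, 1.0, 0.00042]

rel_errors = []
for w in weights:
    w32 = to_f32(w)    # the value as stored in an f32 checkpoint
    w16 = to_f16(w32)  # the same value after casting down to f16
    rel_errors.append(abs(w16 - w32) / abs(w32))
    print(f"f32={w32!r} f16={w16!r} rel_err={rel_errors[-1]:.2e}")

max_rel_err = max(rel_errors)
```

For values in the normal f16 range, the relative rounding error of a single cast is bounded by 2^-11 (about 4.9e-4). Whether errors of that size change a model's final answer depends on the model and the prompt, so a matching answer on one problem is encouraging rather than conclusive.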

> > Problem:
> > If $\det \mathbf{A} = 2$ and $\det \mathbf{B} = 12,$ then find $\det (\mathbf{A \mathbf{B}).$

Is that supposed to be missing a "}" after the last "A"?

Possible. I pasted the output but tried to add some paragraph spacing to it, so it's possible I accidentally deleted a character.

Can you please ask it whether algebraic numbers are closed under addition? ChatGPT-4 just told me no; then I asked it again and it said yes, and then it apologized profusely for saying no.
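For reference, the algebraic numbers form a field, so they are closed under addition. A concrete instance: with $x = \sqrt{2} + \sqrt{3}$ we get $x^2 = 5 + 2\sqrt{6}$, so $(x^2 - 5)^2 = 24$, i.e. $x^4 - 10x^2 + 1 = 0$. The Python sketch below is only a floating-point sanity check of that one example, not a proof of the general statement; the function name `p` is just a label chosen for this illustration.

```python
import math


def p(x):
    """Integer-coefficient polynomial x^4 - 10x^2 + 1."""
    return x ** 4 - 10 * x ** 2 + 1


# Sum of two algebraic numbers: sqrt(2) is a root of x^2 - 2, sqrt(3) of x^2 - 3.
s = math.sqrt(2) + math.sqrt(3)

residual = p(s)
print(f"p(sqrt(2) + sqrt(3)) = {residual:.3e}")
```

The residual is not exactly zero because of floating-point rounding, but it should be tiny compared with the individual terms, which are close to 100.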