by lanstin 501 days ago
Yeah, they aren't that good at spotting wrong questions (tho better than a year ago). Claude is especially likely to do this correctly. GPT o-whatever will push back, but wrongly. Something in Gemini is positioning it as an all-knowing expert rather than a tool for exploring new true statements. It ends almost everything with "is there anything else I can explain about the distribution of algebraic numbers with a given height".

Actually, I just tried one of my test questions on GPT o3-mini-high and it got it. Very nice. Back in the lead over Claude. (My last check was with o1. Although if they used the whole chat history to train or fine-tune, I did give the answer there and in the end forced o1 to accept it. Lot of arm-twisting language tho.)