
by rustystump 82 days ago

I won't touch on how profoundly I disagree with everything you said on reasoning (you clearly already have it figured out), but a fun test I have done with most of the big models is to give one some text input, maybe a short story, and have it rate it. That is, the prompt is: rate this from 1-10.

Gemini and GPT will almost always give very similar scores for everything. As long as the grammar isn't off, you cannot get below a 7.

xAI, on the other hand, will rarely give anything above a 7.

Now when you prompt with "rate 1-10 with 5 being average", all of a sudden the scores from OpenAI and Gemini drop, while xAI's remain roughly the same.

All of them will eventually give you a 10 if you keep making tiny edits “fixing” whatever they complain about.

Humans do not do this. Or more specifically, the humans in my experience do not.
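For anyone who wants to repeat the comparison, here is a minimal sketch of it in Python. Everything in it is an assumption made for illustration: `ask_model` is a hypothetical callable that you supply yourself (it takes a prompt string and returns the model's reply as text), the prompt wording is paraphrased from the description above, and the score parsing just grabs the first whole number from 1 to 10 in the reply.

```python
import re
from statistics import mean
from typing import Callable, Dict, Iterable, Optional

# Two prompt variants: a bare scale, and one that pins 5 as "average".
PLAIN_PROMPT = "Rate this from 1-10.\n\n{text}"
ANCHORED_PROMPT = "Rate this from 1-10, with 5 being average.\n\n{text}"


def parse_score(reply: str) -> Optional[int]:
    """Return the first standalone integer from 1 to 10 in the reply, or None."""
    match = re.search(r"\b(10|[1-9])\b", reply)
    return int(match.group(1)) if match else None


def mean_scores(
    ask_model: Callable[[str], str], texts: Iterable[str]
) -> Dict[str, Optional[float]]:
    """Average the parsed scores for each prompt variant.

    `ask_model` is a hypothetical stand-in for whatever client you use;
    it is not part of any real library.
    """
    texts = list(texts)
    results: Dict[str, Optional[float]] = {}
    for name, template in (("plain", PLAIN_PROMPT), ("anchored", ANCHORED_PROMPT)):
        scores = [parse_score(ask_model(template.format(text=t))) for t in texts]
        # Drop replies where no score could be parsed.
        valid = [s for s in scores if s is not None]
        results[name] = mean(valid) if valid else None
    return results
```

Running `mean_scores` once per model over the same set of stories would give a rough view of whether the "5 being average" anchor shifts the scores, though a handful of samples will not say much on its own.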


gjm11 80 days ago

Clearly a bunch of other people also disagree profoundly with everything I said, since my comment is currently sitting at 0 having at one point been higher.

I vigorously encourage anyone who thinks something I wrote is bad to downvote it as they see fit, but it would be nice if some of those people would tell me what about my comment they found so objectionable. (It all seems pretty well reasoned to me -- but it would, wouldn't it?)

[EDITED to fix an inconsequential typo]


rustystump 79 days ago

I never downvote. It isn't worth dwelling on. People who downvote are usually the types who will not have a constructive discussion with you anyway.

To be specific, one issue is that it is a wall of text; the other is that I disagree with the majority of what you said but sadly don't have the time to outline all of it. At the end of the day my disagreement is only an opinion, so worth little.


gjm11 79 days ago

For the avoidance of doubt, I wasn't meaning to imply that you downvoted me. (Nor do I mind if you did.) I don't think it's true that people who downvote things are never able to have a constructive discussion, but there's probably some correlation there.

Anyway, thanks for giving some indication of what you didn't like.
