| HN Mirror



	by lionkor 756 days ago
	I found that, compared to GPT-3.5, it refuses to shut up when told to shut up. In the middle of a conversation, try going "SHUT UP, STOP TALKING ALREADY". For me, it just keeps repeating the last output. Very cool.

2 comments

mlyle 756 days ago

Yes: GPT-4 turbo could receive a meaningful correction and generally change its answer in that direction. GPT-4o is very, very resistant to doing this and will tend to parrot the previous answer, even after admitting it was in error.

I routinely fix this by toggling from GPT-4o to GPT-4t.


emsign 756 days ago

They still haven't fixed that? lmao

Having to constantly correct incorrect answers from LLMs, only for them to apologize and give another incorrect answer, is what made me completely lose interest in using them.

I figured if I'm knowledgeable enough to correct LLMs, it's more efficient to not use them at all. What's the point, really? Am I teaching them? Because I felt like a teacher quizzing a student who keeps on guessing but failing.


mlyle 755 days ago

I don't rely on LLMs for correct answers to general problems. I use them for their text formatting and rewriting superpowers. GPT-4t does a good job when I need it to iterate and slightly change what it does.

For example, to inform the University of California about the content of my courses, I have to go through a course articulation which is several pages long, is written in a formal academic voice, and is pretty time-consuming to create. GPT-4t can take my informal course outline and an example of a past articulation that I've written and do the job to a point where I just need to ask it to make small changes for 10 minutes and then make a last couple of edits myself. I turn a couple of hours into 10 minutes and 25 cents of API calls.

(Also, sometimes when it's explaining example assignments, it thinks of nice things to include that I hadn't planned on, and I end up shamelessly using them; other times it thinks of garbage and I have to coax it to articulate what I actually meant.)

I'd say GPT-4o is slightly better at the task... except it commits so strongly to its answers in the context buffer that it doesn't do effective rewrites/corrections. So I've settled into a workflow of using GPT-4o to do the initial work and then using GPT-4t for the final cleanup.


dailykoder 756 days ago

It feels like the answers are getting longer and longer too, even for the most basic questions, which could be answered in 2 sentences. Does it have ADHD? Who wants to read all these walls of text?


gmerc 756 days ago

Well, it’s paid by the token


emsign 756 days ago

Oh. Oooh! Yes. And "I don't know." isn't a lot of tokens. So where's the incentive there, lol.


dailykoder 756 days ago

But even the GPT-3.5 answers were getting longer and longer. I don't know; I don't pay for it myself. We just have 4o at cagie, and I don't know how its tuning differs from the "normal" 4o.
