| My observations (ChatGPT, not the API models): For code, 3.5 is superior. 3.5 accepts about 21k tokens of input, while ChatGPT 4 accepts around 10k. This also makes it a lot better for boilerplate work, as it can take a lot more input and handles long conversations and iterations better. For brainstorming, 4 is better. It's capable of some top-tier brainstorming and it argues back quite frequently. For unguided creative writing (e.g. "describe a potato"), they're roughly equal. For guided creative writing (i.e. write a story around 400 words of requirements), 4 is much better. For poems and wordplay, 4 absolutely floors 3.5. It has a wider vocabulary and is better at rhymes and alliteration, which humans are usually bad at. For reasoning and riddles, 4 is still the benchmark among the LLMs. I really dislike that they named it GPT-3.5 instead of something like "Glide". The name implies that it's inferior to 4, when they're just suited for different things. |