by simianwords
104 days ago
> The complete failure of Claude to play Pokemon, something a small child can do with zero prior instruction

Cherry-picking, because Gemini and GPT have beaten it. Claude doesn't have a good vision setup.

> The "how many r's are in strawberry" question

It has been able to do this since 2024.

> The "should I drive or walk to the car wash" question

The SOTA models get it right with reasoning.

> fact that right now, today all models are very frequently turning out code that uses APIs that don't exist, syntax that doesn't exist, or basic logic failures.

Not when you use a harness. Even humans can't write code that works on the first attempt.