|
|
|
|
|
by HarHarVeryFunny
496 days ago
|
|
I'd rather wait to get a good response, than get a quick response that is much less useful, and it's the nature of these "reasoning" models that they reason before responding. Yesterday I was comparing DeepSeek-R1 (NVidia hosted version) with both Sonnet 3.5 (regarded by most as most capable coder) and the new Gemini 2.0 flash, and the wait was worth it. I was trying to get all three to create a web page with a horizontally scrolling timeline with associated clickable photos... Gemini got to about 90% success after half a dozen prompts, after which it became a frustrating game of whack-a-mole trying to get it to fix the remaining 10% without introducing new bugs - I gave up after ~30min. Sonnet 3.5 looked promising at first, generating based on a sketch I gave it, but also only got to 90%, then hit daily usage limit after a few attempts to complete the task. DeepSeek-R took a while to generate it, but nailed it on first attempt. |
|
Lets say I ask for some function that calculates some matrix math in python. It will spit out something but I dont like what it did. So I will say, now dont us any calls to that library you pulled in, and also allow for these types of inputs. Add exception handling...
So response time is important since its a conversation, no matter how correct the response is.
When you say deep seek "nailed it on the first attempt" do you mean it was without bugs? Or do you mean it worked how you imagined? Or what exactly?