by orwin
12 days ago
What probability do you assign to that, especially since the CC harness code leaked? I ask because I used frontier models this weekend (I had 78% of my assigned tokens for this month left and wanted to burn them before June 1st; I ended up with 24% left), and tbh, I don't see much improvement over the models I use day-to-day. I'd rather pay less for a slightly worse model. Stacktrace analysis (or any bug analysis, really) is where LLMs have the highest success rate imho, and free models have been good enough for that since last year. As for coding/architecture tasks, frontier models seem to hallucinate less, but I wonder whether that's the guardrails or the models themselves.