by realitysballs
481 days ago
I believe the outcome of this type of article is actually positive. The ‘SWE-Lancer’ benchmark gives a more pragmatic assessment of LLM capabilities. Ironically, it refutes Altman’s claims mentioned in the same article. It’s hard to replace engineers when you create a benchmark you can’t score decently on.

I think they are trying to frame the narrative first, then succeed at it. Let's see. This helps justify OpenAI's valuation and efforts to investors/VCs. After all, IMO, without coding as a use case for LLMs, AI wouldn't have nearly the same hype/buzz as it does now. Greed (profit) and fear (losing jobs) are great motivators for keeping the investment hype and funds coming in.