Hacker News
by ossa-ma 117 days ago
tldr: We took our hypertuned coding agent, trained it on millions of internal data engineering workflows and data, equipped it with specialized custom-built tools, and it only managed to complete 3 more tasks than Claude Code (out of 43) on a super niche, domain-specific benchmark.