Hacker News
user: xdotli
created: 2023-07-08
karma: 15

Founder of BenchFlow.ai, a benchmark company.

submissions:

0 points | 0 comments
Frontier Model Training Methodologies
2 points | 1 comment
0 points | 0 comments
ClawsBench shows GPT-5.4 tries to reward hack 80% of the time
3 points | 1 comment
0 points | 0 comments
Chaos of Agent
1 point | 1 comment
0 points | 0 comments
Native CLI scaffolds consistently outperform OpenCode when using the same model
1 point | 1 comment
We compare model quality in Cursor
2 points | 0 comments
Automatically Learning Skills for Coding Agents
4 points | 0 comments
We Reached 74.8% on terminal-bench with Terminus-KIRA
2 points | 0 comments
0 points | 0 comments
0 points | 0 comments
Self-generated skills don't do much for AI agents, but human-curated skills do
2 points | 3 comments
0 points | 0 comments
0 points | 0 comments
First Agent Skills Hackathon by the Authors of SkillsBench
2 points | 1 comment
0 points | 0 comments
0 points | 0 comments
The First Agent Skills Benchmark
1 point | 1 comment
0 points | 0 comments
0 points | 0 comments
0 points | 0 comments
0 points | 0 comments
0 points | 0 comments
0 points | 0 comments
0 points | 0 comments
GPT-5.2 got worse on Terminal Bench 2.0, and so did GPT-5.2 Pro
1 point | 1 comment
Claude Skills as a Meta Tool
2 points | 0 comments