Hacker News
by
scosman
49 days ago
Why eval so narrowly, just with/without the skill?
Isn't the same approach useful for everything: model, params, prompt, sub-agents, skills, RAG, etc.?
1 comment
darkrishabh
49 days ago
Then you get into benchmarking territory. But I love the idea here. Having standards around those could really help move the needle.