| HN Mirror

by Zababa 3 hours ago

That's already better than RTK because you measure task accuracy AND savings! So I'm more confident in this one than in the RTK/caveman/ponytail stuff.

There are still two things that bother me:

1) I don't really understand when tilth gets called, or how it works. Does the model itself choose it when it needs it, or do you have to instruct the model to use it?

2) If the model itself chooses when to use it, I'd like to see a non-regression benchmark on tasks where tilth doesn't help, to make sure you've made the model + harness + tools better as a whole rather than just more specialized. Otherwise, be upfront that it's more specialized and give more detail on when to use it and when not to.
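
To make point 2 concrete, here is a rough sketch of what such a non-regression check could look like. Everything in it is an assumption for illustration: the `TaskResult` type, the `non_regression_report` function, and the `tolerance` parameter are hypothetical names, not part of tilth or any existing benchmark harness.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome of one task, run twice (hypothetical record layout)."""
    task_id: str
    passed_baseline: bool   # harness without the tool available
    passed_with_tool: bool  # same harness with the tool available


def non_regression_report(results, tolerance=0.0):
    """Compare accuracy with and without the tool on the same task set.

    `tolerance` is the accuracy drop we are willing to accept
    (an assumed knob, e.g. to absorb run-to-run noise).
    """
    if not results:
        raise ValueError("need at least one task result")

    n = len(results)
    baseline = sum(r.passed_baseline for r in results) / n
    with_tool = sum(r.passed_with_tool for r in results) / n

    # Tasks that passed without the tool but fail once it is available.
    regressed = [
        r.task_id
        for r in results
        if r.passed_baseline and not r.passed_with_tool
    ]

    return {
        "baseline_accuracy": baseline,
        "with_tool_accuracy": with_tool,
        "regressed_tasks": regressed,
        "ok": with_tool >= baseline - tolerance,
    }
```

The per-task list of regressions matters as much as the aggregate: overall accuracy can stay flat while individual tasks flip from pass to fail, which is exactly the "more specialized" case.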