by aleksiy123
33 days ago
It’s also just a useful exercise in general, especially for getting feedback on models and harnesses. I’ve been thinking about setting up a nontrivial project to use as a benchmark for any plugin and/or harness changes I make. Having a prebuilt verification suite is great: you can use it to assess things like token usage and time across different harnesses, models, and plugins.