Hacker News
by KhoomeiK 762 days ago
We're hoping to release an evals paper about Bananalyzer this summer and compare Tarsier to a variety of other perception systems in it. The hard part with evaluating a perception/context system, though, is that it's tightly intertwined with the agent's architecture, and that's not something we're comfortable fully open-sourcing yet. We'll have to think of interesting ways to decouple the perception system from the agent and eval it with Bananalyzer.