by pcwelder
122 days ago
It's live on OpenRouter now. In my personal benchmark it does badly. So far the benchmark has been a really good indicator of instruction following and agentic behaviour in general. For those who are curious, the benchmark is just the ability of the model to follow a custom tool-calling format. I ask it to do coding tasks using chat.md [1] + MCPs. So far it's just not able to follow the format at all.
[1] https://github.com/rusiaaman/chat.md
|
I'm developing a personal text editor with vim keybindings and paused work because I couldn't think of a good interface that felt right. This could be it.
I think I'll update my editor to do something like this but with intelligent "collapsing" of extra text to reduce visual noise.