by 6gvONxR4sf7o 1740 days ago
> as I release improved models allowing users to speak faster without mistakes, users will speak even faster until there are mistakes again.

LOL. Users will be users! That's a hilarious case study, thanks for sharing.

> Yep! There have been several improvements on editing, though that's more in the user script domain and my work has still been mostly on the backing tech. I'm planning on working on "first party" user scripts in the future where that stuff is more polished too.

That would be wonderful! If you haven't seen them, I'd suggest looking at Serenade (also ASR) and Nebo (handwriting OCR on iPad) as interesting references for editing UI. They seem to have tight integration between the recognition and editing steps, making errors painless to fix by exposing alternative recognitions at the click of a button or a short command. That lets them make x% precision@n nearly as convenient as x% accuracy.
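To make the precision@n vs. accuracy distinction concrete, here's a rough sketch. It assumes the recognizer returns a ranked list of alternatives per utterance; the function name `top_n_hit_rate` and the sample phrases are made up for illustration and aren't taken from Serenade, Nebo, or any real system.

```python
def top_n_hit_rate(ranked_hypotheses, references, n):
    """Fraction of utterances whose reference is among the first n hypotheses."""
    hits = sum(ref in hyps[:n] for hyps, ref in zip(ranked_hypotheses, references))
    return hits / len(references)


# Hypothetical recognizer output: ranked alternatives for each utterance.
ranked = [
    ["delete line", "delete lines", "the leet line"],
    ["go two words left", "go to words left", "go too words left"],
    ["safe file", "save file", "save while"],  # top-1 is wrong, top-2 is right
    ["new tab", "knew tab", "new tap"],
]
# What the user actually said (also hypothetical).
refs = ["delete line", "go two words left", "save file", "new tab"]

print(top_n_hit_rate(ranked, refs, 1))  # plain top-1 accuracy
print(top_n_hit_rate(ranked, refs, 3))  # hit rate if the UI exposes 3 alternatives
```

With n=1 this is ordinary accuracy. With a larger n it measures how often the right answer is one click or short command away, which is what an alternatives UI effectively buys you.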

1 comment

I would say not quite as convenient, because they lean on that UI to also make you constantly confirm top-1 commands that would've worked fine. As you can see in my Conformer demo video, I can hit top-1 so reliably that I don't even need to wait to look at the command before I start saying the next one.