by woodson 2304 days ago
Are you going to release the speech data you collect at https://speech.talonvoice.com/ or is it proprietary?

I don’t consider it proprietary. Per the agreement, I specifically ask for an open license so that I’ll be able to release it in the future.

Right now it’s about 5 hours total, which isn’t a ton for actually training on. That’s why I haven’t prioritized releasing it, and I haven’t even trained on it myself yet. I’ve mostly been using it for evaluation so far.

If someone approaches me and says “I have a compelling need for a bit of training data in the form of your prompts” I’ll probably prioritize a release higher.

As another perspective, at this point a majority of the people submitting their voice are already using Talon and just want the engine to be more robust.