Y
Hacker News
new
|
ask
|
show
|
jobs
by
ignoramous
640 days ago
Moshi is CC-BY. Another similar 7b (speech-text real-time conversational) model that was recently released under Apache v2:
https://tincans.ai/slm3
/
https://huggingface.co/collections/tincans-ai/gazelle-v02-65...
1 comments
iandanforth
640 days ago
Important distinction is that tincans is not speech to speech. It uses a separate turn/pause detection model and a text to speech final processing step.
link