Hacker News new | ask | show | jobs
by koljab 402 days ago
All local models: - VAD: Webrtcvad (first fast check) followed by SileroVAD (high compute verification) - Transcription: base.en whisper (CTranslate2) - Turn Detection: KoljaB/SentenceFinishedClassification (selftrained BERT-model) - LLM: hf.co/bartowski/huihui-ai_Mistral-Small-24B-Instruct-2501-abliterated-GGUF:Q4_K_M (easily switchable) - TTS: Coqui XTTSv2, switchable to Kokoro or Orpheus (this one is slower)
1 comments

That's excellent. Really amazing bringing all of these together like this.

Hopefully we get an open weights version of Sesame [1] soon. Keep watching for it, because that'd make a killer addition to your app.

[1] https://www.sesame.com/

That would be absolutely awesome. But I doubt it, since they released a shitty version of that amazing thing they put online. I feel they aren't planning to give us their top model soon.