HN Mirror



	by noahkay13 108 days ago
	I built a C++ inference engine for NVIDIA's Parakeet speech recognition models using Axiom (https://github.com/Frikallo/axiom), my tensor library.

	What it does:
	- Runs 7 model families: offline transcription (CTC, RNNT, TDT, TDT-CTC), streaming (EOU, Nemotron), and speaker diarization (Sortformer)
	- Word-level timestamps
	- Streaming transcription from microphone input
	- Speaker diarization detecting up to 4 speakers

2 comments

aaronbrethorst 108 days ago

I see a number of references to macOS support in your docs for Axiom. Can this run on iOS?


noahkay13 108 days ago

Theoretically, yes? This hasn't been tested, but Xcode has great C++ interop, and the goal with Axiom and now parakeet.cpp is to be used for portable deployments, so making that process easier is definitely on the roadmap.


computerex 108 days ago

Oh hey, I just implemented this in golang. My implementation is heavily optimized for CPU.


pdyc 108 days ago

Can you share your repo?
