by baobun
766 days ago
Check out Rhasspy. You won't get anything practically useful running LLMs on the 4B, but you also don't strictly need LLM-based models. In the Rhasspy community, a common pattern is to do cheap, lightweight wake-word detection locally on mic-attached satellites (a 4B should be sufficient here) and then stream the actual recording over the local network to a central hub, which has more computational resources and so gives better results.
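To make the satellite/hub split concrete, here is a rough sketch of the satellite side. It is not Rhasspy's actual API: `satellite_loop`, `is_wake`, `send`, and `record_frames` are names I made up for illustration, and the RMS energy threshold is only a stand-in for a real wake-word model. In a real setup `send` would write to a socket or message bus on the local network instead of being a plain callable.

```python
import struct
from typing import Callable, Iterable


def rms(frame: bytes) -> float:
    """Root-mean-square level of a frame of 16-bit little-endian PCM samples."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack("<%dh" % n, frame[: 2 * n])
    return (sum(s * s for s in samples) / n) ** 0.5


def satellite_loop(
    frames: Iterable[bytes],
    is_wake: Callable[[bytes], bool],
    send: Callable[[bytes], None],
    record_frames: int = 3,
) -> int:
    """Run the cheap detector on every frame; once it fires, forward the
    next `record_frames` frames to the hub. Returns how many frames were sent."""
    remaining = 0
    sent = 0
    for frame in frames:
        if remaining > 0:
            # Recording: hand the raw audio to the hub, no local processing.
            send(frame)
            remaining -= 1
            sent += 1
        elif is_wake(frame):
            # Placeholder trigger: start forwarding from the next frame on.
            remaining = record_frames
    return sent
```

The point of the split is that the satellite only ever runs the cheap per-frame check; everything after the trigger is just forwarding bytes, so the heavy recognition can live on whatever box you use as the hub.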
This frustrates me. I ran Dragon Dictate on a 200MHz PC in the 1990s. Now that wasn't top quality, but it should have been good enough for voice assistants. I expect at least that quality on-device with an R-Pi today if not better.
IMHO the end game is on-device speech recognition, and anything that streams audio somewhere else for processing is delaying that outcome.