by stingraycharles
106 days ago
I’m a bit confused by what you’re offering. Is it a voice assistant / AI as described on your GitHub? Or is it more general purpose / LLM? How does the RAG fit in? A voice-to-RAG seems a bit random as a feature. I don’t mean to come across as dismissive, I’m genuinely confused as to what you’re offering.
Right now, our focus is Apple Silicon.
Today there are two parts:
MetalRT - our proprietary inference engine for Apple Silicon. It speeds up local LLM, speech-to-text, and text-to-speech workloads. We’re expanding model coverage over time, with more modalities and broader support coming next.
RCLI - our open-source CLI that shows this in practice. You can talk to your Mac, query local docs, and trigger actions, all fully on-device.
So the simplest way to think about us is: we’re building the runtime / infrastructure layer for on-device AI, and RCLI is one example of what that enables.
Longer term, we want to bring the same approach to more chips and device types, not just Apple Silicon.
For people asking whether the speedups are real, we’ve published our benchmark methodology and results here:
LLM: https://www.runanywhere.ai/blog/metalrt-fastest-llm-decode-e...
Speech: https://www.runanywhere.ai/blog/metalrt-speech-fastest-stt-t...