What happened
RunAnywhere, Inc. released RCLI, an on-device voice AI for macOS that runs a full speech-to-text (STT), large language model (LLM), and text-to-speech (TTS) pipeline locally on Apple Silicon. RCLI exposes 43 macOS actions via voice, performs local Retrieval Augmented Generation (RAG) over documents, and achieves sub-200ms end-to-end latency with no cloud dependency or API keys. The system uses MetalRT, a proprietary GPU inference engine built by RunAnywhere, Inc. for Apple Silicon that supports models such as Qwen3 and Whisper, and requires macOS 13+.
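To make the three-stage architecture concrete, here is a minimal sketch of an STT → LLM → TTS turn. Every name in it (`transcribe`, `generate_reply`, `synthesize`, `VoiceTurn`) is a hypothetical stand-in, not RCLI's or MetalRT's actual API; it only illustrates how the stages chain together on a single device.

```python
# Hypothetical sketch of an STT -> LLM -> TTS voice pipeline.
# None of these names come from RCLI or MetalRT; they are stand-in
# interfaces illustrating the three-stage flow described above.

from dataclasses import dataclass


@dataclass
class VoiceTurn:
    audio_in: bytes          # raw microphone capture
    transcript: str = ""     # filled by the STT stage
    reply: str = ""          # filled by the LLM stage
    audio_out: bytes = b""   # filled by the TTS stage


def transcribe(audio: bytes) -> str:
    # Placeholder for an on-device STT model (e.g. a Whisper-class model).
    return "open the downloads folder"


def generate_reply(prompt: str) -> str:
    # Placeholder for an on-device LLM (e.g. a Qwen3-class model).
    return f"Opening Downloads. (heard: {prompt!r})"


def synthesize(text: str) -> bytes:
    # Placeholder for an on-device TTS model; returns stand-in "audio".
    return text.encode("utf-8")


def run_pipeline(audio_in: bytes) -> VoiceTurn:
    # Each stage feeds the next; nothing leaves the machine.
    turn = VoiceTurn(audio_in=audio_in)
    turn.transcript = transcribe(turn.audio_in)
    turn.reply = generate_reply(turn.transcript)
    turn.audio_out = synthesize(turn.reply)
    return turn


turn = run_pipeline(b"\x00\x01")
print(turn.transcript)
```

The point of the sketch is the data flow, not the models: keeping all three stages in one local process is what removes the network round trips that dominate cloud voice-assistant latency.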
Why it matters
This expands on-device AI capabilities for macOS users, reducing reliance on cloud services and improving data privacy. Platform engineers gain a high-performance local option, with MetalRT delivering up to 550 tok/s LLM throughput and sub-200ms voice latency. It also shifts operational cost models: eliminating API fees and network latency enables sensitive data to be processed directly on user devices. Security architects can build AI workflows with reduced external data exposure, particularly on Apple Silicon devices with M3 or later chips, which MetalRT requires.
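A quick back-of-envelope check shows why the quoted 550 tok/s figure is compatible with sub-200ms interactive responses. The 50-token reply length below is an assumption for illustration, and end-to-end latency also includes STT and TTS time, which this arithmetic omits:

```python
# Per-token latency implied by 550 tok/s, and LLM generation time
# for an assumed short 50-token voice reply.
tok_per_s = 550
per_token_ms = 1000 / tok_per_s               # ~1.82 ms per token
reply_tokens = 50                             # assumed reply length
reply_time_ms = reply_tokens * per_token_ms   # ~91 ms

print(f"{per_token_ms:.2f} ms/token, "
      f"{reply_time_ms:.0f} ms for {reply_tokens} tokens")
```

At roughly 91 ms of generation time for a short reply, the LLM stage leaves headroom within a 200 ms budget for the STT and TTS stages.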