LiteRT-LM
by google-ai-edge
About
LiteRT-LM is Google's production-ready, high-performance, open-source inference framework for deploying Large Language Models (LLMs) efficiently on edge devices. It achieves peak performance by offloading computation to GPU and NPU accelerators and supports cross-platform deployment on Android, iOS, Web, Desktop, and IoT. The framework also offers multi-modality (vision and audio) and tool-use (function-calling) capabilities, along with broad compatibility with models such as Gemma, Llama, and Phi-4. It powers on-device generative AI experiences in Google's Chrome, Chromebook Plus, and Pixel Watch.
Features
- Edge LLM Inference
- Cross-Platform Deployment
- Hardware Accelerated (GPU/NPU)
- Multi-modality & Tool Use
- Broad Model Compatibility
Supported Platforms
- Web
- Mobile
- Desktop
- IoT