Arm and Google Optimize AI Inference on Devices

In a nutshell: Arm SME2 and Google AI Edge speed up AI models on mobile devices. Because the matrix hardware is built into the CPU, developers can achieve up to five times faster inference without specialized accelerators.

Arm and Google have presented a way to run artificial intelligence efficiently on the end device itself. With Arm Scalable Matrix Extension 2 (SME2), complex AI models can run up to five times faster without requiring specialized accelerators.

Artificial intelligence is evolving rapidly – from simple text exchange to sophisticated multimodal applications that enable image generation and audio production directly on the user device. This opens up new possibilities for developers to create personalized application experiences. Until now, deploying large AI models on end devices involved a trade-off: either developers had to accept slow CPU computations or resort to fragmented, specialized accelerators.

Arm Scalable Matrix Extension 2 (SME2) addresses this trade-off by integrating a dedicated matrix compute unit directly into the CPU cluster, so the processor itself can act as an AI accelerator. This enables up to five times faster inference for the matrix-intensive tasks that power generative AI.
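To make "matrix-intensive" concrete, here is a minimal Python sketch of matrix multiplication written as a running sum of outer products, the formulation that SME-class matrix units are built around. It is an illustration only, not Arm's or Google's implementation: the function name `matmul_outer_products` is hypothetical, and plain Python lists stand in for the hardware's accumulator tile.

```python
def matmul_outer_products(a, b):
    """Multiply a (m x k) by b (k x n) by accumulating k outer products.

    Illustrative only: the nested loops model, one element at a time, the
    kind of rank-1 update that a matrix unit applies to a whole tile.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")

    # Accumulator "tile" holding the partial result, initialised to zero.
    acc = [[0.0] * n for _ in range(m)]

    for p in range(k):
        col = [a[i][p] for i in range(m)]  # p-th column of a
        row = b[p]                         # p-th row of b
        # Rank-1 update: acc += outer(col, row)
        for i in range(m):
            for j in range(n):
                acc[i][j] += col[i] * row[j]
    return acc


if __name__ == "__main__":
    print(matmul_outer_products([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each pass of the outer loop updates every element of the accumulator independently, which is why this step lends itself to a dedicated compute unit rather than a scalar loop.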

Google AI Edge provides an integrated software stack that simplifies the development of AI applications on Arm-based systems. Its LiteRT runtime uses Arm SME2 automatically at run time through its integration with XNNPACK and Arm KleidiAI.