Arm and Google Accelerate On-Device AI: SME2 Revolutionizes Edge Inference

Bottom Line: Arm SME2 and Google AI Edge enable up to 5x faster AI inference on devices. Building matrix acceleration into the CPU removes the need to rely on fragmented, specialized accelerators and simplifies the development of complex, multimodal AI applications.

Arm and Google are optimizing artificial intelligence directly on end devices. With the new Scalable Matrix Extension 2 (SME2), complex AI models run up to five times faster – without the previous trade-off between speed and memory efficiency.

Artificial intelligence is advancing rapidly: simple text-based applications were long the standard, but modern systems now offer multimodal capabilities such as image and audio generation directly on the device. Developers can use these capabilities to create personalized user experiences.

Until now, complex AI models forced developers to make unsatisfactory compromises: either they accepted slow CPU processing or relied on fragmented, specialized accelerators. Arm changes this with the Scalable Matrix Extension 2 (SME2). This innovation integrates a dedicated matrix computation unit directly into the CPU cluster – making the processor itself a powerful AI accelerator.

Google AI Edge, a fully integrated software stack, simplifies development. Its LiteRT runtime uses Arm SME2 automatically at runtime through XNNPACK and Arm KleidiAI. The result: matrix-intensive generative AI workloads run up to five times faster, without developers having to target SME2 explicitly.
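To make the idea of automatic runtime selection concrete, here is a minimal sketch of the general pattern: the application calls one entry point, and the runtime picks a kernel based on the CPU features it detects. This is not LiteRT, XNNPACK, or KleidiAI code. All names (`select_kernel`, `run_matmul`, the kernel labels) are hypothetical, and the "accelerated" kernel is a pure-Python stand-in used only to show that both paths return the same result.

```python
from typing import Callable, Dict, List, Set

Matrix = List[List[float]]


def matmul_reference(a: Matrix, b: Matrix) -> Matrix:
    """Row-by-column matrix multiply, standing in for a generic CPU kernel."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def matmul_outer_product(a: Matrix, b: Matrix) -> Matrix:
    """Accumulates one outer product per inner index.

    Hypothetical stand-in for an accelerated kernel; no hardware is used here.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for k in range(inner):
        for i in range(rows):
            for j in range(cols):
                out[i][j] += a[i][k] * b[k][j]
    return out


# Hypothetical kernel table; a real runtime would hold native kernels.
KERNELS: Dict[str, Callable[[Matrix, Matrix], Matrix]] = {
    "reference": matmul_reference,
    "sme2-like": matmul_outer_product,
}


def select_kernel(cpu_features: Set[str]) -> str:
    """Picks a kernel name from a set of detected CPU feature strings."""
    return "sme2-like" if "sme2" in cpu_features else "reference"


def run_matmul(a: Matrix, b: Matrix, cpu_features: Set[str]) -> Matrix:
    """Single entry point: the caller never names a kernel explicitly."""
    return KERNELS[select_kernel(cpu_features)](a, b)


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    print(run_matmul(a, b, {"neon"}))
    print(run_matmul(a, b, {"neon", "sme2"}))
```

The calling code is identical on both kinds of device; only the feature set passed in differs. That is the property the article attributes to LiteRT: the application does not change when the faster path becomes available.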