At a glance: Gemma 4 12B runs on standard laptops with 16 GB RAM and enables local API endpoints via the LiteRT-LM CLI for agent-driven workflows without cloud dependency.
Google delivers Gemma 4 12B — a language model that runs on laptops with 16 GB RAM and enables agent-based workflows entirely locally, without cloud dependency.
Google DeepMind’s Gemma 4 12B model offers multimodal AI capabilities for desktop environments. With 16 GB RAM on macOS systems, it enables local data processing and visual insights generation — such as dynamic Python code execution and data visualization via the Google AI Edge Gallery.
For engineers, this means practical independence from cloud services: the model can run offline and supports speech-to-text functions directly on the machine. Google AI Edge Eloquent enables fully offline voice dictation and text editing without external connectivity.
The LiteRT-LM CLI receives a new “serve” command that provides a local HTTP endpoint. This is industry-compatible and can be used as a backbone for fully locally executed AI agents and automation tools — without dependency on remote APIs.
Source: developers.googleblog.com · Published
Lumi AI News — AI-assisted curation pursuant to Art. 50 EU AI Act. Paraphrasing and classification by Lumi News Pipeline v1.2.9.