Bottom line: Google releases Gemma 4 12B, an Apache-2.0-licensed multimodal model with a unified architecture that runs locally on laptops with 16 GB of VRAM and handles text, images, audio, and reasoning.
Google releases Gemma 4 12B, a 12-billion-parameter multimodal model that runs locally on laptops with 16 GB of VRAM, with no cloud dependency. The model is available under the Apache 2.0 license and is free to use.
Gemma 4 12B is an open AI model with 12 billion parameters, optimized for local execution on standard hardware. The minimum requirement is 16 GB of VRAM. According to Google, the compact model comes close to the performance of significantly larger systems while needing substantially less memory. The Apache 2.0 license permits commercial use, modification, and local deployment, with few conditions beyond preserving the license and attribution notices.
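As a rough illustration of what the 16 GB figure implies, the following back-of-the-envelope sketch estimates the memory needed for the weights of a 12-billion-parameter model at several numeric precisions. Only the parameter count and the 16 GB minimum come from the report; the precisions listed, the function names, and the idea that the model would be run in a reduced-precision format are assumptions made for illustration. The estimate also ignores activations, caches, and runtime overhead, so actual requirements would be higher.

```python
# Back-of-the-envelope weight-memory estimate for a 12-billion-parameter model.
# From the article: 12 billion parameters, 16 GB VRAM minimum.
# Assumed for illustration: the precisions below and the helper names.

PARAMS = 12_000_000_000   # parameter count stated in the article
VRAM_GB = 16              # minimum VRAM stated in the article

# Bytes used to store one parameter at each (assumed) precision.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 10**9 bytes)."""
    return params * bytes_per_param / 1e9


def estimate(params: int = PARAMS, vram_gb: float = VRAM_GB) -> dict:
    """Map each precision to (weight size in GB, whether it fits in vram_gb)."""
    result = {}
    for name, nbytes in BYTES_PER_PARAM.items():
        size = weight_memory_gb(params, nbytes)
        result[name] = (size, size <= vram_gb)
    return result


if __name__ == "__main__":
    for name, (size, fits) in estimate().items():
        verdict = "fits" if fits else "does not fit"
        print(f"{name}: {size:.1f} GB of weights, {verdict} in {VRAM_GB} GB")
```

Under these assumptions, the weights alone take 24 GB at 16-bit precision, which exceeds 16 GB, and 12 GB or 6 GB at 8-bit or 4-bit precision. The report does not say which format the 16 GB requirement refers to.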
The model is natively multimodal: it processes text, images, and audio and can perform complex logical reasoning. This enables use cases in automated software development, media creation, scientific research, and industrial automation, all without external cloud infrastructure. The release continues the trend toward on-device AI that runs on end-user devices.
The technical innovation lies in the unified architecture: whereas conventional multimodal models use separate encoders for each data type (image, audio, text), Gemma 4 12B processes all of them through a single architecture. Eliminating the additional processing steps reduces memory requirements, computational overhead, and latency at the same time. The model remains compact enough to run stably on modern hardware without cloud connectivity.
Source: www.it-daily.net · Published June 5, 2026
Lumi AI News — AI-assisted curation pursuant to Art. 50 EU AI Act. Paraphrase and classification via Lumi News Pipeline v1.6.0.