In short: Version 0.32a0 of the llm-gemini plugin adds streaming of reasoning tokens, so users can watch a Gemini model's thinking in real time.
Version 0.32a0 of llm-gemini, the Google Gemini plugin for LLM, has been released. It streams reasoning tokens when used with Google Gemini models and requires llm >= 0.32a0.
The plugin gives practitioners access to the Google Gemini model family through the LLM ecosystem. Streaming of reasoning tokens is the capability added in this release.
Reasoning tokens are intermediate outputs that the model produces during its thinking processes. By streaming these tokens, users can observe in real time how the model arrives at its answers, rather than waiting for the complete response.
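The difference between streamed reasoning and the final answer can be illustrated with a small, self-contained sketch. It does not call llm or llm-gemini: the `Chunk` type, the `"reasoning"` and `"answer"` labels, and the `fake_stream` generator are hypothetical stand-ins invented for illustration, and the plugin's actual interface may look quite different. The sketch only shows the general idea of handling intermediate output as it arrives instead of waiting for the full response.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Chunk:
    """One streamed piece of output (hypothetical shape, not the plugin's API)."""
    kind: str  # assumed labels: "reasoning" or "answer"
    text: str


def fake_stream() -> Iterator[Chunk]:
    """Stand-in for a model stream; a real stream would yield chunks as they arrive."""
    yield Chunk("reasoning", "The user asks for 12 * 12. ")
    yield Chunk("reasoning", "12 * 12 = 144.")
    yield Chunk("answer", "The result ")
    yield Chunk("answer", "is 144.")


def consume(stream: Iterable[Chunk]) -> Tuple[str, str]:
    """Print each chunk as soon as it arrives and collect both channels."""
    reasoning: List[str] = []
    answer: List[str] = []
    for chunk in stream:
        if chunk.kind == "reasoning":
            # Reasoning is shown immediately, clearly marked as intermediate output.
            print(f"[thinking] {chunk.text}", flush=True)
            reasoning.append(chunk.text)
        else:
            # Answer text is printed inline without a trailing newline.
            print(chunk.text, end="", flush=True)
            answer.append(chunk.text)
    print()
    return "".join(reasoning), "".join(answer)


if __name__ == "__main__":
    consume(fake_stream())
```

Keeping the two channels separate means the intermediate reasoning can be logged or inspected for debugging without mixing it into the text treated as the model's answer.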
The update requires llm version 0.32a0 or later; 0.32a0 is an alpha release. For practitioners using Gemini models via the LLM tool, visible reasoning offers better support for debugging and interpretability.
Source: ainews-dev.lumi-systems.io · Published May 19, 2026
Lumi AI News — AI-assisted curation pursuant to Art. 50 EU AI Act. Paraphrase and classification via Lumi News Pipeline v1.5.2.