A CPU-based RL controller optimizes adaptive sampling during test-time scaling, reducing computational overhead and latency compared to heuristic methods.
VaSE achieves higher accuracy than existing sparse-attention methods at 4x KV-cache compression, thereby reducing the memory bottleneck of reasoning models.
Successful domain specialization of LLMs requires careful tuning of learning rate, data-mixing ratios, and checkpoint selection to avoid catastrophic forgetting.
PaW trains environment models during policy training using the same RL rollouts, consistently improving agent performance without requiring additional simulators or inference costs.
Nine Claude Code releases in ten days, Google I/O declares the agent era, two valuable long-reads on architecture and evaluation of long-running agents, plus a sobering IT benchmark.
Three threads shaped May: the AI Omnibus and first high-risk guidelines from Brussels, Claude 4.8 with KPMG scaling as commercial proof, and a wave of supply-chain incidents from Nx-Console to axios — what began in May becomes operational in June.
Google is shifting the focus of its AI platforms from assistive functions to independently operating systems, making mobile and web development a priority in the process.
Claude Code v2.1.145 enhances agent management with JSON export, fixes critical security and GitHub integration issues, and improves user experience with better error messages and cross-platform support.
Claude Code v2.1.153 introduces a skipLfs option for Git, improves Autocomplete and MCP server handling, and fixes numerous critical bugs in authentication, session management, and terminal rendering.
Claude Code v2.1.147 improves background sessions, introduces advanced code review features, and fixes over 30 bugs in enterprise security, shell integration, and platform-specific issues.