Heretic: Tool Removes AI Security Barriers in Minutes

Bottom line: The Heretic tool can remove security filters from open-source AI models in minutes—a structural control risk that undermines existing compliance frameworks for locally deployed models.

The open-source tool Heretic can completely remove security filters from widely used AI models such as Llama 3.3 and Gemma 3. The automated manipulation takes less than ten minutes and requires only minimal technical resources. For CISOs, this represents a significant compliance and control risk for locally deployed models.

The freely available Heretic tool automates a mathematical procedure called abliteration, which specifically neutralizes the refusal mechanisms in language models. According to its creator, mathematician Philipp Emanuel Weidmann, more than 3,500 modified model variants have already been created with it, together accounting for more than 13 million downloads. In tests, journalists and security groups confirmed that manipulated versions of Llama 3.3 and Gemma 3 immediately respond to sensitive queries about malware, biohazards, or credit card fraud, while the models’ technical capabilities remain unimpaired.
