elder-plinius/OBLITERATUS | DeepWiki
OBLITERATUS (obliteratus Python package, v0.1.2) is an open-source toolkit for abliteration: the process of locating and surgically removing refusal behaviors from large language models through direct weight modification or inference-time steering, without retraining.
OBLITERATUS/gemma-4-E4B-it-OBLITERATED · Hugging Face
Google built Gemma 4 with guardrails. We built OBLITERATUS to tear them off. They said their architecture was different. They were right: it broke every tool we threw at it. NaN activations, shared KV weights, thinking mode. Gemma 4 fought back harder than any model we've cracked. It still lost. 🐉 0% hard refusal
INTRODUCING: OBLITERATUS!!! GUARDRAILS-BE-GONE! ⛓️
OBLITERATUS is . . . But here's what truly sets it apart: OBLITERATUS is a crowd-sourced research experiment. Every time you run it with telemetry enabled, your anonymous benchmark data feeds a growing community dataset (refusal geometries, method comparisons, hardware profiles) at a scale no single lab could achieve.
OBLITERATUS: Mapping the Geometry of Refusal Inside Large Language Models
OBLITERATUS is an open-source toolkit that uses mechanistic interpretability to locate and remove refusal directions in transformer weights, without retraining. Understanding how refusal works geometrically is the first step to building better AI safety.
OBLITERATUS and the Science of AI Jailbreaking
OBLITERATUS is an experimental toolkit designed to analyze and modify refusal behaviors in open-weight LLMs. Modern language models often refuse to answer certain prompts due to safety training.
OBLITERATUS Review: Open-Source Toolkit Uncensors 116 LLMs
This is a review of what OBLITERATUS actually ships, how it works technically, what it breaks, and why a small group of alignment researchers has been quietly sounding the alarm since the repo dropped.
OBLITERATUS download | SourceForge.net
OBLITERATUS is an advanced open-source toolkit designed to analyze and modify the internal behavior of large language models by identifying and removing mechanisms responsible for refusal or restricted responses.