Security teams are buying AI defenses that don’t work. Researchers from OpenAI, Anthropic, and Google DeepMind published findings in October 2025 that should stop every CISO mid-procurement. Their paper, “The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections,” tested 12 published AI defenses, with most claiming near-zero attack success rates. The research team achieved bypass rates above 90% on most defenses. The implication for enterprises is stark: Most AI security products are being tested against attackers that don’t behave like real attackers.The team tested prompting-based, training-based, and filtering-based defenses under adaptive attack conditions. All collapsed. Prompting defenses achieved 95% to 99% attack success rates under adaptive attacks. Training-based methods fared no better, with bypass rates hitting 96% to 100%. The researchers designed a rigorous methodology to stress-test those claims. Their approach included 14 authors and a $20,000 prize pool for successful attacks.Why WAFs fail at the inference layerWeb application firewalls (WAFs) are stateless; AI attacks are not. The distinction explains why traditional security controls collapse against modern prompt injection techniques.The researchers threw known jailbreak techniques at these defenses. Crescendo exploits conversational context by breaking a malicious request into innocent-looking fragments spread across up to 10 conversational turns and building rapport until the model finally complies. Greedy Coordinate Gradient (GCG) is an automated attack that generates jailbreak suffixes through gradient-based optimization. These are not theoretical attacks. They are published methodologies with working code. A stateless filter catches none of it.Each attack exploited a different blind spot — context loss, automation, or semantic obfuscation — but all succeeded for the same reason: the defenses assumed stat …