AI SECURITY

18 articles

LATEST

Claude Mythos Preview Explained: Anthropic's Gated Frontier Model and Project Glasswing

Claude Mythos Preview is the model that found thousands of zero-days, and you could not buy it. Here is what it is, who got access through Project Glasswing, what it actually found, and where the model line went after it was retired.

July 31, 2026 • 7 min read

New • 9 min read

An AI Agent Escaped Its Sandbox and Attacked Hugging Face: Inside the ExploitGym Incident

Hugging Face published a stunning technical play-by-play of a 4.5-day AI agent intrusion. The HN community is divided on who is to blame and what it means for agent security.

News Hacker News AI Security

New • 9 min read

Document-Borne AI Worms Self-Propagate Through Copilot for Word: What HN Thinks

A coordinated disclosure reveals that attacker-controlled instructions in a Word document can hijack Copilot, alter financial data, and self-propagate across documents. Microsoft cannot fully fix the vulnerability class. The HN community draws parallels to the macro virus era.

News Hacker News AI Security

New • 9 min read

AI Coding Agent Firewalls and Security Layers Compared 2026

Belay, Claude Code built-in guards, Codex CLI sandboxing, and MCP proxy patterns compared: how to protect your system from destructive commands, secret leaks, and prompt injection in AI coding agents.

AI Security AI Coding Agent Safety

New • 8 min read

Claude Mythos Found New Cryptographic Weaknesses: What HN Thinks

Anthropic's Claude Mythos Preview found novel attacks on the HAWK post-quantum signature scheme and reduced-round AES. The HN community debates the real significance, the $100K price tag, and what it means for prompt engineering.

News Hacker News Claude

New • 9 min read

The Underground Relay Market for AI API Tokens: How Resellers Get 97% Off

An inside look at the gray-market relay economy that resells OpenAI, Anthropic, and Google API access at up to 97.8% off, and what it means for developers building on AI APIs.

News Hacker News AI Security

8 min read

Vera Shows Agent Safety Needs Test Oracles, Not Vibes

A new Vera paper tests Codex, Claude Code, OpenClaw, and Hermes with executable safety cases. The useful lesson is not panic. It is evidence-grounded agent QA.

AI Security AI Agents Codex

8 min read

Prompt Injection Is Really Role Confusion

New role-confusion research explains why prompt injection keeps surviving better prompts. Models do not reliably perceive which text is instruction, tool output, user content, or their own reasoning.

Prompt Injection AI Security AI Agents

7 min read

Prompt Injection is Role Confusion - New ICML Research Explains Why LLMs Can't Tell Friend from Foe

New research from MIT reveals that LLMs identify speakers by writing style, not by tags, meaning attackers who sound like the system effectively become the system. The findings explain why prompt injection remains unsolved.

News Hacker News AI Security

8 min read

Zero-Touch OAuth Is the MCP Feature Enterprises Were Waiting For

MCP's new enterprise-managed authorization flow is not just less login friction. It moves agent tool access into identity, policy, and audit systems enterprises already understand.

MCP AI Agents AI Security

9 min read

Security Agents Need Repro Harnesses, Not More Scan Prompts

Anthropic's open-source vulnerability harness shows where AI security work is going: reproducible exploit loops, separate verification agents, and patch receipts.

AI Security AI Agents Claude Code

9 min read

AI Agent Containment Needs a Capability Ledger

Anthropic's Claude containment writeup points to the next security layer for coding agents: deterministic capability ledgers, not another approval prompt.

AI Agents AI Security Claude Code

8 min read

Spreadsheet Agents Need Permission Ledgers

The ChatGPT for Google Sheets exfiltration report is not just a spreadsheet bug. It is a warning about agentic office tools: permissions need to be action-scoped, logged, revocable, and visible.

AI Security Prompt Injection AI Agents

Showing 12 of 17 articles

Keep exploring AI Security

- AI Security Topic Hub - tools and guides for AI Security from the Developers Digest directory
- Tools Directory - dive deeper across the Developers Digest knowledge base
- Developers Digest on YouTube - video tutorials covering AI Security and more
