Topic
Security for AI agents and LLM apps - prompt injection, tool permissions, audit logs, sandboxing, and rollback.
11 resources - 11 posts

New role-confusion research explains why prompt injection keeps surviving better prompts. Models do not reliably distinguish instructions from tool output, user content, or their own reasoning.

New research from MIT reveals that LLMs identify speakers by writing style, not by tags - meaning attackers who sound like the system effectively become the system. The findings explain why prompt injection remains unsolved.

MCP's new enterprise-managed authorization flow is not just less login friction. It moves agent tool access into identity, policy, and audit systems enterprises already understand.

Anthropic's open-source vulnerability harness shows where AI security work is going: reproducible exploit loops, separate verification agents, and patch receipts.

Anthropic's Claude containment writeup points to the next security layer for coding agents: deterministic capability ledgers, not another approval prompt.

The ChatGPT for Google Sheets exfiltration report is not just a spreadsheet bug. It is a warning about agentic office tools: permissions need to be action-scoped, logged, revocable, and visible.

Before an AI agent gets tools, files, APIs, MCP servers, or deployment access, decide what it can read, write, call, log, and roll back.

AI coding agents become safer when permissions, logs, and rollback are designed as one system. Here is the operating loop I would put around any agent that can edit code, run tools, or open pull requests.

Prompt injection stops being an abstract LLM risk once an agent can call tools. The practical defense is data boundaries, structured handoffs, tool guardrails, and approval gates around side effects.
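
As a rough illustration of the "approval gates around side effects" idea, here is a minimal Python sketch. Every name in it (Tool, GatedRunner, read_file, send_email, the approve callback) is hypothetical and invented for this example; it is not taken from the linked post or from any real agent framework. It assumes each tool is labeled up front as side-effecting or not, and that an approval callback decides whether a side-effecting call may run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tool:
    name: str
    func: Callable[..., str]
    side_effect: bool  # True if the tool writes, sends, or deletes anything


@dataclass
class GatedRunner:
    tools: Dict[str, Tool]
    approve: Callable[[str, dict], bool]  # hypothetical human or policy check
    audit_log: List[dict] = field(default_factory=list)

    def _record(self, name: str, args: dict, outcome: str) -> None:
        # Every attempt is logged, including refused ones.
        self.audit_log.append({"tool": name, "args": args, "outcome": outcome})

    def call(self, name: str, **kwargs) -> str:
        tool = self.tools.get(name)
        if tool is None:
            # Tool guardrail: only allowlisted tools can run at all.
            self._record(name, kwargs, "unknown_tool")
            raise PermissionError(f"tool not allowlisted: {name}")
        if tool.side_effect and not self.approve(name, kwargs):
            # Approval gate: side effects need an explicit yes.
            self._record(name, kwargs, "denied")
            raise PermissionError(f"approval denied for: {name}")
        result = tool.func(**kwargs)
        self._record(name, kwargs, "allowed")
        return result


# Illustrative tools; real ones would touch the file system or a mail API.
def read_file(path: str) -> str:
    return f"contents of {path}"


def send_email(to: str, body: str) -> str:
    return f"sent to {to}"


runner = GatedRunner(
    tools={
        "read_file": Tool("read_file", read_file, side_effect=False),
        "send_email": Tool("send_email", send_email, side_effect=True),
    },
    approve=lambda name, args: False,  # deny all side effects in this demo
)
```

With this setup, runner.call("read_file", path="notes.txt") runs without asking, while runner.call("send_email", ...) raises PermissionError and leaves a "denied" entry in the audit log. The point of the design is that the gate lives in the dispatcher rather than in the prompt, so injected text cannot talk its way past it.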

Anthropic's Project Glasswing update is a useful signal for developer teams: AI can find vulnerability candidates faster than humans can verify, disclose, patch, and ship them.

AI coding agents now read repository docs, config, issues, and comments before opening pull requests. That turns CONTRIBUTING.md and AGENTS.md into part of the security boundary.