AI agent security

Practices and protections for keeping autonomous AI systems safe, reliable, and resistant to misuse, manipulation, and unauthorized access.

AI Safety and Security

GitLost Showed GitHub Agentic Workflows Could Leak Private Repositories From a Public Issue
By James McCallef | July 13, 2026

GitLost showed that GitHub Agentic Workflows could be steered from a public GitHub issue into reading a private repository and…

AI Agents and Tools

Alibaba Is Reportedly Banning Claude Code at Work From July 10
By Sarah Fraser | July 6, 2026

Alibaba is reportedly banning employee use of Anthropic’s Claude Code from July 10, 2026 and directing staff to use its…

AI Safety and Security

Prompt Injection Became Serious Enough for ICML to Police in Peer Review
By James McCallef | June 13, 2026 | June 29, 2026

Prompt injection is an LLM attack that makes a model follow untrusted instructions hidden in user input or external content,…

AI Safety and Security

Heretic Turns Guardrails Into Forks; AI Security Adds Another Alert Stream; Transformer Doubt Goes Public
By James McCallef | May 26, 2026 | June 21, 2026

The sharpest story today is Heretic, because it turns model safety from a lab policy into a forkable artifact. Elsewhere,…

AI Agents and Tools

llama.cpp Becomes a Local Agent Host; Hidden Audio Still Threatens Voice Agents; Dutch Raid Hits Cybercrime Plumbing
By Max Dvornik | May 25, 2026 | June 28, 2026

llama.cpp tools are the clearest story today. A runtime that millions of local-model users treat as plumbing now documents built-in…

AI Safety and Security

GitHub says poisoned VS Code extension exposed 3,800 repos
By James McCallef | May 22, 2026 | June 16, 2026

GitHub said on 20 May that a compromised employee device running a poisoned VS Code extension led to the exfiltration…

Models and Research

LLM Failure Modes Start in the Stack, Not the Chat
By James McCallef | April 24, 2026 | June 25, 2026

LLM failure modes are easiest to understand if you stop treating them as personality flaws, “the model lied,” “the chatbot…

AI Agents and Tools

OpenClaw Security Concerns Reveal Why Agents Need Verifiers
By Geoff Dyers | April 14, 2026 | June 25, 2026

OpenClaw security concerns are the part of the story that people can no longer hand-wave away. The bigger problem, though,…

AI Safety and Security

AI Cyber Capabilities Cross the 32-Step Attack Line
By James McCallef | April 14, 2026 | June 16, 2026

A model completed a 32-step corporate-network attack simulation end to end. Not in a movie script. In a UK AI…
