Microsoft Discloses Patched Claude Code Vulnerability That Exposed GitHub Credentials

2026-06-06 18:12:35

Microsoft researchers disclosed a now-patched vulnerability in Anthropic's Claude Code GitHub Action that allowed attackers to expose credentials through prompt injection attacks. Microsoft disclosed the issue via HackerOne on April 29, and Anthropic released a patch on May 5 with Claude Code version 2.1.128. The vulnerability exploited AI agents running in CI/CD workflows, where malicious instructions hidden in GitHub issues, pull requests, or comments could manipulate the AI into accessing sensitive information. Microsoft warned that AI coding agents create new security risks because development environments often contain API keys, cloud credentials, and other sensitive data.

Microsoft Researchers Exposed Prompt Injection Attack Vector in Claude Code

Microsoft researchers found that attackers could use prompt injection attacks hidden in GitHub issues, pull requests, or comments to manipulate Claude Code into accessing files containing sensitive credentials. In a blog post on Friday, Microsoft stated that the research began "after observing prompt injection attempts in public repositories using AI-assisted GitHub workflows across multiple vendors, where attacker-controlled issue or [pull requests], content is processed by the AI agent and could influence its tool use."

To test the vulnerability, Microsoft created a GitHub workflow and disguised malicious instructions behind content hosted on a domain it controlled, allowing the researchers to bypass Claude's safety protections. The prompt injection attack tricked Claude into reading sensitive credentials and altering them to evade both Claude's safeguards and GitHub's secret-scanning tools. Microsoft said an attacker could then reconstruct the credential and exfiltrate it through issue comments, workflow logs, web requests, or shell commands.

"To bypass Sonnet's refusal safety mechanisms, we obscured the shell payload behind a response from our controlled domain," Microsoft stated. "We also enabled the workflow to be triggered by users with no 'write' permissions to ensure Anthropic's environment variables scrub mitigations were active during our tests."

Anthropic Patched Vulnerability on May 5 After HackerOne Disclosure

Anthropic patched the flaw on May 5 with Claude Code version 2.1.128 after Microsoft disclosed the vulnerability through HackerOne on April 29. Claude Code, Anthropic's AI coding agent for software development tasks, launched in October. The tool drew scrutiny in March after Anthropic accidentally leaked more than 500,000 lines of its source code, exposing details of its internal architecture.

On GitHub, a pull request allows developers to propose changes to a code repository and have those changes reviewed before they are approved and merged. The vulnerability exploited this review process by embedding malicious instructions that the AI agent would process.

Microsoft Warns Natural Language Functions as Executable Code in AI Systems

Despite multiple layers of built-in security controls, Microsoft found that a determined attacker could potentially manipulate an AI agent into exposing sensitive information. "We are entering an era where natural language is executable code, and untrusted inputs like GitHub issues must be treated as hostile by default," Microsoft stated. "A single, carefully crafted comment combined with a misunderstood trust boundary is all it takes to walk away with production credentials."

The report comes as prompt injection attacks have emerged as one of the biggest security threats facing AI agents. In a prompt injection attack, an attacker hides instructions in content such as emails, documents, websites, or code comments, causing an AI system to follow those instructions instead of the user's.

FAQ

What vulnerability did Microsoft discover in Claude Code GitHub Action?

Microsoft researchers found that Anthropic's Claude Code GitHub Action could be manipulated through prompt injection attacks hidden in GitHub issues, pull requests, or comments. The vulnerability allowed attackers to expose credentials stored in software development pipelines by tricking the AI agent into accessing sensitive files and exfiltrating the information through issue comments, workflow logs, web requests, or shell commands.

When did Anthropic patch the Claude Code vulnerability?

Anthropic patched the vulnerability on May 5 with Claude Code version 2.1.128 after Microsoft disclosed the issue through HackerOne on April 29. The patch addressed the prompt injection attack vector that allowed manipulation of the AI agent in CI/CD workflows.

Why are AI coding agents vulnerable to prompt injection attacks?

Microsoft warned that AI coding agents running inside CI/CD workflows create new security risks because those environments often have access to API keys, cloud credentials, and other sensitive information. Prompt injection attacks exploit the fact that natural language can function as executable code, allowing attackers to hide malicious instructions in content that the AI agent processes during code review tasks.

View Source

Disclaimer: The information on this page may come from third-party sources and is for reference only. It does not represent the views or opinions of Gate and does not constitute any financial, investment, or legal advice. Virtual asset trading involves high risk. Please do not rely solely on the information on this page when making decisions. For details, see the Disclaimer.