The Security Risks of MCP Servers


The backbone of modern AI systems is the Large Language Model (LLM). In isolation, however, LLMs have little utility beyond conversation. This changed with the introduction of the Model Context Protocol (MCP), which establishes a standard communication protocol for AI systems to connect with external systems. MCP has dramatically increased the capabilities of LLMs by allowing them to interface with virtually any software. With MCP servers, LLMs are no longer conversational chatbots confined to their training data, but full-fledged agents capable of acting autonomously on behalf of users. These MCP-enabled agents now dominate the AI and software landscape.

For security practitioners, MCP presents a new attack surface: the non-deterministic nature of AI does not conform to the security standards and best practices implemented in traditional software systems. MCP sits at the intersection of AI and traditional software, making it particularly challenging to secure. Security experts need to apply both traditional software security controls and new, still-developing AI security controls. A common pitfall, however, is failing to consider the threats that emerge from the novel ways AI and traditional software now interact.

MCP Data Flow


Figure 1 - Diagram showing a typical architecture of a system using an MCP server.

MCP servers function as a bridge between LLMs and external systems. At a high level, an AI application is first pre-configured with the ability to use trusted MCP servers. Upon connecting, the MCP Server provides the application with the list of available Tools, descriptions of the Tools, and instructions for how and when to use them. The application’s underlying LLM serves as the decision maker, deciding when a task will benefit from an MCP Tool call. When the LLM decides to use an MCP Tool, it outputs a Tool request for the MCP Client to send to the MCP Server. The MCP Server authenticates and authorizes the request, executes the Tool, interacting with external services as needed, and returns the final response to the MCP Client. Lastly, the LLM uses the Tool response to continue its task.
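
The exchange described above can be sketched as a pair of JSON-RPC 2.0 messages, which is the message format MCP uses. This is a minimal in-memory illustration rather than a real client or server: the get_weather tool, the handle_request and client_call helpers, and the canned response are all hypothetical, and the transport, initialization handshake, authentication, and authorization steps are deliberately left out.

```python
import json

# Hypothetical tool registry; the tool name, schema, and handler are illustrative only.
TOOLS = {
    "get_weather": {
        "description": "Return a canned weather report for a city.",
        "inputSchema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "handler": lambda args: f"Sunny in {args['city']}",
    }
}


def _error(request_id, code: int, message: str) -> str:
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    )


def handle_request(raw: str) -> str:
    """Toy MCP-style server: dispatch a single JSON-RPC 2.0 request string."""
    req = json.loads(raw)
    method, params = req["method"], req.get("params", {})
    if method == "tools/list":
        # Advertise each tool's name, description, and input schema.
        result = {
            "tools": [
                {
                    "name": name,
                    "description": tool["description"],
                    "inputSchema": tool["inputSchema"],
                }
                for name, tool in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(req["id"], -32602, "Unknown tool")
        text = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return _error(req["id"], -32601, "Method not found")
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


def client_call(request_id: int, method: str, params=None) -> dict:
    """Toy MCP-style client: serialize a request, 'send' it, parse the response."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        request["params"] = params
    return json.loads(handle_request(json.dumps(request)))
```

In this sketch, client_call(1, "tools/list") plays the role of the discovery step, and client_call(2, "tools/call", {"name": "get_weather", "arguments": {"city": "Oslo"}}) plays the role of the Tool request that the LLM asks the MCP Client to send.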

For the finer details, the MCP specification can be found here: https://modelcontextprotocol.io/specification

Threats and Mitigations

The end-to-end MCP client-server data flow presents a unique threat model. When threat modeling an AI system using an MCP Server, it is crucial to consider the non-deterministic nature of AI. Below we consider the threats and mitigations of a basic system with two components: an AI Application communicating with an MCP Server.

AI Application

Threat

Mitigations

Indirect Prompt Injection
The AI model inadvertently treats data returned by the MCP server as instructions, which an attacker exploits to hijack the model

  • Prompt engineering: Ensure the system prompt instructs the model to not treat data as instructions
  • AI guardrails: Architect the system with safety layers that continuously monitor and verify the AI model’s inputs and outputs

Direct Prompt Injection
The MCP server is compromised and an attacker causes it to respond with malicious instructions, hijacking the AI model

Unintended Tool Execution
AI model calls an MCP tool without the user’s intent or knowledge (e.g. a dangerous mutating tool)

  • Human-in-the-loop: Require human approval for dangerous tool calls
  • Tool allowlisting: Restrict the MCP tools that the LLM is allowed to call
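
The two mitigations above can be combined into a single gate that runs before any tool request leaves the AI application. The following is a minimal sketch under stated assumptions: the tool names, the ALLOWED_TOOLS and DANGEROUS_TOOLS sets, and the approve callback (which stands in for whatever confirmation prompt the application shows the user) are all hypothetical.

```python
from typing import Callable

# Hypothetical policy: which tools the LLM may call at all, and which of
# those mutate state and therefore need explicit human approval.
ALLOWED_TOOLS = {"read_ticket", "delete_ticket"}
DANGEROUS_TOOLS = {"delete_ticket"}


class ToolCallRejected(Exception):
    """Raised when a tool request must not be forwarded to the MCP server."""


def gate_tool_call(
    name: str,
    arguments: dict,
    approve: Callable[[str, dict], bool],
) -> None:
    """Raise ToolCallRejected unless the call passes both checks."""
    # Tool allowlisting: anything not explicitly listed is refused.
    if name not in ALLOWED_TOOLS:
        raise ToolCallRejected(f"tool {name!r} is not on the allowlist")
    # Human-in-the-loop: dangerous tools need a positive answer from the user.
    if name in DANGEROUS_TOOLS and not approve(name, arguments):
        raise ToolCallRejected(f"tool {name!r} was not approved by the user")
```

The gate is deny-by-default: a tool the policy does not know about is refused rather than forwarded, so a newly advertised tool cannot be called until someone adds it to the allowlist.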

Insecure MCP Response Handling
A malicious MCP response flows into unsafe sinks (e.g. command executions, filesystem writes, network calls)

  • Input validation: Restrict allowed inputs to only what is expected
  • Input sanitization: Escape or encode characters that have special meaning in the target sink, like “&”
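
As one illustration of both bullets, suppose an MCP response supplies a filename that the application later uses in a filesystem write or a shell command. The sketch below assumes a hypothetical policy (short alphanumeric names ending in .txt, written to a single output directory); the helper names are illustrative and a real application would choose rules that match its own sinks.

```python
import re
import shlex
from pathlib import Path

# Hypothetical allow-pattern: short names made of letters, digits, "_" or "-",
# with a fixed .txt extension.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}\.txt")


def safe_output_path(base_dir: Path, filename: str) -> Path:
    """Validate a filename taken from an MCP response before writing to it."""
    # Input validation: accept only what is expected, reject everything else.
    if not _SAFE_NAME.fullmatch(filename):
        raise ValueError(f"unexpected filename: {filename!r}")
    base = base_dir.resolve()
    candidate = (base / filename).resolve()
    # Second check: the resolved path must still sit directly inside base_dir.
    if candidate.parent != base:
        raise ValueError("path escapes the output directory")
    return candidate


def quote_for_shell(value: str) -> str:
    """Input sanitization: neutralize shell metacharacters such as '&'."""
    return shlex.quote(value)
```

Where possible, passing arguments as a list to subprocess.run without a shell avoids the need for quoting altogether; quoting is the fallback when a shell string cannot be avoided.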

Context Window Poisoning
An MCP response poisons the AI agent’s memory or the other sources that feed its context

  • HMAC signing: Sign each stored entry and verify the signature before use
  • Distrust MCP responses: Avoid storing data from MCP responses
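
A minimal sketch of the HMAC signing mitigation, using Python's standard hmac module. The entry layout and function names are assumptions for illustration, and the signing key is expected to come from a secrets manager rather than from source code.

```python
import hashlib
import hmac
import json


def sign_entry(key: bytes, entry: dict) -> str:
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding."""
    # sort_keys and fixed separators make the encoding deterministic, so the
    # same entry always produces the same signature.
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_entry(key: bytes, entry: dict, signature: str) -> bool:
    """Check an entry against its stored signature in constant time."""
    return hmac.compare_digest(sign_entry(key, entry), signature)
```

A valid signature only shows that an entry has not been altered since the application signed it. It says nothing about whether the content was benign when it was stored, which is why the second bullet still matters.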

MCP Server Spoofing
The application unknowingly communicates with an attacker-controlled MCP server

  • mTLS: Require Mutual TLS with short-lived certificates to authenticate the server
  • Network segmentation: Restrict allowed network communications to only trusted destinations

MCP Server

Threat 

Mitigations 

Confused Deputy
An attacker manipulates the MCP Server to perform actions using the server’s elevated privileges rather than their own limited privileges

  • Client authorization: Authorize each client individually and separate from the server’s identity
  • Continuous verification: Re-verify the client’s authorization for every request and at each step of the delegation chain
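
One way to picture these two mitigations is a server that checks the calling client's own scopes on every request, instead of relying on whatever the server's service account can do. The sketch below is illustrative only: the scope names, the REQUIRED_SCOPE mapping, and the ClientIdentity type are hypothetical, and a real server would derive the identity from a verified token.

```python
from dataclasses import dataclass

# Hypothetical mapping from tool name to the scope a client must hold.
REQUIRED_SCOPE = {
    "read_report": "reports:read",
    "delete_report": "reports:write",
}


@dataclass(frozen=True)
class ClientIdentity:
    """The caller's own identity and scopes, separate from the server's."""
    client_id: str
    scopes: frozenset


def authorize_call(client: ClientIdentity, tool_name: str) -> None:
    """Raise PermissionError unless this client may call this tool."""
    required = REQUIRED_SCOPE.get(tool_name)
    # Unknown tools are denied, as are clients lacking the required scope.
    if required is None or required not in client.scopes:
        raise PermissionError(f"{client.client_id} may not call {tool_name}")


def execute_tool(client: ClientIdentity, tool_name: str) -> str:
    """Run a tool only after re-checking authorization for this request."""
    authorize_call(client, tool_name)  # repeated on every call, never cached
    return f"ran {tool_name} for {client.client_id}"
```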

Excessive Capabilities
An MCP Tool accesses resources or performs operations on behalf of the user that the user lacks permission to perform on their own

  • Strict authorization: Execute code only within the user’s own authorization context
  • Tool authorization: Restrict the advertised tools list based on the user’s permission level
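
The tool authorization bullet can be sketched as a filter applied when the server builds its advertised tools list. The role names, their ordering, and the TOOL_MIN_ROLE mapping below are assumptions made for the example.

```python
# Hypothetical role hierarchy and per-tool minimum role.
ROLE_RANK = {"viewer": 0, "editor": 1, "admin": 2}
TOOL_MIN_ROLE = {
    "search_docs": "viewer",
    "update_doc": "editor",
    "delete_doc": "admin",
}


def advertised_tools(user_role: str) -> list[str]:
    """Return the tool names a user with this role should be shown."""
    rank = ROLE_RANK[user_role]  # an unknown role raises KeyError (deny)
    return sorted(
        name
        for name, min_role in TOOL_MIN_ROLE.items()
        if ROLE_RANK[min_role] <= rank
    )
```

Hiding a tool from the list reduces what the LLM is tempted to call, but it is not an access control by itself. The server still needs to enforce the same permission check when a tool call arrives.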

Improper Access Controls
An attacker gains access to sensitive data or operations due to insufficient access controls

  • Authenticate users: Verify user identity with access tokens like API keys
  • Authorize access: Verify all resource access through an ACL

Insecure Input Handling
Untrusted user input may flow into unsafe sinks and result in a variety of attacks (e.g. SQLi, IDOR, RCE)

  • Input validation: Restrict allowed inputs to only what is expected
  • Input sanitization: Escape or encode characters that have special meaning in the target sink, like “&”
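
For the SQL injection case, a short sketch using Python's built-in sqlite3 module shows validation and a parameterized query working together. The users table, the find_user helper, and the 32-character alphanumeric rule are hypothetical choices for the example.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list:
    """Look up a user by name inside a hypothetical MCP tool handler."""
    # Input validation: accept only short alphanumeric usernames.
    if not username.isalnum() or len(username) > 32:
        raise ValueError("username must be 1-32 alphanumeric characters")
    # Parameterized query: the driver passes the value separately from the
    # SQL text, so it is never interpreted as SQL.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cursor.fetchall()
```

The placeholder alone would stop the injection here; the validation step is kept as a second layer because the same value may later reach a sink that has no equivalent of bound parameters.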

Insecure Configuration
A misconfigured server introduces a variety of vulnerabilities

  • Continuous scanning: Configure periodic scans on the code and server to identify known vulnerabilities
  • Principle of least privilege: Limit the service user’s OS privileges to the minimum needed to function

Recommendations

Many of these threats can be mitigated by following a few rules of thumb:

Assume AI models will attempt to perform destructive actions. AI models are non-deterministic by nature. Beyond threats like prompt injection, AI models can perform dangerous actions even when given benign instructions. A GenAI model’s outputs are inherently unpredictable and should be treated as untrusted and potentially dangerous. Machines running agentic AI should be secured with endpoint and network security solutions such as EDR, DLP, anti-malware, NAC, and IPS.

Avoid returning user-controlled data from the MCP server. Unless necessary, stick to returning only defined system-generated data in MCP responses. This can greatly reduce the attack surface, as including user-controlled data in responses introduces the risk of prompt injection attacks. If user-controlled data is required, sanitize it for GenAI ingestion by, for example, encapsulating it in data tags or applying strict data schemas, so that it is less likely to be treated as instructions.
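
A minimal sketch of the data tag approach follows. The untrusted_data tag name and the wrap_untrusted helper are hypothetical, and the technique lowers the chance that the model treats the payload as instructions without eliminating it.

```python
import json

OPEN_TAG = "<untrusted_data>"    # hypothetical tag name
CLOSE_TAG = "</untrusted_data>"


def wrap_untrusted(text: str) -> str:
    """Encapsulate user-controlled text before returning it in an MCP response."""
    # JSON-encode the text, then replace angle brackets with JSON unicode
    # escapes so the payload cannot contain a literal closing tag.
    encoded = json.dumps(text).replace("<", "\\u003c").replace(">", "\\u003e")
    return f"{OPEN_TAG}{encoded}{CLOSE_TAG}"
```

The system prompt would then need to state that anything inside the tags is data and must never be followed as an instruction. Because the escapes are valid JSON, a consumer can still recover the original text with json.loads.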

Implement a multi-layered defense. The importance of defense-in-depth is amplified in AI systems. Unlike traditional software, AI models are capable of reasoning and problem solving. When given a task, they will often try every means available to complete it. This power becomes a double-edged sword if the AI model is compromised or goes rogue. Fundamental defense-in-depth measures like network segmentation, least privilege, robust access controls, and continuous monitoring are crucial to limiting the blast radius.


More Information

Discover how cyber experts like Philip Salire, Security Engineer and author of this article, can help secure your organization with AI Security Services.
