A Beginner's Guide to Visually Understanding MCP Architecture
The Model Context Protocol (MCP) has unlocked many possibilities for extending large language models, but it has also introduced a fair bit of confusion. So I set out to build a beginner’s guide to what an MCP architecture looks like, visually, charts and all.
If you’re a developer, you’ve probably heard how MCP Servers are powering agentic software development in tools like Cursor, Windsurf, and even VS Code’s new agentic workflows. However, they extend well beyond software development and enable seamless, rich end-user LLM experiences, because they augment the model with knowledge beyond the data it was trained on and leverage its reasoning to drive action.
Understanding MCP
LLMs are essentially a compressed knowledge base of all the data they were trained on. Borrowing loosely from how our brains work, they learn through neural networks, which equips them with the ability to reason and make significant logical jumps that resemble intelligence. As impressive as large language models are, they’re still constrained to their training data.

MCPs are the bridge that connects LLMs to knowledge and action beyond their grasp. Some would say this already existed when OpenAI introduced function calling (also known as tool calling) a while back, but MCP took that concept and extended it into an open communication standard, grounded it in structured interfaces, and enabled distributed, dynamic tool deployment and discovery. More on MCP vs. function calling later.
MCP Hosts, MCP Clients, and MCP Servers
Now that we’ve outlined the purpose of MCPs, we can further unpack the building blocks of MCPs. MCP essentially involves three entities:
The MCP Host: The AI-powered application that integrates with MCPs. Practical examples include Claude Desktop, Cursor, Windsurf, VS Code, and others. These applications integrate with MCPs by implementing an MCP Client.
The MCP Client: The protocol implementation layer, typically embedded in AI applications, agentic frameworks, and similar software. It is the interface that communicates with MCP Servers.
The MCP Server: The part that provides “world” capabilities. The LLM doesn’t have your local files, right? But if an MCP Server makes them available, the LLM can list them, read them, and manipulate them. MCP Servers implement the extended capabilities, knowledge, and actions that the LLM now has access to.

Note: you’ll likely see MCP Hosts referred to simply as MCP Clients, because that term has been rolled up a layer to mean that the MCP Client is effectively the AI-powered application.
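To make the split concrete, here’s a minimal sketch of what an MCP Client does under the hood, using the official TypeScript SDK (@modelcontextprotocol/sdk). The server package and tool names are hypothetical, and the exact API surface may vary between SDK versions:
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// The MCP Host (e.g. an IDE or desktop app) embeds a client like this one.
const client = new Client({ name: "my-host-app", version: "1.0.0" });

// Spawn a local MCP Server process and talk to it over STDIO.
// "my-mcp-server" is a hypothetical server package.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "my-mcp-server"],
});
await client.connect(transport);

// Discover what the server offers, then invoke a tool on the LLM's behalf.
const { tools } = await client.listTools();
const result = await client.callTool({
  name: "read_file",
  arguments: { filepath: "./README.md" },
});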
A brief on function calling and tools
So, MCP Servers most commonly make Tools available to MCP Clients; examples of these tools include “read a file” or “list all branches of a remote git repository”.
How do they do this? They tell the LLM, “Hey, here are the tools I have available for you, one is called read_file”. The LLM can then specifically trigger this tool over the model context protocol, which in turn maps to an actual function implementation in the MCP Server code:
const fs = require("node:fs");

// The "read_file" tool maps to a plain function in the MCP Server code
function tool_read_file(filepath) {
  return fs.readFileSync(filepath, "utf8");
}
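On the server side, registering and advertising such a tool with the TypeScript SDK looks roughly like the following. This is a hedged sketch: the server name is made up for illustration, and exact SDK method names may differ between versions:
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import * as fs from "node:fs";

const server = new McpServer({ name: "file-tools", version: "1.0.0" });

// Advertise a "read_file" tool; the MCP Client relays this tool list to the
// LLM, which can then ask for the tool to be invoked with arguments.
server.tool(
  "read_file",
  { filepath: z.string() },
  async ({ filepath }) => ({
    content: [{ type: "text", text: fs.readFileSync(filepath, "utf8") }],
  })
);

// Communicate with the MCP Client over STDIO (more on transports below).
await server.connect(new StdioServerTransport());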
But tools aren’t new. Tool calling, also known as function calling, was introduced by OpenAI in 2023. Function calling had to be implemented in each AI-powered application and specifically orchestrated for the LLM by that application. More importantly, not all models supported function calling.
We will later discuss how MCPs provide a better holistic solution to function calling.
MCP Server transport types: STDIO vs HTTP
The name MCP Server might initially create the impression that there’s an actual network-based server, but MCPs gained their glory from locally running processes that communicate over the standard input and output (STDIO) transport.
What is STDIO?
STDIO (Standard Input, Output, and Error) is a fundamental text-based communication mechanism for system processes, that is, actual commands or programs, such as running git in your command prompt.
When a program executes, the operating system establishes three default channels: stdin for receiving input (typically keyboard or another program's output), stdout for normal output (usually the terminal or a file), and stderr for error and diagnostic messages (also typically the terminal, but often separated).
In short, STDIO allows programs to interact with the operating system or other programs. For example, you might pipe the output of one command (its stdout) as the input (stdin) to another command. Similarly, you can redirect a program's stdout or stderr to a file for logging or later inspection. Many command-line tools rely heavily on STDIO for data processing pipelines and basic interaction.
In client-server architectures like MCP, STDIO serves as a straightforward transport: the MCP Server is a process that receives requests via stdin and sends text responses back to the MCP Client via stdout.
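As an illustration only (not the actual MCP protocol framing), here is a tiny TypeScript sketch of a process that behaves like a STDIO server: it reads one request per line from stdin and writes one response per line to stdout:
import * as readline from "node:readline";

// Read requests line-by-line from stdin and answer on stdout.
// A real MCP Server exchanges JSON-RPC messages over this same channel.
const rl = readline.createInterface({ input: process.stdin });

rl.on("line", (line) => {
  const response = { received: line, timestamp: Date.now() };
  process.stdout.write(JSON.stringify(response) + "\n");
});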

Local STDIO MCP Servers
MCP Servers that rely on STDIO to communicate depend on a running process and, as such, for the most part imply a locally installed, locally running MCP Server.
And so, many AI-powered applications (aka MCP Hosts) required users to provide MCP Server definitions focused on process execution. Here’s a simple example taken from one of Cursor’s mcp.json MCP Server definitions:
// This example demonstrates an MCP server using the stdio format
// Cursor automatically runs this process for you
// This uses a Node.js server, run with `npx`
{
  "mcpServers": {
    "server-name": {
      "command": "npx",
      "args": ["-y", "mcp-server"],
      "env": {
        "API_KEY": "value"
      }
    }
  }
}
Scaling MCP Servers to the Cloud
Finally, MCP Servers can truly live up to expectations, with the HTTP transport type becoming mainstream in AI applications like Cursor and Claude Desktop, coupled with cloud infrastructure support for hosting MCP Servers from providers like Vercel and Cloudflare.
Originally, the MCP specification introduced a network-based HTTP transport that relied on a mechanism called SSE (Server-Sent Events) to synchronize and orchestrate messages between the LLM and the MCP Server. However, like WebSockets, SSE requires long-running servers, and those aren’t as accessible on modern deployment platforms like Vercel Functions or AWS Lambda. And so, a new Streamable HTTP specification emerged, enabling serverless HTTP for MCP Servers.
HTTP-based MCP Servers make MCPs easy to configure, as they only require a remote host address rather than a complex command-line specification. They also make MCP Servers easier to share and access, because they no longer require a specific development environment to set up and run locally.
As an example, Cursor expects remote MCP Servers over HTTP to be defined simply as follows:
// This example demonstrates an MCP server using the SSE format
// The user should manually set up and run the server
// This could be networked, to allow others to access it too
{
  "mcpServers": {
    "server-name": {
      "url": "http://example.com/sse",
      "env": {
        "API_KEY": "value"
      }
    }
  }
}
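On the client side, switching from a local STDIO server to a remote one is mostly a transport swap. Here’s a hedged sketch using the TypeScript SDK’s Streamable HTTP client transport; the URL is a placeholder, and the exact import path and transport name may differ between SDK versions:
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const client = new Client({ name: "my-host-app", version: "1.0.0" });

// Connect to a remotely hosted MCP Server instead of spawning a local process.
const transport = new StreamableHTTPClientTransport(
  new URL("https://example.com/mcp")
);
await client.connect(transport);

// From here, tool discovery and invocation work exactly as with STDIO.
const { tools } = await client.listTools();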
The MCP architecture - Putting it all together
Now that we’ve outlined the building blocks of MCP, the MCP Clients, MCP Servers, the transport types they use to communicate, and practical examples of each, we can better understand the deployment model and overall architecture of MCPs.
The MCP architecture enables MCPs to be deployed locally and remotely, making many types of configuration and deployment strategies possible. For example, you can have entire MCP-enabled AI applications running locally. Here’s a real-world reference to this: you install a local AI application like the Raycast productivity application or the Cursor IDE. You provide Raycast or Cursor with a locally running LLM via Ollama. The Raycast or Cursor applications run a local MCP Client, to which you configure a locally running MCP Server, such as an SQLite MCP Server, to browse file-based database records.
The above is an example of an AI-powered application that would run entirely in a local setup on a laptop, not requiring any cloud or network-based access.
Alternatively, you can mix and match. You can deploy the MCP Server to cloud infrastructure and make it available over an HTTP transport. The MCP Client and MCP Server communicate over the model context protocol, per the specification, which allows decoupling them entirely.
Even further, while the vast majority of AI-powered applications host the MCP Client implementation as part of the AI application itself, that isn’t strictly necessary; the MCP Client can also be decoupled from the AI application.

MCP vs Function Calling and why MCPs are not API wrappers
Many paint MCPs as “just another REST API wrapper”. That’s a short-sighted and diminishing take, so let’s unravel why MCPs are more nuanced and a better fit for AI applications.
Before we dive into the higher-level MCP vs. REST API mix-up, some developers would like better clarity on how and why MCPs differ from the original function calling enabled by OpenAI and its SDKs since 2023.
MCP vs Function Calling
With function calling, the AI-powered application needed to implement each tool itself. Imagine an AI-powered application like Cursor needing to implement a “list remote branches” git tool. One application, like Cursor, might choose to implement it by running the command git ls-remote, while another AI application, like Windsurf, might decide to rely on the GitHub API. The two approaches would result in entirely different developer experiences, different error handling, and so on.
This disorganization and lack of standardization in the function calling capability led to unnecessary duplication of code and logic across different AI-powered applications.
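To illustrate that hard-wired nature, here is a rough sketch of how an application defines a tool with OpenAI-style function calling via the openai Node SDK. The tool name and model are illustrative; the point is that the schema, the dispatch logic, and the implementation all live inside one application:
import OpenAI from "openai";

const openai = new OpenAI();

// The tool schema is declared inline, per application; there is no shared,
// discoverable server that other AI applications could reuse.
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "List the remote branches of this repo" }],
  tools: [
    {
      type: "function",
      function: {
        name: "list_remote_branches",
        description: "List all branches of a remote git repository",
        parameters: {
          type: "object",
          properties: { repoUrl: { type: "string" } },
          required: ["repoUrl"],
        },
      },
    },
  ],
});

// The application must also implement and dispatch the tool call itself,
// e.g. shell out to `git ls-remote` or call the GitHub API.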
In summary, you can call out the following differentiating values that MCPs have over traditional function calling:
The tools in function calling SDKs were hard-wired to the LLM integration, so inherently:
There is no out-of-the-box scaling, whereas with MCPs, you can deploy the MCP Server and scale it orthogonally to your AI applications.
There’s no clear separation of concerns between the AI application’s implementation of a tool and the rest of the AI application logic. They are one and the same. Technically, you can build wrappers and interfaces between them and expose them over JSON-RPC, but then you’d essentially be re-creating MCPs. That’s what MCPs provide, and more.
There’s no easy way to allow tool extension by users consuming or operating the LLM integration. Because the function calling implementation is part of the LLM integration, it is hard-wired as such, and to allow third-party tool extension, you’d be re-inventing MCPs again.
The hard-wired nature of traditional function calling, which was initially introduced to define tools, also introduces security concerns. The potentially dangerous and sensitive operations that tools need to implement are part of the AI application that provides them. An insecure coding convention or flawed security practice at the tool level would potentially compromise the entire AI application.
Beyond the specification standardization they aim to solve, MCPs provide a sort of IoC (inversion of control) that decouples function calling and extended capabilities into separate services.
Are MCPs just REST API wrappers?
While some may reduce MCPs to traditional REST APIs, doing so misses the complexities involved in building AI applications. Many MCP properties are indeed similar to those of REST APIs: MCPs communicate over HTTP, the underlying protocol format is JSON-RPC, and there’s a client-server model, much like traditional web applications whose purpose is to serve client-side browsers and other API-consuming clients.
However, MCPs are often utilized at a different granularity level than plain REST APIs and require AI-native capabilities that REST APIs don’t have:
MCPs drive use cases, not operations: Think of MCPs as enablers of use cases that aim to solve a problem. REST API endpoints are more often than not a granular representation of entities and their world. MCPs end up creating a TxE (tool x endpoints) matrix in which a single exposed tool implementation maps to several REST API endpoints. For example, an MCP Server that exposes a “get_open_issues” tool for a Git repository would likely need to hit several REST API endpoints: 1) list all open issues; 2) for each open issue reference ID, get all of its data and metadata. MCP Servers are often less granular than a RESTful API (see the sketch after this list).
MCPs trigger actions beyond web-native: While it’s easy to think about everything as web-connected, many incredible MCP demonstrations have been around non-web and non-REST API interactions. For example, the viral Blender MCP demo, which enabled the Claude AI application to create complete 3D scenes, was based on Blender’s Python API. Serving the Blender use-case through a language-native API unlocks the LLM's coding reasoning to achieve better results.
MCPs natively leverage LLMs, REST APIs don’t: The MCP specification defines a capability known as sampling, which allows the MCP Server to query the LLM for a specific operation. Here’s a practical example, building on the earlier Git example, that is possible with MCPs but not with existing APIs: “get open issues with the highest security impact”. How would a REST API satisfy the “highest security impact” requirement? If there’s no existing security-like filter or tag on issues, there’s no way to pull the relevant information. With MCPs, however, a tool implementation can fetch the list of open issues, make a sampling request to the LLM with a prompt like “of the following open issues, return an ordered list of highest security impact in JSON format”, and the LLM can then do what it does best and provide a deeply analyzed result. The MCP Server leveraged an LLM as part of its resolution, which isn’t natively available to an existing REST API implementation. Sampling and two-way LLM communication provide not just similarity or sentiment analysis, but also cater to the LLM’s multi-model strengths and iterative enhancements.
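Here’s the sketch referenced above: a hedged illustration of how a single “get_open_issues” tool might fan out to multiple REST endpoints inside an MCP Server. The endpoints and response shapes are hypothetical, and real issue-tracker APIs may already bundle some of this data:
// Hypothetical REST endpoints behind a single MCP tool.
const API_BASE = "https://git.example.com/api";

interface IssueSummary {
  id: string;
}

// One tool call fans out to several REST calls (the TxE matrix in practice).
async function getOpenIssues(repo: string) {
  // 1) List all open issues for the repository.
  const listResponse = await fetch(`${API_BASE}/repos/${repo}/issues?state=open`);
  const summaries: IssueSummary[] = await listResponse.json();

  // 2) For each open issue, fetch its full data and metadata.
  return Promise.all(
    summaries.map(async (issue) => {
      const detail = await fetch(`${API_BASE}/repos/${repo}/issues/${issue.id}`);
      return detail.json();
    })
  );
}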
MCP Security concerns
MCP also brings security concerns. Fundamentally, the ubiquity of MCP Servers raises supply chain security concerns, such as:
Malicious MCP Servers - How do you establish trust in an MCP Server? What happens if the MCP Server secretly executes code and exfiltrates sensitive information from your local or deployed environment? Could the MCP Server host a backdoor? How do you validate against these security concerns?
Vulnerable MCP Servers - MCP Servers may work as expected, but that doesn’t mean they are free from security vulnerabilities or insecure coding flaws. An insecure coding practice in an MCP Server can be exploited and result in a security compromise or data breach for those consuming it.
These are just some of the security concerns around MCPs. The security risks around MCP and AI extend further, from practical vibe coding security vulnerabilities and the ways LLMs can be weaponized via prompt injection, to ChatGPT coding security risks.
All of these AI security concerns and more call for AI security guardrails and for securing AI-generated code, which Snyk helps mitigate.