November 25th, 2024

Introducing The Model Context Protocol

Anthropic has open-sourced the Model Context Protocol (MCP) to enhance AI assistants' integration with data systems, improving response relevance and enabling developers to create secure connections and build connectors.

Anthropic has announced the open-sourcing of the Model Context Protocol (MCP), a new standard designed to connect AI assistants with various data systems, including content repositories and business tools. The MCP aims to enhance the relevance and quality of responses generated by AI models, which often struggle due to isolation from data sources. By providing a universal protocol, MCP simplifies the integration process, allowing developers to create secure, two-way connections between their data and AI tools. Key components of MCP include the protocol specification, local server support in Claude Desktop apps, and an open-source repository for MCP servers. Early adopters like Block and Apollo have already integrated MCP, while development tool companies are collaborating to improve their platforms. This initiative is expected to streamline the development process, enabling AI systems to maintain context across different tools and datasets. Developers can begin building MCP connectors immediately, with resources available for testing and implementation. The project emphasizes collaboration and community involvement, inviting feedback from AI tool developers and enterprises.

- Anthropic has open-sourced the Model Context Protocol (MCP) to connect AI assistants with data systems.

- MCP aims to improve AI response relevance by providing a universal integration standard.

- Key features include protocol specifications, local server support, and an open-source repository.

- Early adopters and development tool companies are already integrating MCP into their systems.

- Developers can start building MCP connectors and contribute to the open-source project.
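
To make those pieces concrete, here is a minimal sketch of a tool-exposing MCP server built with the open-source Python SDK. The tool and its canned response are invented for illustration, and the exact SDK names may have shifted since the initial release:

    # Minimal MCP server exposing one tool over stdio (sketch, not canonical).
    import asyncio
    import mcp.types as types
    from mcp.server import Server
    from mcp.server.stdio import stdio_server

    app = Server("weather-example")

    @app.list_tools()
    async def list_tools() -> list[types.Tool]:
        # Advertise available tools to the client (e.g. Claude Desktop).
        return [types.Tool(
            name="get_forecast",
            description="Return a canned forecast for a city",
            inputSchema={"type": "object",
                         "properties": {"city": {"type": "string"}},
                         "required": ["city"]},
        )]

    @app.call_tool()
    async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
        # Handle tool invocations that the model issues through the client.
        if name == "get_forecast":
            return [types.TextContent(type="text",
                                      text=f"Sunny in {arguments['city']}")]
        raise ValueError(f"unknown tool: {name}")

    async def main():
        async with stdio_server() as (read_stream, write_stream):
            await app.run(read_stream, write_stream,
                          app.create_initialization_options())

    if __name__ == "__main__":
        asyncio.run(main())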

AI: What people are saying
The open-sourcing of the Model Context Protocol (MCP) by Anthropic has generated a variety of responses from the community.
  • Many developers appreciate the potential for standardization and integration of AI tools, reducing fragmentation in the ecosystem.
  • Concerns have been raised about the complexity of implementation and the lack of clear documentation, making it difficult for some users to understand how to utilize MCP effectively.
  • There is skepticism regarding the protocol's ability to handle authentication and authorization for secure data access.
  • Some users express confusion about how MCP differs from existing solutions like OpenAPI and function calling libraries.
  • Overall, there is a mix of excitement for the possibilities MCP offers and caution about its practical application and adoption in the industry.
73 comments
By @somnium_sn - 5 months
@jspahrsummers and I have been working on this for the last few months at Anthropic. I am happy to answer any questions people might have.
By @ianbutler - 5 months
I’m glad they're pushing for standards here; literally everyone has been writing their own integrations, and the level of fragmentation (as they also mention) and repetition going into building the infra around agents is super high.

We’re building an in-terminal coding agent, and our next step was to connect to external services like Sentry and GitHub, where we would also be making a bespoke integration or using a closed-source provider. We appreciate that they have MCP integrations already for those services. Thanks Anthropic!

By @valtism - 5 months
This is a nice 2-minute video overview of this from Matt Pocock (of Typescript fame) https://www.aihero.dev/anthropics-new-model-context-protocol...
By @jascha_eng - 5 months
Hmm, I like the idea of providing a unified interface for all LLMs to interact with outside data. But I don't really understand why this is local-only. It would be a lot more interesting if I could connect this to my GitHub in the web app and Claude automatically had access to my code repositories.

I guess I can do this for my local file system now?

I also wonder, if I build an LLM-powered app and currently simply do RAG and then inject the retrieved data into my prompts, should this replace it? Can I even integrate this in a useful way?

The use case of running on your machine with your specific data seems very narrow to me right now, considering how many different context sources and use cases there are.

By @xyc - 5 months
Just tried out the puppeteer server example if anyone is interested in seeing a demo: https://x.com/chxy/status/1861302909402861905. (Todo: add tool use - prompt would be like "go to this website and screenshot")

I appreciate the design decision to leave the implementation of servers to the community, which doesn't lock you into any particular implementation, as the protocol seems to be aimed primarily at solving the RPC layer.

One major value-add of MCP, I think, is that it extends capabilities across a vast number of AI apps.

By @bluerooibos - 5 months
Awesome!

In the "Protocol Handshake" section of what's happening under the hood - it would be great to have more info on what's actually happening.

For example, more details on what's actually happening to translate the natural language to a DB query. How much config do I need to do for this to work? What if the queries it makes are inefficient/wrong and my database gets hammered - can I customise them? How do I ensure sensitive data isn't returned in a query?

By @ado__dev - 5 months
You can use MCP with Sourcegraph's Cody as well

https://sourcegraph.com/blog/cody-supports-anthropic-model-c...

By @rahimnathwani - 5 months
In case anyone else is like me and wanted to try the filesystem server before anything else, you may have found the README insufficient.

You need to know:

1. The claude_desktop_config.json needs a top-level mcpServers key (example sketched below), as described here: https://github.com/modelcontextprotocol/servers/pull/46/comm...

2. If you did this correctly then, after you run Claude Desktop, you should see a small 'hammer' icon (with a number next to it) next to the labs icon, in the bottom right of the 'How can Claude help you today?' box.
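
For reference, the config ends up looking roughly like the structure below. The directory path is a placeholder, and the "mcpServers" key and package name follow my reading of the servers repo README, so verify against the PR linked above. The snippet writes the JSON from Python, though editing the file by hand works just as well:

    # Sketch of a claude_desktop_config.json entry for the filesystem server.
    # The real file lives in Claude Desktop's config directory (platform-specific);
    # "/Users/me/Desktop" is a placeholder for whatever directory you want exposed.
    import json, pathlib

    config = {
        "mcpServers": {
            "filesystem": {
                "command": "npx",
                "args": ["-y", "@modelcontextprotocol/server-filesystem",
                         "/Users/me/Desktop"],
            }
        }
    }

    pathlib.Path("claude_desktop_config.json").write_text(json.dumps(config, indent=2))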

By @jvalencia - 5 months
I don't trust an open source solution by a major player unless it's published with other major players. Otherwise, the perverse incentives are too great.
By @_rupertius - 5 months
For those interested, I've been working on something related to this, Web Applets – which is a spec for creating AI-enabled components that can receive actions & respond with state:

https://github.com/unternet-co/web-applets/

By @lukekim - 5 months
The Model Context server is similar to what we've built at Spice [1], but we've focused on databases and data systems. Overall, standards are good. Perhaps we can implement MCP as a data connector and tool.

[1] https://github.com/spiceai/spiceai

By @orliesaurus - 5 months
I would love to integrate this into my platform of tools for AI models, Toolhouse [1], but I would love to understand the adoption of this protocol, especially as it seems to only work with one foundational model.

[1] https://toolhouse.AI

By @sunleash - 5 months
The protocol felt unnecessarily complicated till I saw this

https://modelcontextprotocol.io/docs/concepts/sampling

It's crazy. Sadly not yet implemented in Claude Desktop client.

By @thoughtlede - 5 months
If function calling is sync, is MCP its async counterpart? Is that the gist of what MCP is?

OpenAPI (aka Swagger) based function calling is already the standard for sync calls, and it solves the NxM problem. I'm wondering if the proposed value is that MCP is async.

By @faizshah - 5 months
So it’s basically a standardized plugin format for LLM apps, and that’s why it doesn’t support auth.

It’s basically a standardized way to wrap your OpenAPI client in a standard tool format and then plug it into your locally running AI tool of choice.
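
To illustrate that framing, here is a hedged sketch of wrapping one endpoint of an existing HTTP API as an MCP tool; the endpoint, response shape, and exact SDK names are assumptions, but the point is that the MCP layer is a thin adapter over the API client you already have:

    # Sketch: proxy an existing HTTP endpoint as an MCP tool (hypothetical API).
    # Run it with the same stdio_server boilerplate as the earlier server sketch.
    import json, urllib.request
    import mcp.types as types
    from mcp.server import Server

    app = Server("issues-proxy")

    @app.list_tools()
    async def list_tools() -> list[types.Tool]:
        return [types.Tool(
            name="list_issues",
            description="List open issues for a project",
            inputSchema={"type": "object",
                         "properties": {"project": {"type": "string"}},
                         "required": ["project"]},
        )]

    @app.call_tool()
    async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
        # The same HTTP call your app already makes, re-wrapped as a tool result.
        url = f"https://api.example.com/projects/{arguments['project']}/issues"
        with urllib.request.urlopen(url) as resp:
            body = json.loads(resp.read())
        return [types.TextContent(type="text", text=json.dumps(body, indent=2))]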

By @delegate - 5 months
I appreciate the effort, but after spending more than one hour on it, I still don't understand how and why I'd use this.

The Core architecture [1] documentation is given in terms of TypeScript or Python abstractions, adding a lot of unnecessary syntactic noise for someone who doesn't use these languages. Very thin on actual conceptual explanation and full of irrelevant implementation details.

The 'Your first server'[2] tutorial is given in terms of big chunks of Python code, with no explanation whatsoever, e.g.:

    Add these tool-related handlers:
    ...100 lines of undocumented code...

The code doesn't even compile. I don't think this is ready for prime time yet so I'll move along for now.

[1] https://modelcontextprotocol.io/docs/concepts/architecture [2] https://modelcontextprotocol.io/docs/first-server/python

By @_han - 4 months
This is very interesting. I was surprised by how minimal the quickstart (https://modelcontextprotocol.io/quickstart) was, but a lot of details are hidden in this python package: https://github.com/modelcontextprotocol/servers/tree/main/sr...
By @threecheese - 5 months
WRT prompts vs sampling: why does the Prompts interface exclude model hints that are present in the Sampling interface? Maybe I am misunderstanding.

It appears that clients retrieve prompts from a server to hydrate them with context only, to then execute/complete somewhere else (like Claude Desktop, using Anthropic models). The server doesn’t know how effective the prompt will be in the model that the client has access to. It doesn’t even know if the client is a chat app, or Zed code completion.

In the sampling interface - where the flow is inverted, and the server presents a completion request to the client - it can suggest that the client uses some model type/parameters. This makes sense given only the server knows how to do this effectively.

Given the server doesn’t understand the capabilities of the client, why the asymmetry in these related interfaces?

There’s only one server example that uses prompts (fetch), and the one prompt it provides returns the same output as the tool call, except wrapped in a PromptMessage. EDIT: looks like there are some capabilities classes in the MCP, maybe these will evolve.
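
For readers following along, this is roughly the shape of a sampling request as I read the docs; field names are paraphrased and the values are invented, but it shows the model hints that the prompts interface lacks:

    # Approximate shape of a sampling/createMessage request sent from a
    # server to the client; the client stays free to pick the actual model.
    sampling_request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "sampling/createMessage",
        "params": {
            "messages": [
                {"role": "user",
                 "content": {"type": "text", "text": "Summarize the fetched page"}}
            ],
            "modelPreferences": {
                "hints": [{"name": "claude-3-5-sonnet"}],   # advisory only
                "speedPriority": 0.2,
                "intelligencePriority": 0.8,
            },
            "systemPrompt": "You are a careful summarizer.",
            "maxTokens": 400,
        },
    }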

By @outlore - 5 months
i am curious: why this instead of feeding your LLM an OpenAPI spec?
By @pcwelder - 5 months
It's great! I quickly reorganised my custom gpt repo to build a shell agent using MCP.

https://github.com/rusiaaman/wcgw/blob/main/src/wcgw/client/...

Already getting value out of it.

By @kordlessagain - 5 months
L402's (1) macaroon-based authentication would fit naturally with MCP's server architecture. Since MCP servers already define their capabilities and handle tool-specific requests, adding L402 token validation would be straightforward - the server could check macaroon capabilities before executing tool requests. This could enable per-tool pricing and usage limits while maintaining MCP's clean separation between transport and tool implementation. The Aperture proxy could sit in front of MCP servers to handle the Lightning payment flow, making it relatively simple to monetize existing MCP tool servers without significant modifications to their core functionality.

(1) https://github.com/lightninglabs/aperture

By @lmeyerov - 5 months
Moving from langchain interop to protocol interop for tools is great

Curious:

1. Authentication and authorization are left as a TODO: what is the thinking, as that is necessary for most uses?

2. Ultimately, what does MCP already add, or what will it add, that makes it more relevant than OpenAPI / a pattern on top?

By @pants2 - 5 months
This is awesome. I have an assistant that I develop for my personal use and integrations are the more difficult part - this is a game changer.

Now let's see a similar abstraction on the client side - a unified way of connecting your assistant to Slack, Discord, Telegram, etc.

By @benreesman - 5 months
The default transport should have accommodated binary data. Whether it’s tensors of image data, audio waveforms, or pre-tokenized NLP workloads it’s just going to hit a wall where JSON-RPC can’t express it uniquely and efficiently.
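
To make the overhead concrete, here is a rough illustration (not anything from the spec) of what pushing binary payloads through JSON-RPC implies; base64 inflates the data by roughly a third before any JSON escaping:

    # Binary data has to be base64-encoded to ride inside a JSON-RPC message.
    import base64, json, os

    raw = os.urandom(1_000_000)                  # stand-in for image/audio/tensor bytes
    encoded = base64.b64encode(raw).decode("ascii")
    message = json.dumps({"jsonrpc": "2.0", "method": "tools/call",
                          "params": {"data": encoded}})
    print(len(raw), len(message))                # ~1,000,000 vs ~1,333,000
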
By @gyre007 - 5 months
Something is telling me this _might_ turn out to be a huge deal; I can't quite put a finger on what it is that makes me feel that, but opening private data and tools to AI apps via an open protocol just feels like a game changer.
By @melvinmelih - 5 months
This is great but will be DOA if OpenAI (80% market share) decides to support something else. The industry trend is that everything seems to converge to OpenAI API standard (see also the recent Gemini SDK support for OpenAI API).
By @benopal64 - 5 months
If anyone here has an issue with their Claude Desktop app seeing the new MCP tools you've added to your computer, restart it fully. Restarting the Claude Desktop app did NOT work for me, I had to do a full OS restart.
By @orliesaurus - 5 months
Are there any other Desktop apps other than Claude's supporting this?
By @zokier - 5 months
Does aider benefit from this? A big part of aider's special sauce is the way it builds context, so it feels closely related, but I don't know how the pieces would fit together here.
By @asah - 5 months
How does this work for access-controlled data? I don't see how to pass auth credentials.

Required for

- corporate data sources, e.g. Salesforce

- APIs with key limits and non-trivial costs

- personal data sources e.g. email

It appears that all auth is packed into the MCP config, e.g. a Slack token: https://github.com/modelcontextprotocol/servers/tree/main/sr...
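
As an illustration of that point, a server entry today just carries a static token in its environment block; the key names below follow my reading of the servers README, and the values are placeholders. There is no scoping, expiry, or per-request auth:

    # Sketch of a Slack server entry inside the MCP config (placeholder values).
    slack_server_entry = {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-slack"],
        "env": {"SLACK_BOT_TOKEN": "xoxb-REPLACE-ME", "SLACK_TEAM_ID": "T0000000"},
    }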

By @skybrian - 5 months
I'm wondering if there will be anything that's actually LLM-specific about these API's. Are they useful for ordinary API integration between websites?
By @singularity2001 - 5 months
Tangential question: Is there any LLM which is capable of preserving the context through many sessions, so it doesn't have to upload all my context every time?
By @wolframhempel - 5 months
I'm surprised that there doesn't seem to be a concept of payments or monetization baked into the protocol. I believe there are some major companies to be built around making data and API actions available to AI models, either as an intermediary or marketplace, or for service providers or data owners directly, and they'd all benefit from a standardised payment model at a per-transaction level.
By @yalok - 5 months
A picture is worth a thousand words.

Is there any good arch diagram for one of the examples of how this protocol may be used?

I couldn’t find one easily…

By @serialx - 5 months
Are there any plans to add Well-known URIs[1] as a standard? It would be awesome if we could add services just by inputting the domain names of the services.

[1]: https://en.wikipedia.org/wiki/Well-known_URI

By @orliesaurus - 5 months
How is this different from function calling libraries that frameworks like Langchain or Llamaindex have built?
By @bentiger88 - 5 months
One thing I don't understand: does this rely on vector embeddings? Or how does the AI interact with the data? The example is a SQLite database with prices, and it shows Claude being asked to give the average price and to suggest pricing optimizations.

So does the entire DB get fed into the context? Or is there another layer in between? What if the database is huge, and you want to ask the AI for the most expensive or best-selling items? With RAG that was only vaguely possible and didn't work very well.

Sorry I am a bit new but trying to learn more.

By @m3kw9 - 5 months
So this allows you to connect your SQLite database to Claude Desktop, so it executes SQL commands on your behalf instead of you entering them; it also chooses the right DB on its own, similar to what functions do.
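
As I understand it there are no embeddings involved: the server exposes a query-style tool, the model writes the SQL itself, and only the returned rows enter the context rather than the whole database. A rough sketch of that idea (the tool name and schema are illustrative, not the actual server's):

    # Sketch: a sqlite-backed MCP server where the model composes SQL itself.
    # Run it with the same stdio_server boilerplate as the earlier server sketch.
    import sqlite3
    import mcp.types as types
    from mcp.server import Server

    app = Server("sqlite-example")
    db = sqlite3.connect("products.db")          # hypothetical database file

    @app.list_tools()
    async def list_tools() -> list[types.Tool]:
        return [types.Tool(
            name="query",
            description="Run a SQL query and return the rows",
            inputSchema={"type": "object",
                         "properties": {"sql": {"type": "string"}},
                         "required": ["sql"]},
        )]

    @app.call_tool()
    async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
        # Only the result rows (e.g. an AVG(price) or a TOP-N) go back to the model.
        rows = db.execute(arguments["sql"]).fetchall()
        return [types.TextContent(type="text", text=repr(rows))]
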
By @dr_kretyn - 5 months
It took me about 5 jumps before learning what the protocol actually is, beyond learning that it's something awesome, and community driven, and open source.
By @gjmveloso - 5 months
Let’s see how other relevant players like Meta, Amazon, and Mistral react to this. Things like this just make sense with broader adoption and a diverse governance model.
By @_pdp_ - 5 months
It is clear this is a wrapper around the function calling paradigm but with some extensions that are specific to this implementation. So it is an SDK.
By @johtso - 5 months
Could this be used to voice control an android phone using Tasker functions? Just expose all the functions as actions and then let it rip?
By @hipadev23 - 5 months
Can I point this at my existing private framework and start getting Claude 3.5 code suggestions that utilize our framework it has never seen before?
By @andrewstuart - 5 months
Can someone please give examples of uses for this?
By @wbakst - 5 months
Are there any examples of using this with the Anthropic API to build something like Claude Desktop?

The docs aren't super clear yet w.r.t. how one might actually implement the connection. Do we need to implement another set of tools to provide to the API and then have those tools call the MCP server? Maybe I'm missing something here?
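
One plausible wiring, sketched under the assumption that the Python mcp client and the Anthropic SDK look roughly like this (names approximate, model id just an example): list the server's tools, pass them to the Messages API as tool definitions, and forward any tool_use block back to the MCP server.

    # Sketch: bridging an MCP server's tools into the Anthropic Messages API.
    import asyncio
    from anthropic import Anthropic
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    async def main():
        params = StdioServerParameters(
            command="npx",
            args=["-y", "@modelcontextprotocol/server-filesystem", "."])
        async with stdio_client(params) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                mcp_tools = (await session.list_tools()).tools

                client = Anthropic()   # needs ANTHROPIC_API_KEY in the environment
                response = client.messages.create(
                    model="claude-3-5-sonnet-20241022",
                    max_tokens=1024,
                    tools=[{"name": t.name, "description": t.description,
                            "input_schema": t.inputSchema} for t in mcp_tools],
                    messages=[{"role": "user", "content": "List the files here."}],
                )
                for block in response.content:
                    if block.type == "tool_use":
                        # Route the model's tool call to the MCP server.
                        result = await session.call_tool(block.name, block.input)
                        print(result.content)

    asyncio.run(main())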

By @vkeenan - 5 months
I think the most concise way to describe Anthropic MCP is that it's ODBC for AI.
By @nsiradze - 5 months
This is something new. Good job!
By @gregjw - 5 months
Sensible standards and open protocols. Love to see the industry taking form like this.
By @eichi - 5 months
I eventually return from every blah-blah protocol/framework to SQL, txt files, and the standard library, due to the inefficiency of introducing a meaningless layer. People (myself included, a while ago) often avoid confronting the difficult problems that actually matter. Worse, frameworks and buzzword technologies are the domain of uncompetitive people.
By @Havoc - 5 months
If it gets traction this could be great. Industry sure could do with some standardisation
By @rty32 - 5 months
Is this similar to what Sourcegraph's OpenCtx tries to do?

Has OpenCtx ever gained much traction?

By @alberth - 5 months
Is this basically open source data collectors / data integration connectors?
By @recsv-heredoc - 5 months
Thank you for creating this.
By @prnglsntdrtos - 5 months
Really great to see some standards emerging. I'd love to see something like MindsDB wired up to support this protocol and get a bunch of stuff out of the box.
By @Sudheersandu1 - 5 months
Is the data context aware, as and when we add columns to the DB, of what they mean? How we can make every schema change that happens on the DB context-aware is not clear.
By @rch - 5 months
Strange place for WS* to respawn.
By @ironfootnz - 5 months
LOL, this is basically OpenAI-spec function calls with different semantics.
By @mwkaufma - 5 months
Spyware-As-A-Service
By @bradgessler - 5 months
If you run a SaaS and want to rapidly build out a CLI that you could plug into this ~and~ want something that humans can use, check out the project I’ve been working on at https://terminalwire.com

tl;dr—you can build & ship a CLI without needing an API. Just drop Terminalwire into your server, have your users install the thin client, and you’ve got a CLI.

I’m currently focused on getting the distribution and development experience dialed in, which is why I’m working mostly with Rails deployments at the moment, but I’m open to working with large customers who need to ship a CLI yesterday in any language or runtime.

If you need something like this check it out at https://terminalwire.com or ping me brad@terminalwire.com.

By @juggli - 5 months
Computer Science: There's nothing that can't be solved by adding another layer.
By @killthebuddha - 5 months
I see a good number of comments that seem skeptical or confused about what's going on here or what the value is.

One thing that some people may not realize is that right now there's a MASSIVE amount of effort duplication around developing something that could maybe end up looking like MCP. Everyone building an LLM agent (or pseudo-agent, or whatever) right now is writing a bunch of boilerplate for mapping between message formats, tool specification formats, prompt templating, etc.

Now, having said that, I do feel a little bit like there's a few mistakes being made by Anthropic here. The big one to me is that it seems like they've set the scope too big. For example, why are they shipping standalone clients and servers rather than client/server libraries for all the existing and wildly popular ways to fetch and serve HTTP? When I've seen similar mistakes made (e.g. by LangChain), I assume they're targeting brand new developers who don't realize that they just want to make some HTTP calls.

Another thing that I think adds to the confusion is that, while the boilerplate-ish stuff I mentioned above is annoying, what's REALLY annoying and actually hard is generating a series of contexts using variations of similar prompts in response to errors/anomalies/features detected in generated text. IMO this is how I define "prompt engineering" and it's the actual hard problem we have to solve. By naming the protocol the Model Context Protocol, I assumed they were solving prompt engineering problems (maybe by standardizing common prompting techniques like ReAct, CoT, etc).

By @punkpeye - 5 months
I took time to read everything on Twitter/Reddit/Documentation about this.

I think I have a complete picture.

Here is a quickstart for anyone who is just getting into it.

https://glama.ai/blog/2024-11-25-model-context-protocol-quic...

By @bionhoward - 5 months
I love how they’re pretending to be champions of open source while leaving this gem in their terms of use

“”” You may not access or use, or help another person to access or use, our Services in the following ways: … To develop any products or services that compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models. “””

By @WhatIsDukkha - 5 months
I don't understand the value of this abstraction.

I can see the value of something like DSPy where there is some higher level abstractions in wiring together a system of llms.

But this seems like an abstraction that doesn't really offer much besides "function calling but you use our python code".

I see the value of language server protocol but I don't see the mapping to this piece of code.

That's actually negative value if you are integrating into an existing software system or just you know... exposing functions that you've defined vs remapping functions you've defined into this intermediate abstraction.

By @keybits - 5 months
The Zed editor team collaborated with Anthropic on this, so you can try features of this in Zed as of today: https://zed.dev/blog/mcp
By @ssfrr - 5 months
I'm a little confused as to the fundamental problem statement. It seems like the idea is to create a protocol that can connect arbitrary applications to arbitrary resources, which seems underconstrained as a problem to solve.

This level of generality has been attempted before (e.g. RDF and the semantic web, REST, SOAP) and I'm not sure what's fundamentally different about how this problem is framed that makes it more tractable.

By @benocodes - 5 months
Good thread showing how this works: https://x.com/alexalbert__/status/1861079762506252723