Feature/drel 817 migrate mcp from ipfc cids to npm pkg names #195
base: main
Conversation
…ild usage, as npx will include npm and the final pkg won't have the whole pnpm and nx ecosystem. Added an installed-pkgs cache to avoid repeating the installation on subsequent MCP connections
…-ipfc-cids-to-npm-pkg-names # Conflicts: # pnpm-lock.yaml
… mcp server creation
```ts
    ],
  };
};
return `${toolName}/${vincentAppDef.name}/${vincentAppDef.version}`;
```
The reason for this was that the LLM client can have multiple MCP servers with the same tools. That can cause issues, as the LLM might not handle the duplicates well; there is no standard for that yet.
A likely scenario is having complementary apps that share a tool (an ERC20 approval, for example, shared by an Aave server and a Uniswap server). Multiple versions of the same app could also happen, bypassing the restriction that you can only have one (using two different users/JWTs you could). See the sketch below.
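A minimal sketch of that namespacing, with the `VincentAppDef` shape assumed for illustration:

```ts
// Sketch of the collision-avoidance naming; the VincentAppDef shape is assumed.
type VincentAppDef = { name: string; version: number };

function namespacedToolName(toolName: string, app: VincentAppDef): string {
  // Two servers exposing the same underlying tool register distinct names,
  // e.g. 'erc20-approval/aave-agent/2' vs 'erc20-approval/uniswap-agent/1'.
  return `${toolName}/${app.name}/${app.version}`;
}
```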
```ts
args: ZodRawShape,
extra: RequestHandlerExtra<ServerRequest, ServerNotification>
): Promise<CallToolResult> => {
  const precheckResult = await toolClient.precheck(args, {
```
Not really. The first problem is that support for resources in LLM clients is very limited.
And even some clients that support them want to use ALL resources first, then ALL tools. That is actually how they are meant to be used: they are client-controlled and supposed to expand the context; they are not triggered by the LLM.
If we make precheck a different tool, the LLM might ignore it or not know how to use it. It must be told that it has to execute it before the actual tool call, which makes LLM guidance much harder and more variable.
From what I investigated, people don't like resources that much for those reasons and commonly implement everything as tools. The few use cases where they really make sense are things like tool-generated outputs or real-time logs that the MCP server truly controls (logs, screenshots, etc.).
What really makes sense to have as resources are the tools to fetch the delegator ETH address and the application info here, but those are critical for telling the LLM who is operating, so I had to choose the option that maximized compatibility.
I think a lot of this confusion comes from people thinking of resources as GET requests the LLM can make. They are client-controlled context expansions, not actions.
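For example, a minimal sketch of the two registration styles with the MCP TypeScript SDK's `McpServer` (the names and the delegator address below are hypothetical placeholders):

```ts
// Sketch of the compatibility trade-off, using the MCP TypeScript SDK.
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';

const server = new McpServer({ name: 'vincent-app', version: '1.0.0' });
const delegatorEthAddress = '0x0000000000000000000000000000000000000000'; // placeholder

// Conceptually this is client-controlled context, i.e. a resource...
server.resource('delegator-info', 'vincent://delegator', async (uri) => ({
  contents: [{ uri: uri.href, text: delegatorEthAddress }],
}));

// ...but because client support for resources is limited, exposing the same
// data as a tool maximizes the chance the LLM actually learns who is operating.
server.tool('get-delegator-address', 'Returns the delegator ETH address', async () => ({
  content: [{ type: 'text' as const, text: delegatorEthAddress }],
}));
```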
packages/libs/mcp-sdk/src/server.ts (outdated)
```ts
const precheckResult = await toolClient.precheck(args, {
  delegatorPkpEthAddress: delegatorPkpEthAddress!,
});
if (!precheckResult.success) {
```
Added a better check to fail the tool in any case, whether there is an `error` or `!success`.
I don't think we should care about the tool output being "correct". If that validation fails, we will never know why, because it will be hidden under a failing-schema error.
On the contrary, passing the raw result to the LLM delegates to it how to respond. The LLM doesn't really care about the shape of the data and can understand it in every case. We would just be hiding information that the LLM could find useful.
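A minimal sketch of that check, assuming the `{ success, error }` result shape from the tool client referenced in this PR:

```ts
// Sketch of the stricter failure handling; the result shape is assumed.
import type { CallToolResult } from '@modelcontextprotocol/sdk/types.js';

function toFailedToolResult(precheckResult: {
  success: boolean;
  error?: unknown;
}): CallToolResult | null {
  // Fail on either signal: an explicit error or a missing success flag.
  if (!precheckResult.success || precheckResult.error) {
    return {
      isError: true,
      // Pass the raw result through; the LLM can interpret it whatever its shape.
      content: [{ type: 'text', text: JSON.stringify(precheckResult) }],
    };
  }
  return null; // caller proceeds to execute()
}
```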
packages/libs/mcp-sdk/src/server.ts (outdated)
```ts
  await this.litNodeClient.connect();
}
// Add available descriptions to each param
toolData.parameters?.forEach((param) => {
```
- Added in the tools themselves. The JSON now overrides them if provided. We cannot guarantee that the tool developer writes those descriptions, and even less that they are useful for the app dev use case.
- Doing that, we would have to mutate the tool, or copy around a bunch of nested properties to reach the params schema and rebuild the tool again when passing it. And we still need the app def for other reasons. Consumers don't really have to copy anything, and if they need to override, they will likely take the time to do it correctly, which is more complex than copying.
…the latter as restrictive as it should be
…to-npm-pkg-names # Conflicts: # pnpm-lock.yaml
…urces for llm clients that support it
Epic PR -- good progress! Added comments and identified some issues in a few places. We should sync up and talk through the result handling, to be sure we're being sane about how we're providing results.
```diff
 - Optional: Copy `.env.example` to `.env` and fill in the values. Use absolute paths for the `VINCENT_APP_JSON_DEFINITION` value. You can also pass them via CLI arguments when calling the script.

-# Writing App definition JSON file
+# Writing App definition overrides in a JSON file
```
Can we make it clear what the behaviour will be if you define a JSON file and an environment var, and they have different values? Does the JSON file override the .env, or the other way around? Or will the MCP server refuse to start because it can't be sure what to do?
```ts
  );
}

const vincentAppVersion = Number(appVersion) || registryApp.activeVersion;
```
If an explicit appVersion is provided rather than using the activeVersion, it is possible that the appVersion exists in the registry's off-chain store -- but is still not published on-chain.
We'll need to check on-chain whether the appVersion is deployed, using the getAppVersion() method exported from vincent-contracts-sdk -- here's an example of it being used in the registry backend to ensure that an appVersion that is in the registry but not yet published on-chain cannot be set as the activeVersion on the app.
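A hypothetical sketch of such a fail-fast check -- the actual getAppVersion() signature and package path may differ from what is assumed here:

```ts
// Hypothetical sketch: params object and package path are assumptions.
import { getAppVersion } from '@lit-protocol/vincent-contracts-sdk';

async function assertAppVersionOnChain(appId: number, appVersion: number): Promise<void> {
  const onChainVersion = await getAppVersion({ appId, version: appVersion });
  if (!onChainVersion) {
    // Fail fast instead of starting an MCP server for an unpublished version.
    throw new Error(
      `App ${appId} version ${appVersion} is in the registry but not published on-chain`,
    );
  }
}
```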
```ts
if (jsonData.id && VINCENT_APP_ID && jsonData.id !== VINCENT_APP_ID) {
  console.warn(
    `The Vincent App Id specified in the environment variable VINCENT_APP_ID (${VINCENT_APP_ID}) does not match the Id in ${VINCENT_APP_JSON_DEFINITION} (${jsonData.id}). Using the Id from the file...`,
  );
}
if (jsonData.version && VINCENT_APP_VERSION && jsonData.version !== VINCENT_APP_VERSION) {
  console.warn(
    `The Vincent App version specified in the environment variable VINCENT_APP_VERSION (${VINCENT_APP_VERSION}) does not match the version in ${VINCENT_APP_JSON_DEFINITION} (${jsonData.version}). Using the version from the file...`,
  );
}
```
🤔 Needing these warnings feels like a code smell -- it's sub-optimal to have two sources of truth for the same thing and in this case we also have tightly coupled data across both contexts 🤔
Given that we require some data to be provided exclusively via env vars to use the MCP server at all, could we reduce the API of the JSON file to only allow augmenting descriptions for tools, and eliminate this ambiguity? I understand we wouldn't put long-form description overrides for tools in the env, but it feels like making the basic MCP server configuration entirely env-based would make this a lot less brittle 🤔

It's worth noting that the VINCENT_DELEGATEE_PRIVATE_KEY _must match the appId_ -- you cannot have multiple apps with the same key. By splitting this up, we're creating a case where someone can update the JSON file configuration to a different app ID than the one their .env file contains a credential for; that JSON file then takes precedence over the env vars they define, and they don't know that things are broken until they try to use a tool with the wrong delegatee key and get a cryptic message from the tool execution.

We can add logic to check for that case -- e.g. query the app data and throw an error if the delegatee key doesn't match the appId -- and we probably should anyway, to give a fail-fast, verbose error. But needing to change app configuration in two different places (environment vars plus a JSON file) and keeping them synchronized is tricky -- we're splitting what is in effect tightly-coupled data across multiple places and asking the dev to keep it in sync across two contexts, when they could just define all app configuration in env and only tool description metadata in an optional JSON file. Then all env vars become required, and all JSON data becomes optional.
WDYT? I think it would simplify the mental model a lot.
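For illustration, a hypothetical overrides file under that model would carry only tool-description metadata -- the shape below is assumed, not the current format:

```json
{
  "tools": {
    "erc20-approval": {
      "description": "Approve ERC20 spending before supplying to Aave",
      "parameters": {
        "amount": "Token amount in wei"
      }
    }
  }
}
```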
```ts
 */
export async function getVincentAppDef(): Promise<VincentAppDef> {
  // Load data from the App definition JSON
  const appJson = VINCENT_APP_JSON_DEFINITION
```
- I assume this is always relative to the directory that the user ran the `npx` command in -- not the directory containing the `.env` file, is that right? 🤔
- We should be using `path.resolve()` to compose this path string so it'll work consistently across POSIX and win32. See the sketch after this list.
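A minimal sketch of that composition, assuming the env var is read as shown elsewhere in this PR:

```ts
// Resolve relative to the process CWD (where the user ran `npx`),
// not the directory containing the .env file; works on POSIX and win32.
import path from 'node:path';

const appJsonPath = path.resolve(
  process.cwd(),
  process.env.VINCENT_APP_JSON_DEFINITION ?? '',
);
```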
```ts
 *
 * @hidden
 */
export type BundledVincentTool = _BundledVincentTool<
```
Just curious why we couldn't use the BundledVincentTool type that is exported from the tool-sdk directly -- documenting the reason it's duplicated here in a comment would be good for future reference.
```ts
args: ZodRawShape,
extra: RequestHandlerExtra<ServerRequest, ServerNotification>
): Promise<CallToolResult> => {
  const precheckResult = await toolClient.precheck(args, {
```
I hear where you're coming from -- but there are a couple of problems with this that I think are pretty fundamental concerns:
- The precheck() method returns a specific payload shape that contains data about the precheck -- depending on the tool and policies, that data may be directly relevant to the caller of the tool; they might use the result of precheck() to actually make decisions about arguments to pass to the execute()
- By nesting the call to precheck() here and ignoring the result unless it's an error, we're robbing the consumer of the context that precheck() responses would provide to them; effectively we've made a first-class concept (precheck for running a tool) an internal detail that the consumer can no longer be aware of unless it fails
There's no point in returning the precheck result after we've run the execute(), and the schema validation in the MCP server would throw an error if we tried to:
https://github.com/modelcontextprotocol/typescript-sdk/blob/400b020c854d31112c8f29a2e280072731ed3d5f/src/server/mcp.ts#L212C55-L212C69
From the LLM consumer's perspective, if the precheck() method isn't a resource, then it could be considered its own tool -- we could add an annotation that indicates it's a 'pure tool', since the MCP server supports annotations for that purpose. The "precheck tool" informs the LLM about whether it can call the tool and expect it to work, and the precheck response also provides context on how it might call the other tool -- the VincentTool's execute() method.
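A sketch of that idea, using the MCP TypeScript SDK's registerTool() with an annotation; all other names here (toolName, paramsShape, toolClient, ctx) are hypothetical:

```ts
// Sketch of 'precheck as its own tool'; registerTool() and readOnlyHint
// come from the MCP TypeScript SDK, the surrounding names are assumed.
server.registerTool(
  `precheck-${toolName}`,
  {
    description:
      'Read-only precheck: call this before the main tool to learn whether ' +
      'the call can succeed and how to shape its arguments.',
    inputSchema: paramsShape, // same zod raw shape as the tool itself
    annotations: { readOnlyHint: true }, // marks it as a "pure" tool
  },
  async (args) => ({
    content: [
      { type: 'text' as const, text: JSON.stringify(await toolClient.precheck(args, ctx)) },
    ],
  }),
);
```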
packages/libs/mcp-sdk/src/server.ts (outdated)
```ts
  await this.litNodeClient.connect();
}
// Add available descriptions to each param
toolData.parameters?.forEach((param) => {
```
Re #2 -- we are mutating the tool that was passed in here right now -- that's my point. If I have a VincentTool object with the tool developer's schemas on it, and I pass it into this server, then suddenly all of the describe()s for that BundledVincentTool's schema objects will be changed -- we're assigning back into the argument provided.
If anything is going to mutate the schemas on the vincentTool (we shouldn't, IMO), we should have that happen as far 'up' the stack as possible.

Since the server app is really the only environment that can reasonably be expected to own all of its tool-definition data, mutating tools it has loaded is one thing -- but this mcp-sdk code could be running in a context where it is both running an MCP server and using that tool for other purposes. Having the tool be mutated by passing it into the MCP server constructor is not intuitive and could cause weird, hard-to-understand bugs.

Overwriting descriptions using JSON files on the filesystem is a very specific concern for 'I have a command-line npx server and need to inject descriptions' -- the code in the mcp-sdk is just plain JS code that isn't opinionated about loading all of its context from the filesystem or the registry. We need the SDK itself to be decoupled from the strong opinions that the "just run npx" app has imposed on the design of the code in that app.
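A minimal non-mutating alternative: zod's .describe() returns a new schema instance, so an overridden shape can be derived without touching the tool developer's originals (the helper name is hypothetical):

```ts
// Derive a new params shape with overridden descriptions instead of
// assigning into the schemas on the BundledVincentTool we were passed.
import type { ZodRawShape } from 'zod';

function withDescriptionOverrides(
  shape: ZodRawShape,
  overrides: Record<string, string>,
): ZodRawShape {
  return Object.fromEntries(
    Object.entries(shape).map(([param, schema]) => [
      param,
      // .describe() returns a fresh schema; the original is left untouched.
      overrides[param] ? schema.describe(overrides[param]) : schema,
    ]),
  );
}
```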
packages/libs/mcp-sdk/src/server.ts (outdated)
```ts
const precheckResult = await toolClient.precheck(args, {
  delegatorPkpEthAddress: delegatorPkpEthAddress!,
});
if (!precheckResult.success) {
```
Not sure it should ever be OK for us not to worry about whether the tool output is correct -- the MCP framework is strongly opinionated about this, and the LLM is instructed in what shape of responses to expect, and what they mean, by the JSONSchema generated from the outputSchema. If we just return whatever we feel like, we're providing garbage output that doesn't match the world view we gave the LLM by providing structured data inputs and outputs for the tool -- how it reacts at that point is unpredictable.

Also, Vincent tools always return structured data that is passed through their response Zod schemas. Responses that don't match the result schema are transformed into success:false with a Zod error in the result 🤔
- Critically important here: if the precheck() or execute() methods throw an error, or if the tool has no `failSchema` and the tool author calls `fail(<string>)` in the execute() or precheck() methods, then `success` will be `false`, `result` will be empty, and `error` will have a string indicating what went wrong. Right now we'd be throwing an error with a stringified `undefined`.
- Unless I am missing something, you can't have an outputSchema but just return `content` -- it'll be converted into an error by the MCP server. That's why I linked to it here:
https://github.com/modelcontextprotocol/typescript-sdk/blob/400b020c854d31112c8f29a2e280072731ed3d5f/src/server/mcp.ts#L203-L224
Ahh -- I just realized -- are we not registering any output schemas at all? So VincentTools won't return structured data, even though we've defined explicit structured results output schemas complete with context for the LLM to understand what the responses mean?
If an LLM calls ListTools, the outputSchema is serialized in JSONSchema form to the LLM, and it is what tells the LLM what the result of the tools can be and gives them description entries for the results -- by not using the schemas we are hiding information from the LLM that it would find useful.
If a VincentTool returns data that doesn't match the outputSchema it will already be converted into a zodSchema error response by the VincentTool wrapper code -- we validate inputs and responses using the same schemas inside the tool execution environment for both precheck and execute() and don't return malformed data back to the caller -- any badly formed parameter inputs or lifecycle method responses from a tool will never include anything other than a "schema parsing failed" error result. 🤔
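For reference, a sketch of what registering the schemas could look like with the MCP TypeScript SDK's registerTool() -- the surrounding names (namespacedToolName, paramsShape, resultShape, handler) are hypothetical:

```ts
// Register both schemas so ListTools advertises them to the LLM.
server.registerTool(
  namespacedToolName,
  {
    description: toolDescription,
    inputSchema: paramsShape,
    // Serialized to JSONSchema in ListTools responses, so the LLM learns
    // what the structured result fields mean before calling the tool.
    outputSchema: resultShape,
  },
  handler,
);
```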
We should probably sync up and talk this through -- handling of inputs and outputs and how the MCP server and vincent tools interact is probably the most critical part of the entire MCP implementation.
Description
This PR changes how the Vincent MCP server gets built. Now the JSON app definition file is completely optional and only used for overrides. The main source of truth is the Vincent Registry, from which the tools' npm packages are fetched.
The tools npm pkgs are installed on MCP server startup and dynamically imported.
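A minimal sketch of that install-then-import flow, including the installed-pkgs cache mentioned in the commits (helper and variable names are hypothetical):

```ts
// Install a tool package at startup, then dynamically import it.
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const exec = promisify(execFile);
const installedPkgs = new Set<string>(); // skip reinstalls on later MCP connections

async function loadToolPackage(pkgName: string, version: string) {
  const spec = `${pkgName}@${version}`;
  if (!installedPkgs.has(spec)) {
    await exec('npm', ['install', '--no-save', spec]); // npx environments ship with npm
    installedPkgs.add(spec);
  }
  return import(pkgName); // dynamic import of the freshly installed tool package
}
```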
Benefits:

- Run the MCP server with a single `npx` command, just passing it the app id (optionally also the version; it will default to the latest from the registry) and the delegatee private key
- Proper `VincentToolClient` usage with `precheck` and `execute` calls. Only built on the MCP app side; the SDK does not require that stuff anymore

Type of change
How Has This Been Tested?
- `npx` commands

All was done using a local instance of the registry
Checklist:
- … (`nx release plan`) describing my changes and the version bump