Conversation

Contributor

@FedericoAmura FedericoAmura commented Jul 3, 2025

Description

This PR changes how the Vincent MCP server gets built. The JSON app definition file is now completely optional and only used for overrides. The main source of truth is the Vincent Registry, from which the tool npm packages are fetched.

The tool npm packages are installed on MCP server startup and dynamically imported.
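
A minimal sketch of how that startup install-and-import flow could look (the helper name `loadToolPackage`, the cache shape, and the install flags are illustrative assumptions, not the actual implementation):

```ts
import { execSync } from 'node:child_process';

// Cache of already-installed tool packages so repeated MCP connections don't reinstall them
const installedToolPackages = new Map<string, unknown>();

export async function loadToolPackage(pkgName: string, version = 'latest'): Promise<unknown> {
  const spec = `${pkgName}@${version}`;
  if (installedToolPackages.has(spec)) return installedToolPackages.get(spec);

  // npx environments ship with npm, so a plain `npm install` is enough; no pnpm/nx tooling needed
  execSync(`npm install ${spec} --no-save`, { stdio: 'ignore' });

  // Dynamically import the freshly installed tool package
  const toolModule = await import(pkgName);
  installedToolPackages.set(spec, toolModule);
  return toolModule;
}
```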

Benefits:

  • Any Vincent App in the registry can be run as an MCP server with a single npx command, passing it just the app id (and optionally the version, which defaults to the latest in the registry) and the delegatee private key
  • Any field exposed to the LLM client can be overridden to provide extra info on how it should be used
  • Full VincentToolClient usage with proper precheck and execute calls. This is only built on the MCP app side; the SDK no longer requires it
  • Dynamic dependency installation, so users don't need to pre-install tool packages in the execution environment
  • Separate environment definitions for the stdio and http transports, with their requirements properly enforced (see the sketch after this list)
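
As an illustration of the last point, a minimal sketch of per-transport environment validation (the zod-based approach and any variable names other than the `VINCENT_*` ones that appear elsewhere in this PR are assumptions):

```ts
import { z } from 'zod';

// Required by both transports
const baseEnv = z.object({
  VINCENT_APP_ID: z.string().min(1),
  VINCENT_APP_VERSION: z.string().optional(), // defaults to the registry's latest when omitted
  VINCENT_DELEGATEE_PRIVATE_KEY: z.string().min(1),
});

// The http transport additionally needs a port to listen on
const httpEnv = baseEnv.extend({ PORT: z.coerce.number().default(3000) });

export function loadEnv(transport: 'stdio' | 'http') {
  const schema = transport === 'http' ? httpEnv : baseEnv;
  return schema.parse(process.env); // throws a descriptive error when a required variable is missing
}
```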

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Ran the MCP server locally (both stdio and http)
  • Deployed the MCP package to npm and tested it using npx commands

All of this was done using a local instance of the registry

Checklist:

  • I created a release plan (nx release plan) describing my changes and the version bump
  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@FedericoAmura FedericoAmura self-assigned this Jul 3, 2025
vercel bot commented Jul 3, 2025

The latest updates on your projects.

| Name | Status | Updated (UTC) |
| --- | --- | --- |
| vincent-app-dashboard | ✅ Ready | Jul 14, 2025 9:32am |
| vincent-docs | ✅ Ready | Jul 14, 2025 9:32am |

…ild usage as npx will include npm and the final pkg won't have all the pnpm and nx ecosystem. Added an installed-packages cache to avoid repeating the installation on subsequent MCP connections
],
};
};
return `${toolName}/${vincentAppDef.name}/${vincentAppDef.version}`
Contributor Author

The reason for this was that the LLM client can have multiple MCP servers exposing the same tools. That can cause issues, as the LLM might not handle name collisions well; there is no standard for this yet.

A likely scenario is having complementary apps that share a tool (an ERC20 approval tool, for example, shared by an Aave server and a Uniswap server). Multiple versions of the same app could also occur, bypassing the restriction that you can have only one (not directly, but with two different users/JWTs you could).

args: ZodRawShape,
extra: RequestHandlerExtra<ServerRequest, ServerNotification>
): Promise<CallToolResult> => {
const precheckResult = await toolClient.precheck(args, {
Contributor Author

Not really. The first problem is that support for resources in LLM clients is very limited.
And even among clients that do support them, some want to use ALL resources first, then ALL tools. That is actually how they are meant to be used: they are client-controlled and supposed to expand the context; they are not triggered by the LLM.

If we make precheck a separate tool, the LLM might ignore it or not know how to use it. It must be told to execute it before the actual tool call, which makes LLM guidance much harder and more variable.

From what I investigated, people don't like resources that much for those reasons and commonly implement everything as tools. The few use cases where they really make sense are things like tool-generated outputs or real-time data that the MCP server actually controls (logs, screenshots, etc.).

What would really make sense as resources are the tools that fetch the delegator ETH address and the application info, but those are critical for telling the LLM who is operating, so I had to choose the option that maximized compatibility.

I think a lot of this confusion comes from people thinking of resources as GET requests the LLM can make, when they are really client-controlled context expansions, not actions.

const precheckResult = await toolClient.precheck(args, {
delegatorPkpEthAddress: delegatorPkpEthAddress!,
});
if (!precheckResult.success) {
Contributor Author

Added a better check to fail the tool in either case: when there is an error or when !success.

I don't think we should care about the tool output being "correct". If that validation fails, we will never know why, because it will be hidden under a failing-schema error.
On the contrary, passing the raw result to the LLM delegates to it how to respond. The LLM doesn't really care about the shape of the data and can understand it in every case; we would just be hiding information the LLM can find useful.
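
A minimal sketch of the stricter check described here (the result shape is inferred from the snippets quoted in this thread, so treat it as an assumption):

```ts
type PrecheckLikeResult = { success?: boolean; error?: string; result?: unknown };

// Fail the tool call when precheck reports an error OR does not report success
function assertPrecheckPassed(precheckResult: PrecheckLikeResult): void {
  if (precheckResult.error || !precheckResult.success) {
    throw new Error(
      `Precheck failed: ${precheckResult.error ?? JSON.stringify(precheckResult.result)}`,
    );
  }
}
```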

await this.litNodeClient.connect();
}
// Add available descriptions to each param
toolData.parameters?.forEach((param) => {
Contributor Author

  1. Added in the tools themselves. The JSON now overrides them if provided. We cannot guarantee that the tool developer writes those descriptions, much less that they are useful for the app developer's use case.
  2. Doing that, we would have to mutate the tool, or copy around a bunch of nested properties to reach the params schema and rebuild the tool again when passing it. And we still need the app def for other reasons. Consumers don't really have to copy anything, and if they need to override something, they will likely take the time to do it correctly, which is more involved than copying.

Contributor

@MaximusHaximus MaximusHaximus left a comment

Epic PR -- good progress! Added comments and identified some issues in a few places. We should sync up and talk through the result handling to be sure we're being sane about how we provide results

- Optional: Copy `.env.example` to `.env` and fill in the values. Use absolute paths for the `VINCENT_APP_JSON_DEFINITION` value. You can also pass them via CLI arguments when calling the script.

# Writing App definition JSON file
# Writing App definition overrides in a JSON file
Contributor

Can we make it clear what the behaviour will be if you define a JSON file and an environment var, and they have different values? Does the JSON file override the .env, or the other way around? Or will the MCP server refuse to start because it can't be sure what to do?

);
}

const vincentAppVersion = Number(appVersion) || registryApp.activeVersion;
Contributor

If an explicit appVersion is provided rather than using the activeVersion, it is possible that the appVersion may exist in the registry off-chain store -- but still not be published on-chain.

We'll need to check on-chain to see if the appVersion is deployed, using the getAppVersion() method exported from the vincent-contracts-sdk -- here's an example of it being used in the registry backend to ensure that an appVersion that is in the registry but not yet published on-chain cannot be set as the activeVersion on the app.

Comment on lines +244 to +253
if (jsonData.id && VINCENT_APP_ID && jsonData.id !== VINCENT_APP_ID) {
console.warn(
`The Vincent App Id specified in the environment variable VINCENT_APP_ID (${VINCENT_APP_ID}) does not match the Id in ${VINCENT_APP_JSON_DEFINITION} (${jsonData.id}). Using the Id from the file...`,
);
}
if (jsonData.id && VINCENT_APP_VERSION && jsonData.version !== VINCENT_APP_VERSION) {
console.warn(
`The Vincent App version specified in the environment variable VINCENT_APP_VERSION (${VINCENT_APP_VERSION}) does not match the version in ${VINCENT_APP_JSON_DEFINITION} (${jsonData.version}). Using the version from the file...`,
);
}
Contributor

🤔 Needing these warnings feels like a code smell -- it's sub-optimal to have two sources of truth for the same thing and in this case we also have tightly coupled data across both contexts 🤔

Given that we require some data to be provided exclusively via env vars to use the MCP server at all, could we reduce the API of the JSON file to only allow augmenting tool descriptions, and eliminate this ambiguity? I understand we wouldn't put long-form description overrides for tools in the env, but it feels like making the basic MCP server configuration entirely env-based would make this a lot less brittle 🤔

It's worth noting that the VINCENT_DELEGATEE_PRIVATE_KEY _must match the appId_ -- you cannot have multiple apps with the same key. So by splitting this up we're creating a case where someone can update the JSON file configuration to a different app ID than the one their .env file contains a credential for; that JSON file then takes precedence over the env vars they define, and they don't know that things are broken until they try to use a tool with the wrong delegatee key and get a cryptic message from the tool execution.

We can add logic to check for that case -- e.g. query the app data and throw an error if the delegatee key doesn't match the appId -- and we probably should anyway, to give a fail-fast, verbose error here. But needing to change app configuration in two different places (environment vars plus a JSON file) and having to keep them synchronized is tricky -- we're splitting what is in effect tightly-coupled data across multiple places and asking the dev to keep it in sync across two contexts, when they could just define all app configuration in env and only tool description metadata in an optional JSON file. Then all env vars become required, and all JSON data becomes optional.

WDYT? I think it would simplify the mental model a lot.
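
For illustration, a sketch of the reduced override-file shape being suggested, where the JSON carries only descriptive metadata and all app configuration lives in env vars (field names here are assumptions):

```ts
// Per-tool description metadata; nothing here can conflict with the env configuration
export interface ToolDescriptionOverrides {
  description?: string;                // overrides the tool description shown to the LLM
  parameters?: Record<string, string>; // parameter name -> overridden description
}

// Keyed by tool name; the whole file stays optional
export type VincentAppOverrides = Record<string, ToolDescriptionOverrides>;
```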

*/
export async function getVincentAppDef(): Promise<VincentAppDef> {
// Load data from the App definition JSON
const appJson = VINCENT_APP_JSON_DEFINITION
Contributor

  1. I assume this is always relative to the directory the user ran the npx command in -- not the directory containing the .env file, is that right? 🤔
  2. We should be using path.resolve() to compose this path string so it'll work consistently across POSIX and win32 (sketched below)
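
A minimal sketch of the path.resolve() suggestion, assuming the value should be resolved against the directory the npx command was run from:

```ts
import path from 'node:path';

const { VINCENT_APP_JSON_DEFINITION } = process.env;

// Resolve relative paths against the current working directory so the behaviour
// is consistent across POSIX and win32
const appJsonPath = VINCENT_APP_JSON_DEFINITION
  ? path.resolve(process.cwd(), VINCENT_APP_JSON_DEFINITION)
  : undefined;
```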

*
* @hidden
*/
export type BundledVincentTool = _BundledVincentTool<
Contributor

Just curious why we couldn't use the BundledVincentTool type that is exported from the tool-sdk directly -- a reason why it's duplicated here in this comment would be good for future reference

args: ZodRawShape,
extra: RequestHandlerExtra<ServerRequest, ServerNotification>
): Promise<CallToolResult> => {
const precheckResult = await toolClient.precheck(args, {
Contributor

I hear where you're coming from -- but there are a couple of problems with this that I think are pretty fundamental concerns:

  • The precheck() method returns a specific payload shape that contains data about the precheck -- depending on the tool and policies, that data may be directly relevant to the caller of the tool; they might use the result of precheck() to actually make decisions about arguments to pass to the execute()
  • By nesting the call to precheck() here and ignoring the result unless it's an error, we're robbing the consumer of the context that precheck() responses would provide to them; effectively we've made a first-class concept (precheck for running a tool) an internal detail that the consumer can no longer be aware of unless it fails

There's no point in returning the precheck result after we've run the execute(), and the schema validation in the MCP server would throw an error if we tried to:
https://github.com/modelcontextprotocol/typescript-sdk/blob/400b020c854d31112c8f29a2e280072731ed3d5f/src/server/mcp.ts#L212C55-L212C69

From the LLM consumer's perspective, if the precheck() method isn't a resource, then it could be considered its own tool -- we could add an annotation that indicates it's a 'pure' tool, since the MCP server supports annotations for that purpose. The "precheck tool" informs the LLM whether it can call the tool and expect it to work, and it can also provide context, by way of the precheck response, on how it might call the other tool -- the VincentTool's execute() method.
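
A rough sketch of that "precheck as its own tool" idea against the MCP TypeScript SDK (the tool name, param shape, and the toolClient/delegator values are placeholders for the real objects in this PR; the annotation wiring is left out):

```ts
import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import type { ZodRawShape } from 'zod';

declare const server: McpServer;        // MCP server instance built elsewhere in this PR
declare const toolName: string;         // Vincent tool name
declare const paramsShape: ZodRawShape; // same param shape the execute tool registers
declare const toolClient: {
  precheck: (args: unknown, ctx: { delegatorPkpEthAddress: string }) => Promise<unknown>;
};
declare const delegatorPkpEthAddress: string;

server.tool(
  `${toolName}-precheck`,
  'Runs the Vincent tool precheck and returns its full result so the caller can decide how to call execute',
  paramsShape,
  async (args) => {
    const precheckResult = await toolClient.precheck(args, { delegatorPkpEthAddress });
    // Surface the whole precheck payload instead of swallowing it
    return { content: [{ type: 'text' as const, text: JSON.stringify(precheckResult) }] };
  },
);
```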

await this.litNodeClient.connect();
}
// Add available descriptions to each param
toolData.parameters?.forEach((param) => {
Contributor

Re #2 -- we are mutating the tool that we were passed in here right now - that's my point; if I have a VincentTool object with the tool developer's schemas on it, and I pass it into this server, then suddenly all of the describe()s for that BundledVincentTool's schema objects will be changed -- we're assigning back into the argument provided.

If anything is going to mutate the schemas on the vincentTool (we shouldn't, IMO), we should have that happen as far 'up' the stack as possible.

Since the server app is really the only environment that can reasonably be expected to own all of its tool definition data, mutating tools it has loaded is one thing -- but this MCP SDK code could be running in a context that runs an MCP server and also uses that tool for other purposes. Having the tool be mutated by passing it into the MCP server constructor is not intuitive and could cause weird, hard-to-understand bugs.

Overwriting descriptions using JSON files on the filesystem is a very specific concern for "I have a command-line npx server and need to inject descriptions" -- the code in the mcp-sdk is just plain JS code that isn't opinionated about loading all of its context from the filesystem or the registry. We need the SDK itself to be decoupled from the strong opinions that the "just run npx" use case has imposed on the design of the code in that app.
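
A sketch of one non-mutating alternative, building a new params shape instead of assigning back into the tool that was passed in (the override record and helper name are assumptions):

```ts
import type { ZodRawShape } from 'zod';

// Returns a copy of the shape with overridden descriptions; zod's .describe()
// creates a new schema instance, so the original tool's schemas are untouched
function withDescriptionOverrides(
  shape: ZodRawShape,
  overrides: Record<string, string | undefined>,
): ZodRawShape {
  return Object.fromEntries(
    Object.entries(shape).map(([name, schema]) => [
      name,
      overrides[name] ? schema.describe(overrides[name] as string) : schema,
    ]),
  );
}
```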

const precheckResult = await toolClient.precheck(args, {
delegatorPkpEthAddress: delegatorPkpEthAddress!,
});
if (!precheckResult.success) {
Contributor

Not sure it should ever be OK for us not to worry about whether the tool output is correct -- the MCP framework is strongly opinionated about this, and the LLM is instructed on the shape of the responses it should expect, and what they mean, via the JSONSchema generated from the outputSchema. If we just return whatever we feel like, we're providing garbage output that doesn't match the world view we gave the LLM when we declared structured data inputs and outputs for the tool -- how it reacts at that point is unpredictable. Also, Vincent tools always return structured data that is passed through their response Zod schemas. Responses that don't match the result schema are transformed into success:false with a Zod error in the result 🤔

  • Critically important here: if the precheck() or execute() methods throw an error, or if the tool has no failSchema and the tool author calls fail(<string>) in the execute() or precheck() methods, then success will be false, result will be empty, and error will be a string indicating what went wrong. Right now we'd be throwing an error with a stringified undefined

  • Unless I am missing something, you can't have an outputSchema but just return content -- it'll be converted into an error by the MCP server. That's why I linked to it here:
    https://github.com/modelcontextprotocol/typescript-sdk/blob/400b020c854d31112c8f29a2e280072731ed3d5f/src/server/mcp.ts#L203-L224

Ahh -- I just realized -- are we not registering any output schemas at all? So VincentTools won't return structured data, even though we've defined explicit structured results output schemas complete with context for the LLM to understand what the responses mean?

If an LLM calls ListTools, the outputSchema is serialized in JSONSchema form to the LLM, and it is what tells the LLM what the result of the tools can be and gives them description entries for the results -- by not using the schemas we are hiding information from the LLM that it would find useful.

If a VincentTool returns data that doesn't match the outputSchema it will already be converted into a zodSchema error response by the VincentTool wrapper code -- we validate inputs and responses using the same schemas inside the tool execution environment for both precheck and execute() and don't return malformed data back to the caller -- any badly formed parameter inputs or lifecycle method responses from a tool will never include anything other than a "schema parsing failed" error result. 🤔
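
A rough sketch of what registering the output schema could look like with the MCP TypeScript SDK's registerTool(), so results come back as structured content (the Vincent-specific names here are placeholders, not the real code):

```ts
import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import type { ZodRawShape } from 'zod';

declare const server: McpServer;
declare const paramsShape: ZodRawShape;        // tool input parameters
declare const executeResultShape: ZodRawShape; // derived from the VincentTool result schemas
declare const runTool: (args: Record<string, unknown>) => Promise<Record<string, unknown>>;

server.registerTool(
  'vincent-tool-execute',
  {
    description: 'Executes the Vincent tool and returns its structured result',
    inputSchema: paramsShape,
    // Serialized to JSONSchema on ListTools, so the LLM sees the result fields and their descriptions
    outputSchema: executeResultShape,
  },
  async (args) => {
    const result = await runTool(args);
    return {
      structuredContent: result, // validated against outputSchema by the MCP server
      content: [{ type: 'text' as const, text: JSON.stringify(result) }],
    };
  },
);
```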

We should probably sync up and talk this through -- handling of inputs and outputs and how the MCP server and vincent tools interact is probably the most critical part of the entire MCP implementation.
