
Production-ready Model Context Protocol server enabling AI agents to manage local documents with Notion sync capabilities


corneyc/documents-mcp


Documents MCP Server

Advanced Model Context Protocol Document Management System

TypeScript Node.js Notion Cloudflare Workers




Problem Statement

The Challenge

Modern knowledge workers face significant inefficiencies in document management:

  • Fragmented Ecosystems: Documents scattered across local storage, cloud platforms, and collaborative tools
  • Limited AI Integration: Existing document systems don't provide seamless AI agent access
  • Manual Sync Overhead: Constant manual synchronization between local work and cloud storage
  • Context Loss: AI assistants can't access local documents for contextual assistance
  • Version Control Issues: Difficulty tracking changes across multiple platforms

Market Gap

Traditional document management solutions fall short in the emerging AI-first workflow era:

  • No Protocol Standardization: Lack of standardized protocols for AI-document interaction
  • Platform Lock-in: Vendor-specific solutions that don't interoperate
  • Limited Real-time Capabilities: Insufficient support for real-time AI collaboration
  • Security Concerns: Inadequate access control for sensitive local documents

Solution Overview

Documents MCP is a production-ready Model Context Protocol server that bridges local document management with cloud synchronization, specifically designed for AI agent integration.

Core Value Proposition

Unified Interface: Single protocol for AI agents to access both local and cloud documents
Real-time Sync: Bidirectional synchronization with conflict resolution
Enterprise Security: Path validation, access control, and audit trails
Protocol Compliance: Full MCP specification implementation
Rich Metadata: Comprehensive file information and sync status tracking


Use Case Scenarios

Scenario 1: AI-Powered Content Creation

Actor: Content Creator
Goal: Leverage AI assistance while maintaining local document control

Workflow:

  1. Writer maintains drafts locally for version control
  2. AI agent (Claude Desktop) accesses documents via MCP protocol
  3. Real-time suggestions and edits are synchronized to Notion
  4. Team collaboration occurs through cloud interface
  5. Final versions sync back to local storage

Benefits:

  • Maintains local control while enabling cloud collaboration
  • AI assistant has full context of writing projects
  • Seamless version tracking across platforms

Scenario 2: Technical Documentation Management

Actor: Software Development Team
Goal: Maintain synchronized technical documentation across environments

Workflow:

  1. Developers write documentation locally alongside code
  2. MCP server automatically syncs to shared Notion workspace
  3. AI agents help maintain consistency and completeness
  4. Non-technical stakeholders access via Notion interface
  5. Changes propagate bidirectionally with conflict resolution

Benefits:

  • Documentation stays current with codebase
  • Reduces documentation maintenance overhead
  • Enables AI-assisted technical writing

Scenario 3: Research Data Organization

Actor: Academic Researcher
Goal: Organize and analyze research documents with AI assistance

Workflow:

  1. Research files stored locally for security and offline access
  2. AI agent helps categorize and analyze document content
  3. Structured metadata synced to Notion for team visibility
  4. Search and discovery enhanced through AI understanding
  5. Publication-ready documents maintained locally

Benefits:

  • Sensitive research data remains local
  • AI-enhanced document analysis and organization
  • Team collaboration without compromising data security

System Architecture

High-Level Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   AI Agents     │    │  Documents MCP   │    │  Cloud Storage  │
│  (Claude, etc.) │◄──►│     Server       │◄──►│   (Notion)      │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                              │
                              ▼
                       ┌──────────────────┐
                       │ Local File System│
                       └──────────────────┘

Component Architecture

Documents MCP Server
├── MCP Protocol Layer
│   ├── JSON-RPC Handler
│   ├── Tool Registration
│   └── Request/Response Processing
├── Document Management
│   ├── File System Operations
│   ├── Metadata Extraction
│   ├── Search & Indexing
│   └── Security & Validation
├── Sync Engine
│   ├── Notion API Integration
│   ├── Conflict Resolution
│   ├── Status Tracking
│   └── Bidirectional Sync
└── Configuration & Utilities
    ├── Environment Management
    ├── Error Handling
    └── Logging & Monitoring

Design Process

Phase 1: Requirements Analysis

Stakeholder Research:

  • AI application developers needing document access
  • Knowledge workers seeking AI-enhanced workflows
  • Development teams requiring synchronized documentation

Technical Requirements:

  • MCP protocol compliance for AI agent compatibility
  • Secure local file system access with path validation
  • Cloud synchronization with conflict resolution
  • Rich metadata and search capabilities

Phase 2: Protocol Selection

Decision: Model Context Protocol (MCP)

  • Rationale: Emerging standard for AI-system integration
  • Benefits: Future-proofing, standardization, ecosystem compatibility
  • Trade-offs: Early adoption risks, limited tooling

Alternative Considered: REST API

  • Rejected: Less suitable for AI agent integration patterns

Phase 3: Architecture Design

Modular Architecture Principles:

  • Separation of Concerns: Distinct layers for protocol, documents, and sync
  • Extensibility: Plugin architecture for additional cloud providers
  • Testability: Isolated components for unit testing
  • Security: Defense-in-depth with multiple validation layers

Phase 4: Technology Stack Selection

Core Technologies:

  • TypeScript: Type safety, developer experience, ecosystem maturity
  • Node.js: JavaScript ecosystem, npm packages, deployment flexibility
  • Notion API: Rich collaboration features, extensive API capabilities

Development Tools:

  • tsx: TypeScript execution for development and testing
  • dotenv: Environment configuration management
  • MCP SDK: Official protocol implementation libraries

Implementation

Development Stages

Stage 1: Core MCP Protocol Implementation

Objective: Establish basic MCP server functionality

Implementation:

  • JSON-RPC request/response handling
  • Tool registration and discovery
  • Basic document operations (list, read, write)
  • Protocol compliance testing

Challenges Addressed:

  • MCP specification interpretation
  • TypeScript type definitions for protocol
  • Request validation and error handling
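The request handling and validation described in this stage can be illustrated with a minimal JSON-RPC 2.0 dispatcher. This is a sketch only, not the project's implementation: `handleLine`, `isRequest`, and the handler map are hypothetical names, and the real server presumably delegates this work to the MCP SDK. Only the error codes are taken from the JSON-RPC 2.0 specification.

```typescript
// Minimal JSON-RPC 2.0 dispatcher sketch. All names are hypothetical.
type JsonRpcId = string | number | null;

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id?: JsonRpcId;
  method: string;
  params?: unknown;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: JsonRpcId;
  result?: unknown;
  error?: { code: number; message: string };
}

type MethodHandler = (params: unknown) => unknown;

// Error codes defined by the JSON-RPC 2.0 specification.
const PARSE_ERROR = -32700;
const INVALID_REQUEST = -32600;
const METHOD_NOT_FOUND = -32601;
const INTERNAL_ERROR = -32603;

// Structural check: a request needs the version tag and a method name.
function isRequest(value: unknown): value is JsonRpcRequest {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v.jsonrpc === "2.0" && typeof v.method === "string";
}

// Parses one line of input and returns the response to send back.
// Simplification: this sketch always replies, whereas a real server
// must not reply to notifications (requests without an id).
function handleLine(line: string, methods: Map<string, MethodHandler>): JsonRpcResponse {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return { jsonrpc: "2.0", id: null, error: { code: PARSE_ERROR, message: "Parse error" } };
  }
  if (!isRequest(parsed)) {
    return { jsonrpc: "2.0", id: null, error: { code: INVALID_REQUEST, message: "Invalid Request" } };
  }
  const id = parsed.id ?? null;
  const handler = methods.get(parsed.method);
  if (!handler) {
    return { jsonrpc: "2.0", id, error: { code: METHOD_NOT_FOUND, message: `Method not found: ${parsed.method}` } };
  }
  try {
    return { jsonrpc: "2.0", id, result: handler(parsed.params) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { jsonrpc: "2.0", id, error: { code: INTERNAL_ERROR, message } };
  }
}
```

Returning an error object instead of throwing keeps a malformed request from terminating the server process, which matters when the transport is a long-lived stdio stream.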

Stage 2: Enhanced Document Operations

Objective: Add production-ready file management capabilities

Implementation:

  • Rich metadata extraction (size, dates, file types)
  • Path security and validation
  • File type detection and filtering
  • Search functionality (name and content)
  • Directory operations and traversal

Security Measures:

  • Path traversal attack prevention
  • File size limits and validation
  • Type checking and sanitization
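One common way to prevent path traversal is to resolve every client-supplied key against the documents root and reject any result that falls outside it. The sketch below illustrates that technique under stated assumptions: `resolveDocumentPath` is a hypothetical name, not a function confirmed to exist in this repository, and the check does not follow symbolic links.

```typescript
import * as path from "node:path";

// Resolves a client-supplied key against the documents root and throws if
// the result would escape the root. `resolveDocumentPath` is a hypothetical
// helper name used for illustration.
function resolveDocumentPath(root: string, key: string): string {
  if (key.includes("\0")) {
    throw new Error("Invalid path: contains a null byte");
  }
  const absoluteRoot = path.resolve(root);
  const candidate = path.resolve(absoluteRoot, key);
  const relative = path.relative(absoluteRoot, candidate);

  // A relative path that climbs upward, or that is absolute (for example a
  // different drive on Windows), points outside the root.
  const escapesRoot =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapesRoot) {
    throw new Error(`Path escapes documents root: ${key}`);
  }
  // Note: symlinks are not resolved here. A stricter check would compare
  // fs.realpath results for both the root and the candidate.
  return candidate;
}
```

Comparing the relative path rather than checking the raw key for `..` avoids false positives on legitimate file names such as `..notes.md` and also catches absolute keys like `/etc/passwd`.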

Stage 3: Cloud Synchronization

Objective: Implement bidirectional Notion sync with conflict resolution

Implementation:

  • Notion API integration and authentication
  • Database schema design and management
  • Sync status tracking and monitoring
  • Conflict detection and resolution algorithms
  • Error handling and retry logic

Synchronization Logic:

  • Compare local vs. cloud modification timestamps
  • Detect and flag conflicts for user resolution
  • Maintain sync history and audit trails
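The timestamp comparison above can be sketched as a small classification function. This is an illustration under assumptions, not the project's algorithm: the `SyncRecord` shape, the `lastSynced` field, and the state names are all hypothetical. It treats a document as conflicting when both sides changed after the last successful sync.

```typescript
// Hypothetical sync states; the real server may use different names.
type SyncState = "in_sync" | "local_newer" | "remote_newer" | "conflict";

interface SyncRecord {
  localModified: Date;  // modification time of the local file
  remoteModified: Date; // last edit time reported by Notion
  lastSynced: Date;     // time of the last successful sync (assumed to be stored)
}

// Compares both modification times against the last sync time.
function classifySync(record: SyncRecord): SyncState {
  const synced = record.lastSynced.getTime();
  const localChanged = record.localModified.getTime() > synced;
  const remoteChanged = record.remoteModified.getTime() > synced;

  if (localChanged && remoteChanged) return "conflict"; // flag for user resolution
  if (localChanged) return "local_newer";               // push local copy to Notion
  if (remoteChanged) return "remote_newer";             // pull Notion copy to disk
  return "in_sync";
}
```

Comparing against a stored sync time, rather than comparing the two modification times directly, is what makes a conflict distinguishable from an ordinary one-sided change.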

Code Quality and Testing

TypeScript Implementation:

  • Strict type checking enabled
  • Comprehensive interface definitions
  • Generic types for reusability
  • Proper error type handling

Testing Strategy:

  • Manual JSON-RPC protocol testing
  • Integration testing with Notion API
  • File system operation validation
  • Error condition testing

Features

Core Document Operations

  • List Documents: Recursive directory scanning with metadata
  • Read Documents: Content extraction with encoding detection
  • Write Documents: Atomic operations with backup support
  • Search Documents: Full-text search across name and content

Synchronization Capabilities

  • Notion Integration: Bidirectional sync with rich metadata
  • Conflict Resolution: Timestamp-based detection of concurrent changes, flagged for user resolution
  • Sync Status: Real-time monitoring of synchronization state
  • Timestamp Tracking: Modification time comparison and validation

Security & Performance

  • Path Validation: Prevention of directory traversal attacks
  • File Size Limits: Protection against resource exhaustion
  • Type Detection: Safe handling of different file formats
  • Efficient Operations: Optimized for large document collections

Developer Experience

  • Environment Configuration: Flexible setup via environment variables
  • Rich Error Messages: Comprehensive error reporting and debugging
  • TypeScript Support: Full type safety and IntelliSense
  • Protocol Compliance: Adherence to MCP specifications

Installation

Prerequisites

  • Node.js 18+
  • npm or yarn package manager
  • TypeScript 5.0+
  • Active Notion account (for cloud sync features)

Quick Start

# Clone the repository
git clone https://github.com/corneyc/documents-mcp.git
cd documents-mcp

# Install dependencies
npm install

# Configure environment
cp .env.example .env
# Edit .env with your configuration

# Build the project
npm run build

# Start the MCP server
npm run mcp

Docker Installation

# Build Docker image
docker build -t documents-mcp .

# Run container (-i keeps stdin open, which a stdio-based MCP server needs)
docker run -i --rm -v "$(pwd)/documents:/app/documents" \
  -e NOTION_TOKEN=your_token \
  -e NOTION_DATABASE_ID=your_db_id \
  documents-mcp

Configuration

Environment Variables

Create a .env file in the project root:

# Notion Integration (Required for sync features)
NOTION_TOKEN=secret_your_notion_integration_token
NOTION_DATABASE_ID=your_database_uuid

# Document Storage (Optional)
DOCUMENTS_ROOT=/path/to/documents

# Server Configuration (Optional)
LOG_LEVEL=info
MAX_FILE_SIZE=10485760  # 10MB default
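A configuration loader for these variables might look like the sketch below. The variable names and the 10485760-byte default come from the example above. The `ServerConfig` shape, the `./documents` and `info` fallbacks, and the function names are assumptions made for illustration.

```typescript
// Hypothetical configuration shape for illustration.
interface ServerConfig {
  notionToken?: string;
  notionDatabaseId?: string;
  documentsRoot: string;
  logLevel: string;
  maxFileSize: number; // bytes
}

const DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024; // 10485760 bytes

// Reads settings from an environment-like object. Passing the object in
// (instead of reading process.env directly) keeps the function testable.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const rawMax = env.MAX_FILE_SIZE;
  const maxFileSize =
    rawMax === undefined || rawMax.trim() === "" ? DEFAULT_MAX_FILE_SIZE : Number(rawMax);
  if (!Number.isInteger(maxFileSize) || maxFileSize <= 0) {
    throw new Error(`MAX_FILE_SIZE must be a positive integer, got: ${rawMax}`);
  }
  return {
    notionToken: env.NOTION_TOKEN || undefined,
    notionDatabaseId: env.NOTION_DATABASE_ID || undefined,
    documentsRoot: env.DOCUMENTS_ROOT || "./documents", // assumed default
    logLevel: env.LOG_LEVEL || "info",                  // assumed default
    maxFileSize,
  };
}

// Sync features need both Notion settings; everything else works without them.
function isSyncEnabled(config: ServerConfig): boolean {
  return Boolean(config.notionToken && config.notionDatabaseId);
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` after dotenv has loaded the .env file.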

Notion Setup

  1. Create Integration:

    • Create an internal integration in your Notion workspace's integration settings
    • Copy its token into NOTION_TOKEN
  2. Create Database:

    • Create a new database in Notion
    • Add the required properties:
      • Title (title type)
      • Local Path (text type)
      • File Size (number type)
      • Last Modified (date type)
      • Status (select type)
  3. Share Database:

    • Click Share on your database
    • Add your integration
    • Copy the database ID from the URL into NOTION_DATABASE_ID
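To show how the database schema above maps onto the Notion API, the sketch below builds the `properties` object for a page in that database. The property value shapes follow the Notion API's page property format, where a "text" column is expressed as `rich_text`. The `DocumentSyncInfo` interface, the function name, and the `Synced` status value in the usage note are hypothetical.

```typescript
// Hypothetical description of a local document that is about to be synced.
interface DocumentSyncInfo {
  title: string;
  localPath: string;
  fileSize: number; // bytes
  lastModified: Date;
  status: string;   // name of an option in the Status select property
}

// Builds the `properties` payload for a Notion page create or update call,
// keyed by the property names from the database schema above.
function toNotionProperties(doc: DocumentSyncInfo) {
  return {
    "Title": { title: [{ text: { content: doc.title } }] },
    "Local Path": { rich_text: [{ text: { content: doc.localPath } }] },
    "File Size": { number: doc.fileSize },
    "Last Modified": { date: { start: doc.lastModified.toISOString() } },
    "Status": { select: { name: doc.status } },
  };
}
```

The result would be passed as the `properties` field of a page request, alongside a `parent` that references NOTION_DATABASE_ID, for example with a status such as `Synced`. Property names must match the database columns exactly, including capitalization and spaces.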

Usage

Claude Desktop Integration

Configure Claude Desktop to use your MCP server:

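// macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
// Windows: %APPDATA%\Claude\claude_desktop_config.json
// (Remove these comment lines before saving; JSON does not allow comments.)
{
  "mcpServers": {
    "documents-mcp": {
      "command": "/opt/homebrew/bin/npm",
      "args": ["--prefix", "/path/to/documents-mcp", "run", "--silent", "mcp"]
    }
  }
}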

Manual Testing

Test server functionality with JSON-RPC calls:

# List documents
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "list_documents", "arguments": {}}}' | npm run mcp

# Read document
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "read_document", "arguments": {"key": "example.md"}}}' | npm run mcp

# Sync to Notion
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "sync_to_notion", "arguments": {"path": "example.md"}}}' | npm run mcp

API Reference

MCP Tools

list_documents

Lists all documents with metadata filtering options.

Parameters:

  • includeDirectories (boolean): Include directories in results
  • textFilesOnly (boolean): Filter to text files only
  • maxSize (number): Maximum file size in bytes

Response: Array of file metadata objects

read_document

Reads document content with metadata.

Parameters:

  • key (string): Document path to read

Response: Document content and metadata

write_document

Writes content to document with options.

Parameters:

  • key (string): Document path to write
  • content (string): Content to write
  • createBackup (boolean): Create backup before overwrite
  • overwrite (boolean): Allow overwriting existing files

Response: Write operation result and metadata
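The `createBackup` and `overwrite` options suggest a write path along the lines of the sketch below, which writes to a temporary file and then renames it over the target. This is an assumption about how such a tool could work, not the project's code: the function name, the `.bak` suffix, the option defaults, and the result shape are all hypothetical. A rename within one directory is atomic on POSIX file systems, so readers see either the old or the new content.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

interface WriteOptions {
  createBackup?: boolean; // copy the existing file to <name>.bak first
  overwrite?: boolean;    // allow replacing an existing file
}

interface WriteResult {
  path: string;
  bytesWritten: number;
  backupPath: string | null;
}

// Hypothetical helper: writes content via a temp file plus rename.
function writeDocumentAtomic(filePath: string, content: string, options: WriteOptions = {}): WriteResult {
  const { createBackup = false, overwrite = true } = options; // assumed defaults
  const exists = fs.existsSync(filePath);
  if (exists && !overwrite) {
    throw new Error(`Refusing to overwrite existing file: ${filePath}`);
  }

  let backupPath: string | null = null;
  if (exists && createBackup) {
    backupPath = `${filePath}.bak`;
    fs.copyFileSync(filePath, backupPath);
  }

  // The temp file lives next to the target so the rename stays on one file system.
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  const tempPath = `${filePath}.${process.pid}.tmp`;
  fs.writeFileSync(tempPath, content, "utf8");
  fs.renameSync(tempPath, filePath);

  return { path: filePath, bytesWritten: Buffer.byteLength(content, "utf8"), backupPath };
}
```

The `filePath` argument is expected to have already passed path validation against the documents root before this function is called.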

sync_to_notion

Synchronizes local document to Notion.

Parameters:

  • path (string): Local document path to sync

Response: Sync operation result and Notion page ID

get_sync_status

Retrieves sync status for all documents.

Parameters: None

Response: Array of sync status objects

search_documents

Searches documents by name or content.

Parameters:

  • query (string): Search query
  • searchContent (boolean): Search file content
  • caseSensitive (boolean): Case-sensitive search
  • filePattern (string): File name regex pattern

Response: Array of matching documents
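How the four parameters could interact is sketched below over an in-memory list of documents. The parameter names come from the list above. The matching rules are assumptions for illustration: `filePattern` filters on the file name first, the query is matched as a plain substring, and content is searched only when `searchContent` is true.

```typescript
interface SearchableDocument {
  path: string;
  content: string;
}

interface SearchOptions {
  query: string;
  searchContent?: boolean;
  caseSensitive?: boolean;
  filePattern?: string; // regular expression applied to the file name
}

// Hypothetical matcher; the server's real search semantics may differ.
function searchDocuments(docs: SearchableDocument[], options: SearchOptions): SearchableDocument[] {
  const { query, searchContent = false, caseSensitive = false, filePattern } = options;
  const normalize = (s: string) => (caseSensitive ? s : s.toLowerCase());
  const needle = normalize(query);
  // new RegExp throws a SyntaxError on an invalid pattern; callers should
  // surface that as a parameter error.
  const pattern = filePattern ? new RegExp(filePattern) : null;

  return docs.filter((doc) => {
    const name = doc.path.split("/").pop() ?? doc.path;
    if (pattern && !pattern.test(name)) return false;
    if (normalize(name).includes(needle)) return true;
    return searchContent && normalize(doc.content).includes(needle);
  });
}
```

For example, `searchDocuments(docs, { query: "todo", searchContent: true, filePattern: "\\.md$" })` would return Markdown files whose name or content contains "todo" in any letter case.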


Development

Project Structure

documents-mcp/
├── src/
│   ├── mcp-server.ts              # Main MCP server implementation
│   ├── utils/
│   │   ├── local-documents-enhanced.ts  # Enhanced file operations
│   │   ├── local-documents.ts          # Basic file operations
│   │   └── notion-sync.ts             # Notion integration
│   └── types.ts                       # TypeScript definitions
├── documents/                         # Default document storage
├── tests/                            # Test files
├── package.json                      # Node.js configuration
├── tsconfig.json                     # TypeScript configuration
├── wrangler.toml                     # Cloudflare Workers config
└── README.md                         # Project documentation

Development Commands

# Install dependencies
npm install

# Start development server
npm run dev

# Run MCP server
npm run mcp

# Run tests
npm test

# Type checking
npm run type-check

# Build for production
npm run build

Adding New Features

  1. Implement Tool Logic: Add function to appropriate utility module
  2. Register Tool: Add tool definition to MCP server
  3. Add Handler: Handle the new tool name in the CallToolRequestSchema request handler
  4. Update Types: Add TypeScript interfaces
  5. Test Integration: Verify with JSON-RPC calls
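The steps above can be sketched with a small in-memory registry. MCP tool definitions carry a `name`, a `description`, and a JSON Schema `inputSchema`, and tool results return a `content` array. Everything else here is hypothetical: the `count_words` tool, the registry functions, and the dispatch logic stand in for whatever the server does inside its MCP SDK request handlers.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: { type: "object"; properties: Record<string, unknown>; required?: string[] };
}

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

type ToolHandler = (args: Record<string, unknown>) => ToolResult;

const tools = new Map<string, { definition: ToolDefinition; handler: ToolHandler }>();

// Step 2: register the definition so clients can discover the tool.
function registerTool(definition: ToolDefinition, handler: ToolHandler): void {
  tools.set(definition.name, { definition, handler });
}

function listTools(): ToolDefinition[] {
  return Array.from(tools.values()).map((tool) => tool.definition);
}

// Step 3: dispatch a call by tool name, checking required arguments first.
function callTool(name: string, args: Record<string, unknown>): ToolResult {
  const tool = tools.get(name);
  if (!tool) {
    return { content: [{ type: "text", text: `Unknown tool: ${name}` }], isError: true };
  }
  for (const key of tool.definition.inputSchema.required ?? []) {
    if (!(key in args)) {
      return { content: [{ type: "text", text: `Missing argument: ${key}` }], isError: true };
    }
  }
  return tool.handler(args);
}

// Step 1: the tool logic itself (a hypothetical example tool).
registerTool(
  {
    name: "count_words",
    description: "Counts whitespace-separated words in a string",
    inputSchema: { type: "object", properties: { text: { type: "string" } }, required: ["text"] },
  },
  (args) => {
    const text = String(args.text).trim();
    const count = text === "" ? 0 : text.split(/\s+/).length;
    return { content: [{ type: "text", text: String(count) }] };
  },
);
```

Keeping the definition and handler in one registry entry makes it harder for the advertised tool list and the dispatch logic to drift apart as tools are added.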

Testing

Manual Testing

The project includes comprehensive manual testing procedures:

# Test basic protocol functionality
./scripts/test-protocol.sh

# Test document operations
./scripts/test-documents.sh

# Test Notion integration
./scripts/test-notion.sh

# Test error handling
./scripts/test-errors.sh

Integration Testing

Test with actual AI agents:

  1. Configure Claude Desktop with MCP server
  2. Test document listing and reading
  3. Verify Notion synchronization
  4. Test error handling and recovery

Performance Testing

# Test with large document collections
./scripts/test-performance.sh

# Monitor memory usage
npm run monitor

# Benchmark sync operations
npm run benchmark

Deployment

Local Development

# Start server locally
npm run mcp

# Run in development mode with auto-restart
npm run dev

Production Deployment

Docker Deployment

FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
# Install all dependencies: the build step below needs dev dependencies such as the TypeScript compiler
RUN npm ci

COPY src/ ./src/
COPY tsconfig.json ./

RUN npm run build

EXPOSE 3000
CMD ["npm", "run", "mcp"]

Cloudflare Workers

Deploy the companion web API to Cloudflare Workers:

# Deploy to Cloudflare
npm run deploy

# Deploy to specific environment
npm run deploy:staging

Environment Setup

Production environment configuration:

NODE_ENV=production
NOTION_TOKEN=secret_production_token
NOTION_DATABASE_ID=production_database_id
DOCUMENTS_ROOT=/app/documents
LOG_LEVEL=info

Performance Considerations

Optimization Strategies

  • Lazy Loading: Documents loaded only when accessed
  • Caching: In-memory caching of frequently accessed files
  • Batch Operations: Efficient bulk synchronization
  • Connection Pooling: Optimized Notion API connections

Scaling Recommendations

  • Horizontal Scaling: Multiple MCP server instances
  • Load Balancing: Distribute document operations
  • Database Sharding: Separate Notion databases by team/project
  • CDN Integration: Cache static document content

Contributing

Development Process

  1. Fork Repository: Create personal fork of the project
  2. Feature Branch: Create branch for new feature or fix
  3. Implementation: Develop with comprehensive testing
  4. Documentation: Update documentation for changes
  5. Pull Request: Submit PR with detailed description

Code Standards

  • TypeScript: Strict mode enabled with comprehensive typing
  • ESLint: Code quality and consistency enforcement
  • Prettier: Automated code formatting
  • Conventional Commits: Standardized commit messages

Testing Requirements

  • Unit Tests: All utility functions must have unit tests
  • Integration Tests: MCP protocol compliance testing
  • Documentation: All public APIs must be documented

License

MIT License - see LICENSE file for details.


Acknowledgments

  • Anthropic: MCP protocol specification and SDK
  • Notion: Comprehensive API and developer documentation
  • TypeScript Team: Excellent tooling and type system
  • Node.js Community: Rich ecosystem and package availability

Support


Roadmap

Short Term (Q1 2025)

  • Google Drive integration
  • Enhanced search with full-text indexing
  • Automated testing suite
  • Performance monitoring dashboard

Medium Term (Q2 2025)

  • Multi-cloud sync support
  • Real-time collaboration features
  • Advanced conflict resolution UI
  • Plugin architecture for extensions

Long Term (Q3-Q4 2025)

  • Enterprise authentication integration
  • Advanced analytics and reporting
  • Mobile app companion
  • AI-powered document insights

Built with love for the AI-powered future of document management
