- Problem Statement
- Solution Overview
- Use Case Scenarios
- System Architecture
- Design Process
- Implementation
- Features
- Installation
- Configuration
- Usage
- API Reference
- Development
- Testing
- Deployment
- Contributing
- License
Modern knowledge workers face significant inefficiencies in document management:
- Fragmented Ecosystems: Documents scattered across local storage, cloud platforms, and collaborative tools
- Limited AI Integration: Existing document systems don't provide seamless AI agent access
- Manual Sync Overhead: Constant manual synchronization between local work and cloud storage
- Context Loss: AI assistants can't access local documents for contextual assistance
- Version Control Issues: Difficulty tracking changes across multiple platforms
Traditional document management solutions fall short in the emerging AI-first workflow era:
- No Protocol Standardization: Lack of standardized protocols for AI-document interaction
- Platform Lock-in: Vendor-specific solutions that don't interoperate
- Limited Real-time Capabilities: Insufficient support for real-time AI collaboration
- Security Concerns: Inadequate access control for sensitive local documents
Documents MCP is a production-ready Model Context Protocol server that bridges local document management with cloud synchronization, specifically designed for AI agent integration.
- Unified Interface: Single protocol for AI agents to access both local and cloud documents
- Real-time Sync: Bidirectional synchronization with conflict resolution
- Enterprise Security: Path validation, access control, and audit trails
- Protocol Compliance: Full MCP specification implementation
- Rich Metadata: Comprehensive file information and sync status tracking
Actor: Content Creator
Goal: Leverage AI assistance while maintaining local document control
Workflow:
- Writer maintains drafts locally for version control
- AI agent (Claude Desktop) accesses documents via MCP protocol
- Real-time suggestions and edits are synchronized to Notion
- Team collaboration occurs through cloud interface
- Final versions sync back to local storage
Benefits:
- Maintains local control while enabling cloud collaboration
- AI assistant has full context of writing projects
- Seamless version tracking across platforms
Actor: Software Development Team
Goal: Maintain synchronized technical documentation across environments
Workflow:
- Developers write documentation locally alongside code
- MCP server automatically syncs to shared Notion workspace
- AI agents help maintain consistency and completeness
- Non-technical stakeholders access via Notion interface
- Changes propagate bidirectionally with conflict resolution
Benefits:
- Documentation stays current with codebase
- Reduces documentation maintenance overhead
- Enables AI-assisted technical writing
Actor: Academic Researcher
Goal: Organize and analyze research documents with AI assistance
Workflow:
- Research files stored locally for security and offline access
- AI agent helps categorize and analyze document content
- Structured metadata synced to Notion for team visibility
- Search and discovery enhanced through AI understanding
- Publication-ready documents maintained locally
Benefits:
- Sensitive research data remains local
- AI-enhanced document analysis and organization
- Team collaboration without compromising data security
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   AI Agents     │    │  Documents MCP   │    │  Cloud Storage  │
│ (Claude, etc.)  │◄──►│      Server      │◄──►│    (Notion)     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌──────────────────┐
                       │ Local File System│
                       └──────────────────┘
Documents MCP Server
├── MCP Protocol Layer
│ ├── JSON-RPC Handler
│ ├── Tool Registration
│ └── Request/Response Processing
├── Document Management
│ ├── File System Operations
│ ├── Metadata Extraction
│ ├── Search & Indexing
│ └── Security & Validation
├── Sync Engine
│ ├── Notion API Integration
│ ├── Conflict Resolution
│ ├── Status Tracking
│ └── Bidirectional Sync
└── Configuration & Utilities
    ├── Environment Management
    ├── Error Handling
    └── Logging & Monitoring
Stakeholder Research:
- AI application developers needing document access
- Knowledge workers seeking AI-enhanced workflows
- Development teams requiring synchronized documentation
Technical Requirements:
- MCP protocol compliance for AI agent compatibility
- Secure local file system access with path validation
- Cloud synchronization with conflict resolution
- Rich metadata and search capabilities
Decision: Model Context Protocol (MCP)
- Rationale: Emerging standard for AI-system integration
- Benefits: Future-proofing, standardization, ecosystem compatibility
- Trade-offs: Early adoption risks, limited tooling
Alternative Considered: REST API
- Rejected: Less suitable for AI agent integration patterns
Modular Architecture Principles:
- Separation of Concerns: Distinct layers for protocol, documents, and sync
- Extensibility: Plugin architecture for additional cloud providers
- Testability: Isolated components for unit testing
- Security: Defense-in-depth with multiple validation layers
Core Technologies:
- TypeScript: Type safety, developer experience, ecosystem maturity
- Node.js: JavaScript ecosystem, npm packages, deployment flexibility
- Notion API: Rich collaboration features, extensive API capabilities
Development Tools:
- tsx: TypeScript execution for development and testing
- dotenv: Environment configuration management
- MCP SDK: Official protocol implementation libraries
Objective: Establish basic MCP server functionality
Implementation:
- JSON-RPC request/response handling
- Tool registration and discovery
- Basic document operations (list, read, write)
- Protocol compliance testing
Challenges Addressed:
- MCP specification interpretation
- TypeScript type definitions for protocol
- Request validation and error handling
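The request validation and error handling mentioned above can be sketched without any SDK. This is a minimal illustration, not the project's actual handler: the `handleLine` function and the handler map are hypothetical names, while the error codes are the ones defined by the JSON-RPC 2.0 specification.

```typescript
// Minimal JSON-RPC 2.0 validation and dispatch (illustrative sketch only).
type JsonRpcId = string | number | null;

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: JsonRpcId;
  method: string;
  params?: unknown;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: JsonRpcId;
  result?: unknown;
  error?: { code: number; message: string };
}

// Error codes defined by the JSON-RPC 2.0 specification.
const PARSE_ERROR = -32700;
const INVALID_REQUEST = -32600;
const METHOD_NOT_FOUND = -32601;

// Hypothetical handler signature: receives raw params, returns a result.
type Handler = (params: unknown) => unknown;

function isRequest(value: unknown): value is JsonRpcRequest {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v["jsonrpc"] === "2.0" && typeof v["method"] === "string";
}

function errorResponse(id: JsonRpcId, code: number, message: string): JsonRpcResponse {
  return { jsonrpc: "2.0", id, error: { code, message } };
}

function handleLine(line: string, handlers: Map<string, Handler>): JsonRpcResponse {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return errorResponse(null, PARSE_ERROR, "Parse error");
  }
  if (!isRequest(parsed)) {
    return errorResponse(null, INVALID_REQUEST, "Invalid Request");
  }
  const id = parsed.id ?? null;
  const handler = handlers.get(parsed.method);
  if (!handler) {
    return errorResponse(id, METHOD_NOT_FOUND, `Method not found: ${parsed.method}`);
  }
  return { jsonrpc: "2.0", id, result: handler(parsed.params) };
}
```

Keeping parsing, validation, and dispatch in one pure function makes each failure mode easy to exercise with a single input string.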
Objective: Add production-ready file management capabilities
Implementation:
- Rich metadata extraction (size, dates, file types)
- Path security and validation
- File type detection and filtering
- Search functionality (name and content)
- Directory operations and traversal
Security Measures:
- Path traversal attack prevention
- File size limits and validation
- Type checking and sanitization
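One common way to prevent path traversal is to resolve every client-supplied path against the documents root and reject anything that lands outside it. The sketch below assumes that approach; `resolveInsideRoot` is a hypothetical helper, not a function taken from this repository, and it does not follow symbolic links.

```typescript
import * as path from "node:path";

// Resolve a client-supplied key against the documents root and
// throw if the result would escape that root.
function resolveInsideRoot(root: string, key: string): string {
  if (key.includes("\0")) {
    throw new Error("Invalid path: contains a null byte");
  }
  const absoluteRoot = path.resolve(root);
  const candidate = path.resolve(absoluteRoot, key);
  const relative = path.relative(absoluteRoot, candidate);

  // A relative path that is ".." or begins with "../" points outside the root.
  // An absolute result can occur on Windows when the drive letters differ.
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes documents root: ${key}`);
  }
  return candidate;
}
```

Comparing against `".." + path.sep` rather than a bare `".."` prefix keeps legitimate names such as `..notes.md` usable. A stricter variant would also resolve symbolic links (for example with `fs.realpathSync`) before the comparison.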
Objective: Implement bidirectional Notion sync with conflict resolution
Implementation:
- Notion API integration and authentication
- Database schema design and management
- Sync status tracking and monitoring
- Conflict detection and resolution algorithms
- Error handling and retry logic
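As an illustration of the retry logic, the sketch below uses capped exponential backoff. The function names, the default delays, and the `isRetryable` predicate are assumptions made for this example; the project may use a different policy.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Run `operation`, retrying failures that `isRetryable` accepts.
// `sleep` is injectable so callers (and tests) can avoid real waiting.
async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === maxAttempts - 1;
      if (!isRetryable(error) || isLastAttempt) break;
      await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

A caller would typically treat rate-limit and transient network errors as retryable and let validation errors fail immediately.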
Synchronization Logic:
- Compare local vs. cloud modification timestamps
- Detect and flag conflicts for user resolution
- Maintain sync history and audit trails
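The timestamp comparison can be sketched as a small decision function. This version assumes the server also records when each document was last synced, which the list above does not state explicitly; the type and function names are hypothetical.

```typescript
type SyncDecision = "in_sync" | "push_local" | "pull_remote" | "conflict";

interface SyncTimestamps {
  localModified: Date;  // modification time of the local file
  remoteModified: Date; // last edited time reported by the cloud side
  lastSynced: Date;     // assumption: recorded after each successful sync
}

// Decide what to do with one document based on which side changed
// since the last successful sync.
function decideSync(t: SyncTimestamps): SyncDecision {
  const localChanged = t.localModified.getTime() > t.lastSynced.getTime();
  const remoteChanged = t.remoteModified.getTime() > t.lastSynced.getTime();

  if (localChanged && remoteChanged) return "conflict"; // flag for user resolution
  if (localChanged) return "push_local";
  if (remoteChanged) return "pull_remote";
  return "in_sync";
}
```

Comparing both sides against the last sync time, rather than against each other, is what distinguishes a true conflict from a one-sided change.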
TypeScript Implementation:
- Strict type checking enabled
- Comprehensive interface definitions
- Generic types for reusability
- Proper error type handling
Testing Strategy:
- Manual JSON-RPC protocol testing
- Integration testing with Notion API
- File system operation validation
- Error condition testing
- List Documents: Recursive directory scanning with metadata
- Read Documents: Content extraction with encoding detection
- Write Documents: Atomic operations with backup support
- Search Documents: Full-text search across name and content
- Notion Integration: Bidirectional sync with rich metadata
- Conflict Resolution: Intelligent handling of concurrent changes
- Sync Status: Real-time monitoring of synchronization state
- Timestamp Tracking: Modification time comparison and validation
- Path Validation: Prevention of directory traversal attacks
- File Size Limits: Protection against resource exhaustion
- Type Detection: Safe handling of different file formats
- Efficient Operations: Optimized for large document collections
- Environment Configuration: Flexible setup via environment variables
- Rich Error Messages: Comprehensive error reporting and debugging
- TypeScript Support: Full type safety and IntelliSense
- Protocol Compliance: Adherence to MCP specifications
- Node.js 18+
- npm or yarn package manager
- TypeScript 5.0+
- Active Notion account (for cloud sync features)
# Clone the repository
git clone https://github.com/corneyc/documents-mcp.git
cd documents-mcp
# Install dependencies
npm install
# Configure environment
cp .env.example .env
# Edit .env with your configuration
# Build the project
npm run build
# Start the MCP server
npm run mcp
# Build Docker image
docker build -t documents-mcp .
# Run container (-i keeps stdin open, which a stdio-based server needs)
docker run -i --rm -v "$(pwd)/documents:/app/documents" \
  -e NOTION_TOKEN=your_token \
  -e NOTION_DATABASE_ID=your_db_id \
  documents-mcp
Create a .env file in the project root:
# Notion Integration (Required for sync features)
NOTION_TOKEN=secret_your_notion_integration_token
NOTION_DATABASE_ID=your_database_uuid
# Document Storage (Optional)
DOCUMENTS_ROOT=/path/to/documents
# Server Configuration (Optional)
LOG_LEVEL=info
# Maximum file size in bytes (10 MB default)
MAX_FILE_SIZE=10485760
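A typed loader can validate these variables once at startup. The sketch below is an assumption about how that might look: `loadConfig` and `ServerConfig` are hypothetical names, and the fallback values (`./documents`, `info`) are guesses based on the defaults shown elsewhere in this README.

```typescript
interface ServerConfig {
  notionToken?: string;      // sync features are unavailable without it
  notionDatabaseId?: string;
  documentsRoot: string;
  logLevel: string;
  maxFileSize: number;       // bytes
}

const DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024; // 10485760 bytes

// Build a validated config from an environment-like record.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const rawSize = env["MAX_FILE_SIZE"]?.trim();
  const maxFileSize = rawSize ? Number(rawSize) : DEFAULT_MAX_FILE_SIZE;
  if (!Number.isInteger(maxFileSize) || maxFileSize <= 0) {
    throw new Error(`MAX_FILE_SIZE must be a positive integer, got "${rawSize}"`);
  }
  return {
    notionToken: env["NOTION_TOKEN"] || undefined,
    notionDatabaseId: env["NOTION_DATABASE_ID"] || undefined,
    documentsRoot: env["DOCUMENTS_ROOT"] || "./documents", // assumed default
    logLevel: env["LOG_LEVEL"] || "info",                  // assumed default
    maxFileSize,
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)` after dotenv has populated the environment. Taking the record as a parameter keeps the function easy to test.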
- Create Integration:
  - Visit https://www.notion.so/my-integrations
  - Create a new integration named "Documents MCP Sync"
  - Copy the integration token
- Create Database:
  - Create a new database in Notion
  - Add the required properties:
    - Title (title type)
    - Local Path (text type)
    - File Size (number type)
    - Last Modified (date type)
    - Status (select type)
- Share Database:
  - Click Share on your database
  - Add your integration
  - Copy the database ID from the URL
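For reference, the database properties listed above map onto Notion API property values roughly as follows. This is a sketch rather than the project's sync code: `LocalFileInfo` and `toNotionProperties` are hypothetical names, and the status value is whatever select option your database defines. Note that a "text" property in the Notion UI is a `rich_text` property in the API.

```typescript
interface LocalFileInfo {
  title: string;
  localPath: string;
  sizeBytes: number;
  lastModified: Date;
  status: string; // must match a select option in your database (assumption)
}

// Build the `properties` object for a Notion page in the sync database.
function toNotionProperties(file: LocalFileInfo) {
  return {
    "Title": { title: [{ text: { content: file.title } }] },
    "Local Path": { rich_text: [{ text: { content: file.localPath } }] },
    "File Size": { number: file.sizeBytes },
    "Last Modified": { date: { start: file.lastModified.toISOString() } },
    "Status": { select: { name: file.status } },
  };
}
```

The returned object would be sent as the `properties` field of a page-creation request whose parent is the database ID copied in the last step.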
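Configure Claude Desktop to use your MCP server by editing claude_desktop_config.json. On macOS this file is typically at ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows, at %APPDATA%\Claude\claude_desktop_config.json. JSON does not allow comments, so the file should contain only the object below: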
{
"mcpServers": {
"documents-mcp": {
"command": "/opt/homebrew/bin/npm",
"args": ["run", "mcp"],
"cwd": "/path/to/documents-mcp"
}
}
}
Test server functionality with JSON-RPC calls:
# List documents
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "list_documents", "arguments": {}}}' | npm run mcp
# Read document
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "read_document", "arguments": {"key": "example.md"}}}' | npm run mcp
# Sync to Notion
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "sync_to_notion", "arguments": {"path": "example.md"}}}' | npm run mcp
Lists all documents with metadata filtering options.
Parameters:
- includeDirectories (boolean): Include directories in results
- textFilesOnly (boolean): Filter to text files only
- maxSize (number): Maximum file size in bytes
Response: Array of file metadata objects
Reads document content with metadata.
Parameters:
- key (string): Document path to read
Response: Document content and metadata
Writes content to document with options.
Parameters:
- key (string): Document path to write
- content (string): Content to write
- createBackup (boolean): Create backup before overwrite
- overwrite (boolean): Allow overwriting existing files
Response: Write operation result and metadata
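To make the `createBackup` and `overwrite` options concrete, here is a minimal sketch of a write that refuses to clobber files by default, optionally copies the old version aside, and replaces the target with a rename. The function name, the `.bak` suffix, and the default of `overwrite: false` are assumptions, not documented behavior of this server.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

interface WriteOptions {
  createBackup?: boolean;
  overwrite?: boolean;
}

// Write `content` to `filePath`, honoring the overwrite and backup options.
function writeDocument(
  filePath: string,
  content: string,
  options: WriteOptions = {},
): { backupPath?: string } {
  const exists = fs.existsSync(filePath);
  if (exists && !options.overwrite) {
    throw new Error(`Refusing to overwrite existing file: ${filePath}`);
  }

  let backupPath: string | undefined;
  if (exists && options.createBackup) {
    backupPath = `${filePath}.bak`; // hypothetical naming scheme
    fs.copyFileSync(filePath, backupPath);
  }

  fs.mkdirSync(path.dirname(filePath), { recursive: true });

  // Write to a temporary sibling first, then rename over the target so
  // readers never observe a half-written file.
  const tempPath = `${filePath}.${process.pid}.tmp`;
  fs.writeFileSync(tempPath, content, "utf8");
  fs.renameSync(tempPath, filePath);
  return { backupPath };
}
```

Writing the temporary file in the same directory as the target matters: a rename is only a single-step replacement when both paths are on the same file system.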
Synchronizes local document to Notion.
Parameters:
- path (string): Local document path to sync
Response: Sync operation result and Notion page ID
Retrieves sync status for all documents.
Parameters: None
Response: Array of sync status objects
Searches documents by name or content.
Parameters:
- query (string): Search query
- searchContent (boolean): Search file content
- caseSensitive (boolean): Case-sensitive search
- filePattern (string): File name regex pattern
Response: Array of matching documents
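How the four search parameters might interact is easiest to see on an in-memory list. The sketch below is one plausible reading, in which `filePattern` narrows the candidate files and `query` is then matched against the file name and, optionally, the content; the real tool may combine them differently.

```typescript
interface SearchableDocument {
  path: string;
  content: string;
}

interface SearchOptions {
  searchContent?: boolean;
  caseSensitive?: boolean;
  filePattern?: string; // regular expression applied to the file name
}

function searchDocuments(
  docs: SearchableDocument[],
  query: string,
  options: SearchOptions = {},
): SearchableDocument[] {
  const normalize = (s: string) => (options.caseSensitive ? s : s.toLowerCase());
  const needle = normalize(query);
  const pattern = options.filePattern ? new RegExp(options.filePattern) : null;

  return docs.filter((doc) => {
    const name = doc.path.split("/").pop() ?? doc.path;
    if (pattern && !pattern.test(name)) return false;      // filter by file name first
    if (normalize(name).includes(needle)) return true;     // match on name
    return options.searchContent === true &&
      normalize(doc.content).includes(needle);             // optional match on content
  });
}
```

A production version would read content lazily from disk and guard against expensive or malformed regular expressions supplied by the client.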
documents-mcp/
├── src/
│ ├── mcp-server.ts # Main MCP server implementation
│ ├── utils/
│ │ ├── local-documents-enhanced.ts # Enhanced file operations
│ │ ├── local-documents.ts # Basic file operations
│ │ └── notion-sync.ts # Notion integration
│ └── types.ts # TypeScript definitions
├── documents/ # Default document storage
├── tests/ # Test files
├── package.json # Node.js configuration
├── tsconfig.json # TypeScript configuration
├── wrangler.toml # Cloudflare Workers config
└── README.md # Project documentation
# Install dependencies
npm install
# Start development server
npm run dev
# Run MCP server
npm run mcp
# Run tests
npm test
# Type checking
npm run type-check
# Build for production
npm run build
- Implement Tool Logic: Add function to appropriate utility module
- Register Tool: Add tool definition to MCP server
- Add Handler: Implement tool handler in CallToolRequestSchema
- Update Types: Add TypeScript interfaces
- Test Integration: Verify with JSON-RPC calls
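The steps above can be sketched with a small in-memory registry. The shapes of the tool definition (`name`, `description`, `inputSchema`) and of the result (`content`, `isError`) follow the MCP tool format; everything else, including `registerTool`, `callTool`, and the `count_words` example tool, is hypothetical and stands in for the SDK-based handlers in the actual server.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

type ToolHandler = (args: Record<string, unknown>) => ToolResult;

const tools = new Map<string, { definition: ToolDefinition; handler: ToolHandler }>();

// Register a tool: definition for discovery, handler for execution.
function registerTool(definition: ToolDefinition, handler: ToolHandler): void {
  if (tools.has(definition.name)) {
    throw new Error(`Duplicate tool: ${definition.name}`);
  }
  tools.set(definition.name, { definition, handler });
}

// What a tools/list request would return.
function listTools(): ToolDefinition[] {
  return Array.from(tools.values()).map((tool) => tool.definition);
}

// What a tools/call request would dispatch to.
function callTool(name: string, args: Record<string, unknown>): ToolResult {
  const tool = tools.get(name);
  if (!tool) {
    return { content: [{ type: "text", text: `Unknown tool: ${name}` }], isError: true };
  }
  return tool.handler(args);
}

// Hypothetical example tool, used only to show the registration flow.
registerTool(
  {
    name: "count_words",
    description: "Counts whitespace-separated words in a string",
    inputSchema: {
      type: "object",
      properties: { text: { type: "string" } },
      required: ["text"],
    },
  },
  (args) => {
    const text = typeof args["text"] === "string" ? args["text"].trim() : "";
    const count = text === "" ? 0 : text.split(/\s+/).length;
    return { content: [{ type: "text", text: String(count) }] };
  },
);
```

Keeping the definition and the handler in one registry entry makes it harder to register a tool that is listed but cannot be called, or the reverse.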
The project includes comprehensive manual testing procedures:
# Test basic protocol functionality
./scripts/test-protocol.sh
# Test document operations
./scripts/test-documents.sh
# Test Notion integration
./scripts/test-notion.sh
# Test error handling
./scripts/test-errors.sh
Test with actual AI agents:
- Configure Claude Desktop with MCP server
- Test document listing and reading
- Verify Notion synchronization
- Test error handling and recovery
# Test with large document collections
./scripts/test-performance.sh
# Monitor memory usage
npm run monitor
# Benchmark sync operations
npm run benchmark
# Start server locally
npm run mcp
# Run in development mode with auto-restart
npm run dev
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
# Install all dependencies: the build step below needs dev dependencies such as TypeScript
RUN npm ci
COPY src/ ./src/
COPY tsconfig.json ./
RUN npm run build
EXPOSE 3000
CMD ["npm", "run", "mcp"]
Deploy the companion web API to Cloudflare Workers:
# Deploy to Cloudflare
npm run deploy
# Deploy to specific environment
npm run deploy:staging
Production environment configuration:
NODE_ENV=production
NOTION_TOKEN=secret_production_token
NOTION_DATABASE_ID=production_database_id
DOCUMENTS_ROOT=/app/documents
LOG_LEVEL=info
- Lazy Loading: Documents loaded only when accessed
- Caching: In-memory caching of frequently accessed files
- Batch Operations: Efficient bulk synchronization
- Connection Pooling: Optimized Notion API connections
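As one possible shape for the in-memory cache, the sketch below implements a small least-recently-used (LRU) cache. LRU is an assumption chosen for illustration; the README does not say which eviction policy the server uses, and `LruCache` is a hypothetical name.

```typescript
// Least-recently-used cache keyed by document path.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Map preserves insertion order, so re-inserting marks the
    // entry as the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A file cache built on this would also need invalidation, for example by storing each file's modification time alongside its content and discarding the entry when the time on disk differs.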
- Horizontal Scaling: Multiple MCP server instances
- Load Balancing: Distribute document operations
- Database Sharding: Separate Notion databases by team/project
- CDN Integration: Cache static document content
- Fork Repository: Create personal fork of the project
- Feature Branch: Create branch for new feature or fix
- Implementation: Develop with comprehensive testing
- Documentation: Update documentation for changes
- Pull Request: Submit PR with detailed description
- TypeScript: Strict mode enabled with comprehensive typing
- ESLint: Code quality and consistency enforcement
- Prettier: Automated code formatting
- Conventional Commits: Standardized commit messages
- Unit Tests: All utility functions must have unit tests
- Integration Tests: MCP protocol compliance testing
- Documentation: All public APIs must be documented
MIT License - see LICENSE file for details.
- Anthropic: MCP protocol specification and SDK
- Notion: Comprehensive API and developer documentation
- TypeScript Team: Excellent tooling and type system
- Node.js Community: Rich ecosystem and package availability
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Project Wiki
- Google Drive integration
- Enhanced search with full-text indexing
- Automated testing suite
- Performance monitoring dashboard
- Multi-cloud sync support
- Real-time collaboration features
- Advanced conflict resolution UI
- Plugin architecture for extensions
- Enterprise authentication integration
- Advanced analytics and reporting
- Mobile app companion
- AI-powered document insights
Built with love for the AI-powered future of document management