Cloudflare Backup Sync Setup Guide

Overview

This guide shows how to set up automated daily backups of your local MCP Memory Service database to Cloudflare D1 for global access and disaster recovery.

Architecture: Keep your local SQLite-vec service as primary (fast) and sync to Cloudflare D1 as backup (global access).

Prerequisites

  1. Cloudflare Account with Workers/D1 access
  2. Local MCP Memory Service running with SQLite-vec backend
  3. Python 3 with the requests library installed (run the sync script outside a uv environment to avoid unnecessary dependency downloads)

Step 1: Create Cloudflare Resources

Option A: Using Cloudflare Dashboard (Recommended)

  1. Create D1 Database:

    • Go to Cloudflare Dashboard → D1
    • Click "Create Database"
    • Name: your-project-memory-backup
    • Copy the Database ID from the overview page
  2. Create Vectorize Index (Optional - for semantic search):

    • Vectorize indexes are easiest to create with Wrangler; use the command in Option B below (768 dimensions, cosine metric)
  3. Get Account ID:

    • Your Account ID appears in the dashboard URL and on the account overview page; copy it for the configuration in Step 3

Option B: Using Wrangler CLI

# Install Wrangler
npm install -g wrangler
wrangler login

# Create resources
wrangler d1 create your-project-memory-backup
wrangler vectorize create your-project-memory-backup --dimensions=768 --metric=cosine
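
If you need the Database ID later (for example, after creating the database with Wrangler), you can also list your D1 databases through the same REST API the sync script uses. A minimal sketch, assuming CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID are exported in your shell; the uuid field is how the API reports the Database ID at the time of writing:

#!/usr/bin/env python3
"""List D1 databases to find the Database ID (sketch)."""
import os
import requests

token = os.environ['CLOUDFLARE_API_TOKEN']        # export these first
account_id = os.environ['CLOUDFLARE_ACCOUNT_ID']

resp = requests.get(
    f'https://api.cloudflare.com/client/v4/accounts/{account_id}/d1/database',
    headers={'Authorization': f'Bearer {token}'}
)
resp.raise_for_status()

for db in resp.json().get('result', []):
    # Each entry carries the database name and its ID ("uuid")
    print(db.get('name'), db.get('uuid'))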

Step 2: Create API Token

  1. Go to Cloudflare Dashboard → My Profile → API Tokens
  2. Click "Create Token" → "Custom Token"
  3. Configure permissions:
    • Account: Read
    • D1: Edit
    • Vectorize: Edit (if using)
    • Workers AI: Read (if using)
  4. Save the token securely (you can verify it with the snippet below)
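
Before wiring the token into the sync setup, it is worth confirming that it authenticates at all. A minimal check against Cloudflare's token verification endpoint, assuming the token is exported as CLOUDFLARE_API_TOKEN:

#!/usr/bin/env python3
"""Sanity-check that the API token authenticates."""
import os
import requests

token = os.environ['CLOUDFLARE_API_TOKEN']  # export this first

resp = requests.get(
    'https://api.cloudflare.com/client/v4/user/tokens/verify',
    headers={'Authorization': f'Bearer {token}'}
)
# A valid token returns HTTP 200 with "success": true
print(resp.status_code, resp.json().get('success'))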

Step 3: Configure Environment

Create .env.cloudflare in your MCP Memory Service directory:

# Replace with your actual values
CLOUDFLARE_API_TOKEN=your-api-token-here
CLOUDFLARE_ACCOUNT_ID=your-account-id-here  
CLOUDFLARE_D1_DATABASE_ID=your-d1-database-id-here
CLOUDFLARE_VECTORIZE_INDEX=your-vectorize-index-name

Security Note: Keep this file secure and never commit it to version control.

Step 4: Create Sync Script

Create the sync script sync-cloudflare-direct.py:

#!/usr/bin/env python3
"""Direct sync to Cloudflare D1 without heavy dependencies."""

import sqlite3
import json
import requests
import time
import argparse
import os
from datetime import datetime

def export_from_sqlite(db_path):
    """Export memories from local SQLite database."""
    memories = []
    
    try:
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()
        
        cursor.execute("""
            SELECT content, content_hash, tags, memory_type, metadata, 
                   created_at, created_at_iso, updated_at, updated_at_iso
            FROM memories 
            ORDER BY created_at DESC
        """)
        
        rows = cursor.fetchall()
        print(f"Found {len(rows)} memories to export")
        
        for row in rows:
            try:
                tags = json.loads(row[2]) if row[2] else []
            except (json.JSONDecodeError, TypeError):
                tags = [row[2]] if row[2] else []
            
            try:
                metadata = json.loads(row[4]) if row[4] else {}
            except (json.JSONDecodeError, TypeError):
                metadata = {}
            
            memory = {
                'content': row[0],
                'content_hash': row[1],
                'tags': tags,
                'memory_type': row[3],
                'metadata': metadata,
                'created_at': row[5],
                'created_at_iso': row[6],
                'updated_at': row[7],
                'updated_at_iso': row[8]
            }
            memories.append(memory)
        
        conn.close()
        return memories
        
    except Exception as e:
        print(f"Error exporting from SQLite: {e}")
        return []

def store_to_cloudflare(memories, api_token, account_id, d1_database_id):
    """Store memories to Cloudflare D1."""
    headers = {
        'Authorization': f'Bearer {api_token}',
        'Content-Type': 'application/json'
    }
    
    success_count = 0
    failed_count = 0
    
    # Initialize D1 schema
    init_sql = """
    CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        content_hash TEXT UNIQUE NOT NULL,
        content TEXT NOT NULL,
        tags TEXT,
        memory_type TEXT,
        metadata_json TEXT,
        created_at REAL,
        updated_at REAL,
        created_at_iso TEXT,
        updated_at_iso TEXT,
        content_size INTEGER DEFAULT 0,
        vector_id TEXT UNIQUE,
        r2_key TEXT
    );
    
    CREATE INDEX IF NOT EXISTS idx_created_at ON memories(created_at);
    CREATE INDEX IF NOT EXISTS idx_content_hash ON memories(content_hash);
    """
    
    url = f'https://api.cloudflare.com/client/v4/accounts/{account_id}/d1/database/{d1_database_id}/query'
    
    # Initialize schema
    try:
        response = requests.post(url, headers=headers, json={'sql': init_sql})
        if response.status_code == 200:
            print("D1 schema initialized")
        else:
            print(f"Schema init warning: {response.text}")
    except Exception as e:
        print(f"Error initializing schema: {e}")
        return False
    
    # Insert memories
    for i, memory in enumerate(memories):
        try:
            insert_sql = """
            INSERT OR REPLACE INTO memories 
            (content_hash, content, tags, memory_type, metadata_json, 
             created_at, updated_at, created_at_iso, updated_at_iso, content_size)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            """
            
            params = [
                memory['content_hash'],
                memory['content'],
                json.dumps(memory.get('tags', [])),
                memory.get('memory_type'),
                json.dumps(memory.get('metadata', {})),
                memory.get('created_at'),
                memory.get('updated_at'),
                memory.get('created_at_iso'),
                memory.get('updated_at_iso'),
                len(memory['content']) if memory['content'] else 0
            ]
            
            response = requests.post(url, headers=headers, json={
                'sql': insert_sql,
                'params': params
            })
            
            if response.status_code == 200:
                success_count += 1
                if success_count % 10 == 0:
                    print(f"Progress: {success_count}/{len(memories)} uploaded")
            else:
                failed_count += 1
                print(f"Failed to store memory {memory['content_hash'][:16]}: {response.text}")
            
            # Rate limiting
            if i % 5 == 0:
                time.sleep(0.5)
                
        except Exception as e:
            failed_count += 1
            print(f"Error storing memory: {e}")
    
    print(f"Upload complete: {success_count} success, {failed_count} failed")
    return failed_count == 0

def main():
    parser = argparse.ArgumentParser(description='Sync to Cloudflare D1')
    parser.add_argument('--db-path', 
                       default='~/.local/share/mcp-memory/sqlite_vec.db',
                       help='Local SQLite database path')
    parser.add_argument('--config', 
                       default='.env.cloudflare',
                       help='Config file path')
    parser.add_argument('--input',
                       help='JSON file of memories to sync instead of the SQLite database')
    
    args = parser.parse_args()
    
    # Load configuration
    config_path = os.path.expanduser(args.config)
    if os.path.exists(config_path):
        with open(config_path) as f:
            for line in f:
                if '=' in line and not line.startswith('#'):
                    key, value = line.strip().split('=', 1)
                    os.environ[key] = value
    
    # Get required config
    api_token = os.getenv('CLOUDFLARE_API_TOKEN')
    account_id = os.getenv('CLOUDFLARE_ACCOUNT_ID')
    d1_database_id = os.getenv('CLOUDFLARE_D1_DATABASE_ID')
    
    if not all([api_token, account_id, d1_database_id]):
        print("Error: Missing required Cloudflare environment variables")
        print("Required: CLOUDFLARE_API_TOKEN, CLOUDFLARE_ACCOUNT_ID, CLOUDFLARE_D1_DATABASE_ID")
        return False
    
    # Export and sync (from a JSON file when --input is given, otherwise from SQLite)
    if args.input:
        with open(os.path.expanduser(args.input)) as f:
            memories = json.load(f).get('memories', [])
    else:
        db_path = os.path.expanduser(args.db_path)
        memories = export_from_sqlite(db_path)
    
    if memories:
        return store_to_cloudflare(memories, api_token, account_id, d1_database_id)
    else:
        print("No memories to sync")
        return False

if __name__ == '__main__':
    success = main()
    exit(0 if success else 1)

Step 5: Create Automated Sync Script

Create sync-to-cloudflare.sh:

#!/bin/bash
# Daily sync script for Cloudflare backup

cd /path/to/your/mcp-memory-service

# Load configuration
if [ -f ".env.cloudflare" ]; then
    export $(cat .env.cloudflare | grep -v '^#' | xargs)
else
    echo "Error: .env.cloudflare not found"
    exit 1
fi

# Log file
LOG_FILE="/var/log/memory-cloudflare-sync.log"

echo "$(date): Starting Cloudflare sync..." >> "$LOG_FILE"

# Run sync and capture its exit code (testing $? inside the else branch
# would report the status of the [ test, not the sync itself)
python3 sync-cloudflare-direct.py --config .env.cloudflare >> "$LOG_FILE" 2>&1
EXIT_CODE=$?

if [ $EXIT_CODE -eq 0 ]; then
    echo "$(date): Sync completed successfully" >> "$LOG_FILE"
else
    echo "$(date): Sync failed with error code $EXIT_CODE" >> "$LOG_FILE"
fi

echo "$(date): Sync process finished" >> "$LOG_FILE"

Make executable:

chmod +x sync-to-cloudflare.sh

Step 6: Test the Setup

Test Single Memory

Create a test file test-memory.json:

{
  "memories": [
    {
      "content": "Test Cloudflare sync",
      "content_hash": "test123",
      "tags": ["test", "cloudflare"],
      "memory_type": "note",
      "metadata": {},
      "created_at": 1692806400.0,
      "created_at_iso": "2023-08-23T12:00:00Z",
      "updated_at": 1692806400.0,
      "updated_at_iso": "2023-08-23T12:00:00Z"
    }
  ]
}

Test:

python3 sync-cloudflare-direct.py --input test-memory.json

Test Full Sync

python3 sync-cloudflare-direct.py

Test Sync Script

./sync-to-cloudflare.sh

Step 7: Set Up Automated Schedule

Add to crontab for daily backup at 3 AM:

# Edit crontab
crontab -e

# Add this line (adjust path to your script)
0 3 * * * /path/to/your/sync-to-cloudflare.sh

Step 8: Monitor and Verify

Check Sync Logs

tail -f /var/log/memory-cloudflare-sync.log

Verify in Cloudflare Dashboard

  1. Go to your D1 database in Cloudflare Dashboard
  2. Use the "Console" tab to query: SELECT COUNT(*) FROM memories;

Manual Verification

# Check last sync status
grep "$(date '+%Y-%m-%d')" /var/log/memory-cloudflare-sync.log

Troubleshooting

Common Issues

1. Permission denied errors

# Fix log file permissions
sudo touch /var/log/memory-cloudflare-sync.log
sudo chown $USER:$USER /var/log/memory-cloudflare-sync.log

2. Environment variables not found

  • Verify .env.cloudflare exists and has correct format
  • Check file permissions: chmod 600 .env.cloudflare

3. Database connection errors

  • Verify API token permissions in Cloudflare Dashboard
  • Check Account ID and Database ID are correct

4. Schema errors

  • Run the test sync first to initialize the D1 schema
  • Check D1 database console for any existing conflicting tables

Performance Tips

  1. Rate Limiting: The script includes built-in rate limiting (a 0.5s pause every 5 requests)
  2. Batch Size: For very large databases (>1000 memories), consider implementing batching
  3. Incremental Sync: For frequent syncs, implement timestamp-based incremental updates (see the sketch below)
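
A minimal sketch of tip 3, reusing export_from_sqlite and store_to_cloudflare from the sync script above; the state file path and helper names are illustrative assumptions, not part of the script:

#!/usr/bin/env python3
"""Incremental sync sketch: only upload memories changed since the last run."""
import json
import os
import time

# Hypothetical state file remembering when the last successful sync ran
STATE_FILE = os.path.expanduser('~/.local/share/mcp-memory/last_sync.json')

def load_last_sync():
    """Return the timestamp of the previous successful sync (0 on first run)."""
    try:
        with open(STATE_FILE) as f:
            return json.load(f).get('last_sync', 0)
    except (FileNotFoundError, json.JSONDecodeError):
        return 0

def save_last_sync(ts):
    """Persist the sync timestamp for the next run."""
    with open(STATE_FILE, 'w') as f:
        json.dump({'last_sync': ts}, f)

def filter_changed(memories, since):
    """Keep only memories created or updated after the given timestamp."""
    return [m for m in memories
            if (m.get('updated_at') or m.get('created_at') or 0) > since]

# Usage inside the sync script's main(), after export_from_sqlite():
#     since = load_last_sync()
#     changed = filter_changed(memories, since)
#     if store_to_cloudflare(changed, api_token, account_id, d1_database_id):
#         save_last_sync(time.time())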

Security Best Practices

  1. API Token: Use minimum required permissions (D1:Edit only)
  2. Environment File: Never commit .env.cloudflare to version control
  3. Log Files: Ensure log files don't expose sensitive information
  4. File Permissions: Set restrictive permissions on config files
chmod 600 .env.cloudflare
chmod 600 sync-cloudflare-direct.py
chmod 755 sync-to-cloudflare.sh

Benefits

  • Global Access: Access memories from anywhere via Cloudflare's global network
  • Disaster Recovery: Complete backup if the local system fails
  • Zero Downtime: Local service remains unaffected during sync
  • Cost Effective: Pay-per-use pricing, typically <$1/month
  • Automatic: Set-and-forget daily synchronization

Next Steps

  1. Remote Access: Set up a simple web interface to query Cloudflare D1 directly (a query sketch follows this list)
  2. Bi-directional Sync: Implement conflict resolution for two-way synchronization
  3. Monitoring: Add alerting for failed sync attempts
  4. Optimization: Implement incremental sync for large databases
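
As a starting point for the remote-access idea in item 1, the query endpoint used throughout this guide can already serve keyword search. A minimal sketch, assuming the three environment variables are exported; search_memories is a hypothetical helper, not part of the sync script:

#!/usr/bin/env python3
"""Keyword search over the D1 backup (sketch for a remote-access layer)."""
import os
import requests

def search_memories(keyword, limit=10):
    """Run a LIKE search over synced memories via the D1 query endpoint."""
    token = os.environ['CLOUDFLARE_API_TOKEN']
    account_id = os.environ['CLOUDFLARE_ACCOUNT_ID']
    database_id = os.environ['CLOUDFLARE_D1_DATABASE_ID']
    url = (f'https://api.cloudflare.com/client/v4/accounts/{account_id}'
           f'/d1/database/{database_id}/query')
    resp = requests.post(url,
                         headers={'Authorization': f'Bearer {token}'},
                         json={'sql': ('SELECT content, tags, created_at_iso '
                                       'FROM memories WHERE content LIKE ? '
                                       'ORDER BY created_at DESC LIMIT ?'),
                               'params': [f'%{keyword}%', limit]})
    resp.raise_for_status()
    return resp.json()

if __name__ == '__main__':
    print(search_memories('cloudflare'))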

Support

For issues with this setup:

  1. Check the troubleshooting section above
  2. Review logs in /var/log/memory-cloudflare-sync.log
  3. Test individual components (export, API connection, schema)
  4. Consult the MCP Memory Service documentation for backend-specific issues

Real-World Example

This guide was successfully tested with:

  • 941 memories in local SQLite-vec database
  • Cloudflare D1 database with existing schema
  • Daily automated sync at 3 AM via cron
  • Zero downtime - local service unaffected during sync
  • ~1 minute sync time for 941 memories with rate limiting

Architecture Notes

Why This Approach Works:

  • Local Primary: SQLite-vec provides fast local access (<5ms queries)
  • Cloud Backup: Cloudflare D1 provides global availability and disaster recovery
  • Hybrid Benefits: Best of both worlds - speed + reliability
  • Cost Effective: Minimal Cloudflare usage, typically <$1/month
  • Simple Maintenance: Single Python script, standard cron scheduling