Cloudflare Backup Sync Setup
This guide shows how to set up automated daily backups of your local MCP Memory Service database to Cloudflare D1 for global access and disaster recovery.
Architecture: Keep your local SQLite-vec service as primary (fast) and sync to Cloudflare D1 as backup (global access).
Prerequisites:
- Cloudflare Account with Workers/D1 access
- Local MCP Memory Service running with the SQLite-vec backend
- Python 3 with the requests library (avoid the uv environment to prevent unnecessary downloads; install command below)
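The sync script shown later relies on the requests package beyond the Python standard library; if it is not already available, install it with plain pip to stay outside the uv environment:
# Install the only third-party dependency used by the sync script
pip3 install requests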
- Create D1 Database:
  - Go to Cloudflare Dashboard → D1
  - Click "Create Database"
  - Name: your-project-memory-backup
  - Copy the Database ID from the overview page
- Create Vectorize Index (optional, for semantic search):
  - Go to Cloudflare Dashboard → Vectorize
  - Click "Create Index"
  - Name: your-project-memory-backup
  - Dimensions: 768
  - Metric: cosine
- Get Account ID:
  - Go to Cloudflare Dashboard
  - Copy the Account ID from the right sidebar
Alternatively, create the same resources with the Wrangler CLI:
# Install Wrangler
npm install -g wrangler
wrangler login
# Create resources
wrangler d1 create your-project-memory-backup
wrangler vectorize create your-project-memory-backup --dimensions=768 --metric=cosine
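To confirm the resources were created, the standard Wrangler listing commands can be used (shown here as an optional sanity check):
# List your D1 databases and Vectorize indexes
wrangler d1 list
wrangler vectorize list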
- Go to Cloudflare Dashboard → My Profile → API Tokens
- Click "Create Token" → "Custom Token"
- Configure permissions:
  - Account: Read
  - D1: Edit
  - Vectorize: Edit (if using)
  - Workers AI: Read (if using)
- Save the token securely
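Before using the token, you can confirm it is valid with Cloudflare's token verification endpoint (an optional check; substitute your actual token):
# Verify the API token is valid and active
curl -s "https://api.cloudflare.com/client/v4/user/tokens/verify" \
  -H "Authorization: Bearer your-api-token-here"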
Create .env.cloudflare in your MCP Memory Service directory:
# Replace with your actual values
CLOUDFLARE_API_TOKEN=your-api-token-here
CLOUDFLARE_ACCOUNT_ID=your-account-id-here
CLOUDFLARE_D1_DATABASE_ID=your-d1-database-id-here
CLOUDFLARE_VECTORIZE_INDEX=your-vectorize-index-name
Security Note: Keep this file secure and never commit to version control.
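One way to reduce the risk of an accidental commit is to ignore the file and restrict its permissions (a minimal sketch; adjust paths to your repository layout):
# Keep the credentials file out of version control and readable only by you
echo ".env.cloudflare" >> .gitignore
chmod 600 .env.cloudflare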
Create the sync script sync-cloudflare-direct.py:
#!/usr/bin/env python3
"""Direct sync to Cloudflare D1 without heavy dependencies."""
import sqlite3
import json
import requests
import time
import argparse
import os
from datetime import datetime


def export_from_sqlite(db_path):
    """Export memories from the local SQLite database."""
    memories = []
    try:
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()
        cursor.execute("""
            SELECT content, content_hash, tags, memory_type, metadata,
                   created_at, created_at_iso, updated_at, updated_at_iso
            FROM memories
            ORDER BY created_at DESC
        """)
        rows = cursor.fetchall()
        print(f"Found {len(rows)} memories to export")
        for row in rows:
            try:
                tags = json.loads(row[2]) if row[2] else []
            except (json.JSONDecodeError, TypeError):
                tags = [row[2]] if row[2] else []
            try:
                metadata = json.loads(row[4]) if row[4] else {}
            except (json.JSONDecodeError, TypeError):
                metadata = {}
            memory = {
                'content': row[0],
                'content_hash': row[1],
                'tags': tags,
                'memory_type': row[3],
                'metadata': metadata,
                'created_at': row[5],
                'created_at_iso': row[6],
                'updated_at': row[7],
                'updated_at_iso': row[8]
            }
            memories.append(memory)
        conn.close()
        return memories
    except Exception as e:
        print(f"Error exporting from SQLite: {e}")
        return []


def load_from_json(input_path):
    """Load memories from a JSON test file ({"memories": [...]}), used by --input."""
    try:
        with open(input_path) as f:
            data = json.load(f)
        memories = data.get('memories', []) if isinstance(data, dict) else data
        print(f"Loaded {len(memories)} memories from {input_path}")
        return memories
    except Exception as e:
        print(f"Error loading JSON input: {e}")
        return []


def store_to_cloudflare(memories, api_token, account_id, d1_database_id):
    """Store memories to Cloudflare D1."""
    headers = {
        'Authorization': f'Bearer {api_token}',
        'Content-Type': 'application/json'
    }
    success_count = 0
    failed_count = 0
    # Initialize D1 schema
    init_sql = """
    CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        content_hash TEXT UNIQUE NOT NULL,
        content TEXT NOT NULL,
        tags TEXT,
        memory_type TEXT,
        metadata_json TEXT,
        created_at REAL,
        updated_at REAL,
        created_at_iso TEXT,
        updated_at_iso TEXT,
        content_size INTEGER DEFAULT 0,
        vector_id TEXT UNIQUE,
        r2_key TEXT
    );
    CREATE INDEX IF NOT EXISTS idx_created_at ON memories(created_at);
    CREATE INDEX IF NOT EXISTS idx_content_hash ON memories(content_hash);
    """
    url = f'https://api.cloudflare.com/client/v4/accounts/{account_id}/d1/database/{d1_database_id}/query'
    # Initialize schema
    try:
        response = requests.post(url, headers=headers, json={'sql': init_sql})
        if response.status_code == 200:
            print("D1 schema initialized")
        else:
            print(f"Schema init warning: {response.text}")
    except Exception as e:
        print(f"Error initializing schema: {e}")
        return False
    # Insert memories
    for i, memory in enumerate(memories):
        try:
            insert_sql = """
                INSERT OR REPLACE INTO memories
                (content_hash, content, tags, memory_type, metadata_json,
                 created_at, updated_at, created_at_iso, updated_at_iso, content_size)
                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            """
            params = [
                memory['content_hash'],
                memory['content'],
                json.dumps(memory.get('tags', [])),
                memory.get('memory_type'),
                json.dumps(memory.get('metadata', {})),
                memory.get('created_at'),
                memory.get('updated_at'),
                memory.get('created_at_iso'),
                memory.get('updated_at_iso'),
                len(memory['content']) if memory['content'] else 0
            ]
            response = requests.post(url, headers=headers, json={
                'sql': insert_sql,
                'params': params
            })
            if response.status_code == 200:
                success_count += 1
                if success_count % 10 == 0:
                    print(f"Progress: {success_count}/{len(memories)} uploaded")
            else:
                failed_count += 1
                print(f"Failed to store memory {memory['content_hash'][:16]}: {response.text}")
            # Rate limiting: brief pause every 5 requests
            if i % 5 == 0:
                time.sleep(0.5)
        except Exception as e:
            failed_count += 1
            print(f"Error storing memory: {e}")
    print(f"Upload complete: {success_count} success, {failed_count} failed")
    return failed_count == 0


def main():
    parser = argparse.ArgumentParser(description='Sync to Cloudflare D1')
    parser.add_argument('--db-path',
                        default='~/.local/share/mcp-memory/sqlite_vec.db',
                        help='Local SQLite database path')
    parser.add_argument('--config',
                        default='.env.cloudflare',
                        help='Config file path')
    parser.add_argument('--input',
                        help='Optional JSON file with a {"memories": [...]} payload (used for testing)')
    args = parser.parse_args()
    # Load configuration
    config_path = os.path.expanduser(args.config)
    if os.path.exists(config_path):
        with open(config_path) as f:
            for line in f:
                if '=' in line and not line.startswith('#'):
                    key, value = line.strip().split('=', 1)
                    os.environ[key] = value
    # Get required config
    api_token = os.getenv('CLOUDFLARE_API_TOKEN')
    account_id = os.getenv('CLOUDFLARE_ACCOUNT_ID')
    d1_database_id = os.getenv('CLOUDFLARE_D1_DATABASE_ID')
    if not all([api_token, account_id, d1_database_id]):
        print("Error: Missing required Cloudflare environment variables")
        print("Required: CLOUDFLARE_API_TOKEN, CLOUDFLARE_ACCOUNT_ID, CLOUDFLARE_D1_DATABASE_ID")
        return False
    # Export and sync
    if args.input:
        memories = load_from_json(os.path.expanduser(args.input))
    else:
        memories = export_from_sqlite(os.path.expanduser(args.db_path))
    if memories:
        return store_to_cloudflare(memories, api_token, account_id, d1_database_id)
    else:
        print("No memories to sync")
        return False


if __name__ == '__main__':
    success = main()
    exit(0 if success else 1)
Create sync-to-cloudflare.sh:
#!/bin/bash
# Daily sync script for Cloudflare backup
cd /path/to/your/mcp-memory-service

# Load configuration
if [ -f ".env.cloudflare" ]; then
    export $(cat .env.cloudflare | grep -v '^#' | xargs)
else
    echo "Error: .env.cloudflare not found"
    exit 1
fi

# Log file (ISO-style timestamps so entries can be grepped by date)
LOG_FILE="/var/log/memory-cloudflare-sync.log"
echo "$(date '+%Y-%m-%d %H:%M:%S'): Starting Cloudflare sync..." >> "$LOG_FILE"

# Run sync and capture the exit code before it is overwritten by later commands
python3 sync-cloudflare-direct.py --config .env.cloudflare >> "$LOG_FILE" 2>&1
SYNC_EXIT=$?

if [ $SYNC_EXIT -eq 0 ]; then
    echo "$(date '+%Y-%m-%d %H:%M:%S'): Sync completed successfully" >> "$LOG_FILE"
else
    echo "$(date '+%Y-%m-%d %H:%M:%S'): Sync failed with error code $SYNC_EXIT" >> "$LOG_FILE"
fi
echo "$(date '+%Y-%m-%d %H:%M:%S'): Sync process finished" >> "$LOG_FILE"
Make executable:
chmod +x sync-to-cloudflare.sh
Create a test file test-memory.json:
{
  "memories": [
    {
      "content": "Test Cloudflare sync",
      "content_hash": "test123",
      "tags": ["test", "cloudflare"],
      "memory_type": "note",
      "metadata": {},
      "created_at": 1692806400.0,
      "created_at_iso": "2023-08-23T12:00:00Z",
      "updated_at": 1692806400.0,
      "updated_at_iso": "2023-08-23T12:00:00Z"
    }
  ]
}
Test:
# 1. Test upload with the sample file
python3 sync-cloudflare-direct.py --input test-memory.json
# 2. Full sync from the local SQLite database
python3 sync-cloudflare-direct.py
# 3. Run the wrapper script end to end
./sync-to-cloudflare.sh
Add to crontab for daily backup at 3 AM:
# Edit crontab
crontab -e
# Add this line (adjust path to your script)
0 3 * * * /path/to/your/sync-to-cloudflare.sh
Watch the sync log:
tail -f /var/log/memory-cloudflare-sync.log
- Go to your D1 database in Cloudflare Dashboard
- Use the "Console" tab to query:
SELECT COUNT(*) FROM memories;
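You can also verify the backup remotely with the same REST endpoint the sync script uses (a minimal sketch; it assumes the variables from .env.cloudflare are exported in your shell):
# Query the D1 database over the REST API
curl -s -X POST \
  "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/d1/database/$CLOUDFLARE_D1_DATABASE_ID/query" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"sql": "SELECT COUNT(*) AS total FROM memories;"}'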
# Check last sync status
grep "$(date '+%Y-%m-%d')" /var/log/memory-cloudflare-sync.log
1. Permission denied errors
# Fix log file permissions
sudo touch /var/log/memory-cloudflare-sync.log
sudo chown $USER:$USER /var/log/memory-cloudflare-sync.log
2. Environment variables not found
- Verify .env.cloudflare exists and has the correct format
- Check file permissions:
chmod 600 .env.cloudflare
3. Database connection errors
- Verify API token permissions in Cloudflare Dashboard
- Check Account ID and Database ID are correct
4. Schema errors
- Run the test sync first to initialize the D1 schema
- Check the D1 database console for any existing conflicting tables (the query below lists them)
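Because D1 is built on SQLite, the sqlite_master table should be queryable from the console to list existing tables and spot conflicts:
-- List existing tables and their schemas in the D1 database
SELECT name, sql FROM sqlite_master WHERE type = 'table';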
- Rate Limiting: The script includes built-in rate limiting (a 0.5s pause every 5 requests)
- Batch Size: For very large databases (>1000 memories), consider implementing batching
- Incremental Sync: For frequent syncs, implement timestamp-based incremental updates (see the sketch after this list)
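A minimal sketch of what timestamp-based incremental export could look like; the state file (last_sync.txt) and helper names are introduced here for illustration only and are not part of the tested setup:
#!/usr/bin/env python3
"""Sketch: export only memories created since the last sync (illustrative)."""
import os
import sqlite3

# Hypothetical state file storing the created_at value of the newest synced memory
STATE_FILE = os.path.expanduser('~/.local/share/mcp-memory/last_sync.txt')

def read_last_sync():
    """Return the timestamp of the last synced memory, or 0 on first run."""
    try:
        with open(STATE_FILE) as f:
            return float(f.read().strip())
    except (FileNotFoundError, ValueError):
        return 0.0

def export_new_memories(db_path):
    """Fetch only rows newer than the stored timestamp."""
    last_sync = read_last_sync()
    conn = sqlite3.connect(os.path.expanduser(db_path))
    cursor = conn.cursor()
    cursor.execute(
        "SELECT content, content_hash, created_at FROM memories "
        "WHERE created_at > ? ORDER BY created_at ASC",
        (last_sync,)
    )
    rows = cursor.fetchall()
    conn.close()
    if rows:
        # Persist the newest timestamp so the next run only picks up later memories
        with open(STATE_FILE, 'w') as f:
            f.write(str(rows[-1][2]))
    return rows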
- API Token: Use the minimum required permissions (D1: Edit only)
- Environment File: Never commit .env.cloudflare to version control
- Log Files: Ensure log files don't expose sensitive information
- File Permissions: Set restrictive permissions on config files:
chmod 600 .env.cloudflare
chmod 600 sync-cloudflare-direct.py
chmod 755 sync-to-cloudflare.sh
✅ Global Access: Access memories from anywhere via Cloudflare's global network
✅ Disaster Recovery: Complete backup if local system fails
✅ Zero Downtime: Local service remains unaffected
✅ Cost Effective: Pay-per-use pricing, typically <$1/month
✅ Automatic: Set-and-forget daily synchronization
- Remote Access: Set up a simple web interface to query Cloudflare D1 directly
- Bi-directional Sync: Implement conflict resolution for two-way synchronization
- Monitoring: Add alerting for failed sync attempts (see the sketch after this list)
- Optimization: Implement incremental sync for large databases
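For the monitoring item above, one lightweight sketch is a small check script that exits non-zero when today's log contains a failure, so it can be wired into cron mail or any notifier you already use (the script name and logic are illustrative only):
#!/bin/bash
# check-sync-alert.sh (hypothetical): exit non-zero if today's sync reported a failure
# Assumes the ISO-style timestamps written by sync-to-cloudflare.sh
LOG_FILE="/var/log/memory-cloudflare-sync.log"
TODAY=$(date '+%Y-%m-%d')
if grep "$TODAY" "$LOG_FILE" | grep -q "Sync failed"; then
    echo "Cloudflare memory sync failed on $TODAY - check $LOG_FILE"
    exit 1
fi
exit 0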
For issues with this setup:
- Check the troubleshooting section above
- Review logs in /var/log/memory-cloudflare-sync.log
- Test individual components (export, API connection, schema)
- Consult the MCP Memory Service documentation for backend-specific issues
This guide was successfully tested with:
- 941 memories in local SQLite-vec database
- Cloudflare D1 database with existing schema
- Daily automated sync at 3 AM via cron
- Zero downtime - local service unaffected during sync
- ~1 minute sync time for 941 memories with rate limiting
Why This Approach Works:
- Local Primary: SQLite-vec provides fast local access (<5ms queries)
- Cloud Backup: Cloudflare D1 provides global availability and disaster recovery
- Hybrid Benefits: Best of both worlds - speed + reliability
- Cost Effective: Minimal Cloudflare usage, typically <$1/month
- Simple Maintenance: Single Python script, standard cron scheduling