# Improved Caching Strategy for OpenSensor API

## Problem Analysis

The original caching implementation had a fundamental flaw: it built cache keys from every function parameter, including the timestamps of the requested range. This meant:

```python
# OLD: Cache key included ALL parameters, including timestamps
@redis_cache(ttl_seconds=300)
def get_device_info_cached(device_id: str, start_date, end_date, resolution):
    # Cache key: opensensor:get_device_info_cached:md5(device_id + start_date + end_date + resolution)
    ...
```

**Issues:**
- Different time ranges = different cache keys
- Different resolutions = different cache keys
- Cache hit rate was essentially 0% for time-series data (see the sketch below)
- Wasted Redis memory with duplicate device lookups
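
To make the failure mode concrete, here is a hypothetical reconstruction of the old key scheme (`old_cache_key` is illustrative, not the actual decorator internals): two requests for the same logical query, issued one second apart, can never share a cache entry.

```python
from datetime import datetime, timedelta
from hashlib import md5

def old_cache_key(device_id, start_date, end_date, resolution):
    # Every parameter, including raw timestamps, ends up in the key
    raw = f"{device_id}{start_date}{end_date}{resolution}"
    return "opensensor:get_device_info_cached:" + md5(raw.encode()).hexdigest()

now = datetime.utcnow()
key_1 = old_cache_key("dev-1", now - timedelta(hours=1), now, "minute")
key_2 = old_cache_key("dev-1", now - timedelta(hours=1), now + timedelta(seconds=1), "minute")
assert key_1 != key_2  # same logical query, guaranteed cache miss
```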

## Solution: Granular Caching Strategy

### 1. **Device Metadata Caching** (Long TTL - 24 hours)
```python
# Cache device info separately from query parameters
cache_key = f"opensensor:device_meta:{device_id}"
# TTL: 24 hours (device metadata rarely changes)
```
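
A minimal sketch of the helpers behind this key scheme, assuming a plain redis-py client and JSON serialization (the function names follow the `sensor_cache` calls used later in this document):

```python
import json
from typing import Optional

import redis

r = redis.Redis()
DEVICE_META_TTL_SECONDS = 24 * 60 * 60  # 24 hours

def cache_device_metadata(device_id: str, metadata: dict) -> None:
    # Keyed by device_id only, independent of any query parameters
    key = f"opensensor:device_meta:{device_id}"
    r.setex(key, DEVICE_META_TTL_SECONDS, json.dumps(metadata, default=str))

def get_device_metadata(device_id: str) -> Optional[dict]:
    raw = r.get(f"opensensor:device_meta:{device_id}")
    return json.loads(raw) if raw else None
```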

### 2. **Pipeline Result Caching** (Short TTL - 15 minutes)
```python
# Cache MongoDB aggregation results by pipeline hash
pipeline_hash = md5(serialize(core_pipeline_without_pagination))
cache_key = f"opensensor:pipeline:{pipeline_hash}"
# TTL: 15 minutes (good balance for time-series data)
```
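
A sketch of the content-based hashing step (assumed behavior of `sensor_cache.generate_pipeline_hash`): pagination stages are dropped before hashing, so every page of the same query maps to one cache entry.

```python
import json
from hashlib import md5
from typing import List

PAGINATION_STAGES = {"$skip", "$limit"}

def generate_pipeline_hash(pipeline: List[dict]) -> str:
    # Drop pagination stages so every page of the same query shares a key
    core = [stage for stage in pipeline if not PAGINATION_STAGES & stage.keys()]
    # sort_keys gives a stable hash regardless of dict key ordering
    return md5(json.dumps(core, sort_keys=True, default=str).encode()).hexdigest()
```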

### 3. **Aggregated Data Chunks** (Medium TTL - 30 minutes)
```python
# Cache pre-aggregated data by time buckets
cache_key = f"opensensor:agg:{data_type}:{device_id}:{time_bucket}:{resolution}"
# TTL: 30 minutes (for frequently accessed data ranges)
```
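
The `time_bucket` component might be derived by flooring the timestamp to the bucket size, so nearby queries land on the same key; a sketch with an assumed helper:

```python
from datetime import datetime

BUCKET_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}

def time_bucket(ts: datetime, resolution: str) -> int:
    # Floor the timestamp to the bucket size so nearby queries share a key
    size = BUCKET_SECONDS[resolution]
    return int(ts.timestamp()) // size * size

cache_key = f"opensensor:agg:temp:dev-1:{time_bucket(datetime.utcnow(), 'hour')}:hour"
```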

## Key Improvements

### ✅ **Cache Key Independence**
- Device metadata cached independently of query parameters
- Pipeline results cached by content hash, not input parameters
- Time-based bucketing for aggregated data

### ✅ **Intelligent TTL Strategy**
- **Device metadata**: 24 hours (rarely changes)
- **Pipeline results**: 15 minutes (balances freshness against performance)
- **Aggregated chunks**: 30 minutes (frequently accessed ranges); see the constants sketch below
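
Assuming the tiers are centralized as module-level constants (names hypothetical), the whole policy fits in three lines:

```python
# Hypothetical constant names collecting the three TTL tiers in one place
DEVICE_META_TTL_SECONDS = 24 * 60 * 60  # device metadata: 24 hours
PIPELINE_TTL_SECONDS = 15 * 60          # pipeline results: 15 minutes
AGG_CHUNK_TTL_SECONDS = 30 * 60         # aggregated chunks: 30 minutes
```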

### ✅ **Smart Cache Invalidation**
```python
# Invalidate relevant caches when new data arrives
def _record_data_to_ts_collection(collection, environment, user=None):
    # ... insert data ...
    device_id = environment.device_metadata.device_id
    sensor_cache.invalidate_device_cache(device_id)
```
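
A sketch of what `invalidate_device_cache` might look like (assumed implementation); `scan_iter` is used instead of `KEYS` so invalidation does not block Redis on a large keyspace:

```python
import redis

r = redis.Redis()

def invalidate_device_cache(device_id: str) -> int:
    # Delete the device's metadata entry and any aggregated chunks for it
    deleted = 0
    for pattern in (f"opensensor:device_meta:{device_id}",
                    f"opensensor:agg:*:{device_id}:*"):
        for key in r.scan_iter(match=pattern):
            deleted += r.delete(key)
    return deleted
```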

### ✅ **Size-Aware Caching**
```python
# Don't cache results larger than 1MB
if result and len(json.dumps(result, default=str)) < 1024 * 1024:
    sensor_cache.cache_pipeline_result(pipeline_hash, result, ttl_minutes)
```

### ✅ **Graceful Degradation**
- Falls back to direct database queries when Redis is unavailable (see the sketch below)
- No service disruption if caching fails
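
A minimal sketch of the fallback pattern, assuming redis-py's `RedisError` hierarchy: any cache failure is treated as a miss, never surfaced as an API error.

```python
import logging

import redis

logger = logging.getLogger("opensensor.cache_strategy")

def safe_cache_get(r: redis.Redis, key: str):
    # Any Redis failure degrades to a cache miss instead of an API error
    try:
        return r.get(key)
    except redis.RedisError as exc:
        logger.warning("Cache unavailable, falling back to DB: %s", exc)
        return None
```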

## Implementation Details

### Cache-Aware Device Lookup
```python
from datetime import datetime
from typing import List, Tuple

def cache_aware_device_lookup(device_id: str) -> Tuple[List[str], str]:
    # Try cache first
    cached_metadata = sensor_cache.get_device_metadata(device_id)
    if cached_metadata:
        return cached_metadata['device_ids'], cached_metadata['device_name']

    # Cache miss - fetch and cache
    api_keys, _ = get_api_keys_by_device_id(device_id)
    device_ids, device_name = reduce_api_keys_to_device_ids(api_keys, device_id)

    metadata = {
        'device_ids': device_ids,
        'device_name': device_name,
        'cached_at': datetime.utcnow().isoformat()
    }
    sensor_cache.cache_device_metadata(device_id, metadata)
    return device_ids, device_name
```

### Cache-Aware Aggregation
```python
import json
from typing import List

def cache_aware_aggregation(collection, pipeline: List[dict], cache_ttl_minutes: int = 15) -> List[dict]:
    # Generate hash excluding pagination, so every page of the same query
    # shares one cache entry; callers apply pagination to the returned list
    pipeline_hash = sensor_cache.generate_pipeline_hash(pipeline)

    # Try cache
    cached_result = sensor_cache.get_pipeline_result(pipeline_hash)
    if cached_result is not None:
        return cached_result

    # Execute and cache (results over 1MB are not cached)
    result = list(collection.aggregate(pipeline))
    if result and len(json.dumps(result, default=str)) < 1024 * 1024:
        sensor_cache.cache_pipeline_result(pipeline_hash, result, cache_ttl_minutes)

    return result
```
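
For example (collection and field names hypothetical), an hourly-average query would be cached on first execution and served from Redis afterwards:

```python
# Hypothetical usage: the first call hits MongoDB, repeat calls hit Redis
pipeline = [
    {"$match": {"metadata.device_id": "dev-1"}},
    {"$group": {"_id": "$hour_bucket", "avg_temp": {"$avg": "$temp"}}},
    {"$sort": {"_id": 1}},
]
rows = cache_aware_aggregation(db.temperature, pipeline, cache_ttl_minutes=15)
```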

## Performance Benefits

### Expected Improvements:
- **Cache Hit Rate**: 80-90% for device lookups (vs. ~0% before)
- **Response Time**: 10-50 ms improvement for cached requests
- **Database Load**: Significant reduction in MongoDB aggregation queries
- **Memory Efficiency**: No duplicate device metadata across cache entries

### Cache Effectiveness by Use Case:

1. **Dashboard Loading**: High hit rate for device metadata
2. **Time Series Charts**: Medium hit rate for common time ranges
3. **Real-time Updates**: Smart invalidation ensures data freshness
4. **API Pagination**: Same core data cached across pages (see the sketch below)
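
A hypothetical illustration of point 4: because pagination stages are excluded from the pipeline hash, pagination happens on the cached list, and page 2 reuses the entry populated by page 1.

```python
# Hypothetical: one cached core query serves every page
core_pipeline = [
    {"$match": {"metadata.device_id": "dev-1"}},
    {"$sort": {"timestamp": 1}},
]
rows = cache_aware_aggregation(collection, core_pipeline)  # cached once
page_1, page_2 = rows[:50], rows[50:100]                   # sliced per request
```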

## Migration Strategy

### Phase 1: ✅ **Implement New Strategy**
- [x] Create `cache_strategy.py` with improved caching logic
- [x] Update `collection_apis.py` to use new caching functions
- [x] Add cache invalidation to data recording functions
- [x] Create comprehensive tests

### Phase 2: **Deploy and Monitor**
- [ ] Deploy to staging environment
- [ ] Monitor cache hit rates via `/cache/stats` endpoint
- [ ] Verify performance improvements
- [ ] Monitor Redis memory usage

### Phase 3: **Optimize and Scale**
- [ ] Fine-tune TTL values based on usage patterns
- [ ] Add more granular cache invalidation
- [ ] Consider implementing cache warming for popular devices

## Monitoring and Debugging

### Cache Statistics Endpoint
```bash
GET /cache/stats
```

Returns:
```json
{
  "status": "connected",
  "opensensor_keys": 1250,
  "redis_version": "6.2.0",
  "used_memory": "15.2M",
  "keyspace_hits": 8420,
  "keyspace_misses": 1580,
  "hit_rate": "84.2%"
}
```
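
A sketch of how such an endpoint might assemble this payload from redis-py's `INFO` command (the FastAPI route and the key-counting approach are assumptions based on the fields shown above):

```python
from fastapi import FastAPI
import redis

app = FastAPI()
r = redis.Redis()

@app.get("/cache/stats")
def cache_stats():
    try:
        info = r.info()
    except redis.RedisError:
        return {"status": "disconnected"}
    hits, misses = info["keyspace_hits"], info["keyspace_misses"]
    total = hits + misses
    return {
        "status": "connected",
        # Count only this application's keys
        "opensensor_keys": sum(1 for _ in r.scan_iter(match="opensensor:*")),
        "redis_version": info["redis_version"],
        "used_memory": info["used_memory_human"],
        "keyspace_hits": hits,
        "keyspace_misses": misses,
        "hit_rate": f"{hits / total:.1%}" if total else "n/a",
    }
```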

### Cache Management
```bash
# Clear all cache
POST /cache/clear

# Invalidate specific patterns
POST /cache/invalidate
{"pattern": "device_meta:*"}
```

### Debugging Cache Behavior
```python
# Enable debug logging
import logging
logging.getLogger('opensensor.cache_strategy').setLevel(logging.DEBUG)
```

## Backward Compatibility

- ✅ **Zero Breaking Changes**: All existing API endpoints work unchanged
- ✅ **Graceful Fallback**: Works without Redis (falls back to direct DB queries)
- ✅ **Incremental Adoption**: Can be deployed alongside existing caching

## Future Enhancements

### Potential Optimizations:
1. **Cache Warming**: Pre-populate cache for popular devices
2. **Compression**: Compress large cached results
3. **Distributed Caching**: Shard cache across multiple Redis instances
4. **Predictive Caching**: Cache likely-to-be-requested time ranges
5. **Cache Analytics**: Track which data types benefit most from caching

### Advanced Features:
1. **Time-based Expiration**: Expire cache entries based on data age
2. **Smart Prefetching**: Prefetch adjacent time ranges
3. **Cache Hierarchy**: Multi-level caching (Redis + in-memory)
4. **Query Optimization**: Cache intermediate aggregation steps

## Conclusion

This improved caching strategy addresses the fundamental issue with timestamp-dependent cache keys by:

1. **Separating concerns**: Device metadata vs. query results
2. **Using content-based hashing**: Pipeline results cached by content, not parameters
3. **Implementing intelligent TTLs**: Different expiration times for different data types
4. **Adding smart invalidation**: Cache cleared when new data arrives

The result is a much more effective caching system that should significantly improve API performance while maintaining data freshness and consistency.