FastAPI workbench for text embedding (Gemma-300m with Matryoshka) and summarization (Gemma/Gemini). Features hardware acceleration, caching, and secure endpoints for local LLM integration.
-
Updated
Oct 17, 2025 - Python
FastAPI workbench for text embedding (Gemma-300m with Matryoshka) and summarization (Gemma/Gemini). Features hardware acceleration, caching, and secure endpoints for local LLM integration.
Advanced Golang concepts 🔵
Add a description, image, and links to the rate-limiting-caching topic page so that developers can more easily learn about it.
To associate your repository with the rate-limiting-caching topic, visit your repo's landing page and select "manage topics."