Transform long-form videos into engaging TikTok clips with AI-powered quote extraction and speaker identification.
QuoteMiner is an intelligent video processing pipeline that automatically extracts motivational quotes from videos and creates perfectly formatted TikTok clips with speaker diarization, face detection, and optional background video overlays.
- Claude AI Integration: Uses Anthropic's Claude to identify engaging, motivational moments
- Smart Duration Filtering: Automatically selects clips between 20-90 seconds
- Context-Aware Selection: Identifies standalone, emotionally engaging content perfect for social media
- Speaker Diarization: Identifies who is speaking using pyannote-audio
- Face Detection: OpenCV-powered face tracking for optimal cropping
- TikTok Format: Automatic conversion to 9:16 aspect ratio (1080x1920)
- Background Video Support: Optional background video overlay with smart audio mixing
- Multiprocessing: Up to 4 parallel workers for fast processing
- Memory Efficient: Processes small clips instead of entire videos
- Mini PC Friendly: Optimized for systems with 8GB RAM
- Hardware Adaptive: Automatic GPU/CPU detection and fallback
- High-Quality Videos: 1080p TikTok-ready clips
- Smart Audio Mixing: Preserves voice clarity with 15% background audio
- Intelligent Naming: Files named with speaker ID and content preview
- Automatic Cleanup: Temporary files managed automatically
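The 9:16 conversion listed above boils down to a crop-window calculation: given a detected face position, pick the widest 9:16 window that fits in the source frame and center it on the speaker. A simplified sketch (the function name is illustrative; the actual logic lives in `video_cropper.py` and may differ):

```python
def crop_window_9x16(frame_w: int, frame_h: int, face_cx: int):
    """Return (x, y, w, h) of a 9:16 crop centered on the face x-position."""
    # Widest 9:16 window that still fits inside the frame height.
    crop_h = frame_h
    crop_w = int(crop_h * 9 / 16)
    if crop_w > frame_w:  # very narrow source: clamp to frame width instead
        crop_w = frame_w
        crop_h = int(crop_w * 16 / 9)
    # Center horizontally on the face, clamped to the frame edges.
    x = min(max(face_cx - crop_w // 2, 0), frame_w - crop_w)
    y = (frame_h - crop_h) // 2
    return x, y, crop_w, crop_h
```

For a 1920x1080 landscape source this yields a 607x1080 window that follows the speaker, which is then scaled up to 1080x1920 for output.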
- Clone the repository:

```bash
git clone https://github.com/yourusername/QuoteMiner.git
cd QuoteMiner
```

- Install dependencies:

```bash
uv sync
```
- Set up environment variables by creating a `.env` file in the project root:

```
ANTHROPIC_API_KEY=your_anthropic_api_key_here
HUGGINGFACE_TOKEN=your_huggingface_token_here
```
```bash
cd src

# Process a single video
uv run python main.py --video-path /path/to/your/video.mp4

# Process every video in a directory
uv run python main.py --input-dir /path/to/video/directory/

# Overlay a background video
uv run python main.py --video-path video.mp4 --background-video background.mp4

# Download and process a YouTube video
uv run python main.py --video-link "https://youtube.com/watch?v=VIDEO_ID"
```
```
QuoteMiner/
├── src/
│   ├── config/
│   │   ├── __init__.py
│   │   └── settings.py          # Configuration settings
│   ├── models/
│   │   ├── __init__.py
│   │   ├── quote.py             # Quote data model
│   │   └── video_models.py      # Video processing models
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── logger.py            # Logging utilities
│   │   ├── quote_extraction.py  # AI quote extraction
│   │   ├── transcription.py     # Audio transcription
│   │   ├── video_cropper.py     # Video processing pipeline
│   │   └── video_downloader.py  # YouTube video downloads
│   └── main.py                  # Main application entry point
├── data/
│   ├── videos/                  # Input videos
│   ├── quotes/                  # Extracted quotes (JSON)
│   ├── tiktok_clips/            # Output TikTok videos
│   └── transcriptions/          # Audio transcriptions
├── .env                         # Environment variables
├── pyproject.toml               # Project dependencies
└── README.md
```
For more control over the video processing pipeline:

```bash
cd src
uv run python -m utils.video_cropper \
    /path/to/video.mp4 \
    /path/to/quotes.json \
    /path/to/output/ \
    --background_video /path/to/background.mp4
```
Edit `src/config/settings.py` to customize:
- Hardware Settings: GPU/CPU preferences, memory constraints
- Model Selection: Whisper model size for transcription
- Processing Limits: Maximum workers, file paths
- Quality Settings: Video resolution, audio quality
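A `settings.py` fragment might look like the following. `CONSTRAINT` and `MAX_WORKERS` are the names referenced in the troubleshooting section below; the remaining names and values are illustrative placeholders, not the project's actual settings:

```python
# src/config/settings.py (illustrative fragment)
# CONSTRAINT and MAX_WORKERS appear in the troubleshooting section;
# WHISPER_MODEL is a hypothetical placeholder name.
CONSTRAINT = True          # memory-constrained mode for 8GB systems
MAX_WORKERS = 4            # parallel workers; drop to 2 or 1 if RAM is tight
WHISPER_MODEL = "base.en"  # transcription model; "large-v2" for best quality
```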
The system expects quotes in JSON format:

```json
[
  {
    "start": "49.44",
    "content": "This is a motivational quote from the video.",
    "end": "58.24"
  }
]
```
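Note that `start` and `end` are stored as strings, so a loader should convert them before comparing. A minimal sketch (the function name is illustrative) that also applies the 20-90 second duration filter described above:

```python
import json


def load_quotes(path: str, min_s: float = 20.0, max_s: float = 90.0):
    """Load a quotes JSON file, keeping only clips in the duration window."""
    with open(path) as f:
        quotes = json.load(f)
    # Timestamps are strings in the JSON, so cast to float before filtering.
    return [
        q for q in quotes
        if min_s <= float(q["end"]) - float(q["start"]) <= max_s
    ]
```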
- Long-form podcast or interview video
- YouTube motivational content
- Educational videos with engaging moments
- Professional TikTok-format clips (9:16 aspect ratio)
- Speaker-focused cropping with face detection
- Clear audio with optional background music
- Filename format: `quote_1_SPEAKER_00_motivational_content.mp4`
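That naming scheme (clip index, diarized speaker ID, slugified content preview) can be sketched as below; the function name and word count are illustrative, not the project's exact implementation:

```python
import re


def clip_filename(index: int, speaker: str, content: str, words: int = 3) -> str:
    """Build a name like quote_1_SPEAKER_00_motivational_content.mp4."""
    # Keep the first few words of the quote as a preview, slugified.
    preview = "_".join(content.lower().split()[:words])
    preview = re.sub(r"[^a-z0-9_]", "", preview)
    return f"quote_{index}_{speaker}_{preview}.mp4"
```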
- CPU: Multi-core processor (4+ cores recommended)
- RAM: 8GB (16GB recommended for batch processing)
- Storage: 10GB free space for processing
- GPU: Optional (CPU fallback available)
- CUDA Support: Automatic GPU acceleration when available
- Memory Management: Efficient processing for resource-constrained systems
- Parallel Processing: Scales with available CPU cores
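The GPU/CPU fallback amounts to a device check at startup. A simplified sketch (the project's actual detection logic may differ), which also degrades gracefully when `torch` is not installed:

```python
def pick_device() -> str:
    """Prefer CUDA when PyTorch sees a GPU; otherwise fall back to CPU."""
    try:
        import torch  # optional on CPU-only installs
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"
```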
"No HuggingFace token provided"
- Ensure
HUGGINGFACE_TOKEN
is set in your.env
file - Get a token from Hugging Face
"ANTHROPIC_API_KEY environment variable not set"
- Add your Anthropic API key to the
.env
file - Get an API key from Anthropic Console
Memory Issues
- Set `CONSTRAINT = True` in `src/config/settings.py`
- Reduce `MAX_WORKERS` to 2 or 1
- Use a smaller Whisper model (`base.en` instead of `large-v2`)
Video Processing Errors
- Ensure input videos are in MP4 format
- Check that OpenCV can access your video files
- Verify sufficient disk space for processing
Enable detailed logging by modifying the logger level in `src/main.py`:

```python
logger = Logger(name="QuoteMiner", filename="QuoteMiner.log", level=logging.DEBUG)
```
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes and test thoroughly
- Commit with descriptive messages: `git commit -m 'feat: add amazing feature'`
- Push to your branch: `git push origin feature/amazing-feature`
- Open a Pull Request
```bash
# Install development dependencies
uv sync --all-extras

# Run tests (when available)
uv run pytest

# Format code
uv run black src/

# Type checking
uv run mypy src/
```
This project is licensed under the GNU AGPL v3 - see the LICENSE file for details.
- Anthropic - Claude AI for intelligent quote extraction
- Hugging Face - pyannote-audio for speaker diarization
- OpenAI - Whisper models for transcription
- MoviePy - Video processing capabilities
- OpenCV - Computer vision and face detection
- Documentation: [Coming Soon]
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Made with ❤️ for content creators who want to transform long-form content into engaging short clips.

QuoteMiner - Where long videos become viral moments. ✨