A simple, API-key–only Python script that downloads every available transcript from a single YouTube channel and combines them into one text file for downstream NLP, search, or archival.
Author: Vladislav Virtonen
Year: 2025
Feature | Details |
---|---|
No OAuth | Works with a lightweight YouTube Data API v3 key. |
Channel-wide | Iterates through all public videos on a channel. |
Titles included | Each transcript section is headed with the video title. |
Plain text output | Easy to index, grep, or feed into LLMs. |
Minimal dependencies | Only two pip installs. |
Need every caption or subtitle from a YouTube channel without copying them one by one?
This tool automates the entire process: point it at any channel, and it downloads all available transcripts at once—manual or auto-generated—ready for NLP, archives, or personal study.
├── youtube_transcript_downloader.py # Main script (fully documented)
└── README.md # You are here
git clone https://github.com/<yourusername>/youtube-transcript-scraper.git
cd youtube-transcript-scraper
pip install google-api-python-client youtube-transcript-api # see “Dependencies” below
python youtube_transcript_downloader.py
After the script finishes, you will find:
all_transcripts.txt
Each section looks like:
========== [Title] ==========
[transcript] ...
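As a rough illustration of that layout, here is a minimal sketch of how one section could be built. The function name `format_section` is hypothetical and not taken from the script; only the header style mirrors the example above.

```python
def format_section(title: str, transcript: str) -> str:
    """Return one output section: a title header line, then the transcript text."""
    header = f"========== {title} =========="
    return f"{header}\n{transcript.strip()}\n"


# Hypothetical sample data, only to show the resulting layout.
sections = [
    format_section("First video", "hello world"),
    format_section("Second video", "more text"),
]
combined = "\n".join(sections)  # blank line between sections
print(combined)
```

Because every header starts with the same run of `=` characters, the combined file stays easy to split or grep per video.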
Item | Notes |
---|---|
Python ≥ 3.8 | Tested on 3.8–3.12 |
YouTube Data API v3 key | Free in Google Cloud Console |
Channel ID | Begins with UC… (see next section) |
- Open Google Cloud Console → APIs & Services → New Project.
- In API Library, enable YouTube Data API v3.
- Go to Credentials → Create Credentials → API Key.
- (Optional) Restrict it to “YouTube Data API v3” only.
Method A – Fast Online: Paste https://www.youtube.com/@channel into any online “YouTube Channel ID Finder”.
Method B – Manual (no third-party tools)
- Open the channel page in Chrome/Firefox.
- Right-click → View Page Source.
- Press Ctrl/Cmd + F and search for "channelId":
- Copy the value that looks like UCxxxxxxxxxxxxxxxxxxxxxx.
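If you have already saved the page source, the manual search in Method B can be approximated with a regular expression. This is a sketch only: the helper name `find_channel_id` is hypothetical, and it assumes the ID appears as `"channelId":"UC..."` and contains only letters, digits, `_`, and `-`. A page may mention more than one channel, so check that the first match is the one you want.

```python
import re
from typing import Optional

# Assumption: the ID is embedded in the page source as "channelId":"UC...".
CHANNEL_ID_PATTERN = re.compile(r'"channelId"\s*:\s*"(UC[0-9A-Za-z_-]+)"')


def find_channel_id(page_source: str) -> Optional[str]:
    """Return the first channel ID found in the page source, or None."""
    match = CHANNEL_ID_PATTERN.search(page_source)
    return match.group(1) if match else None


# Hypothetical fragment of page source, using the placeholder ID from this README.
sample = '{"header":{"channelId":"UCxxxxxxxxxxxxxxxxxxxxxx","title":"Demo"}}'
print(find_channel_id(sample))
```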
Open youtube_transcript_downloader.py and fill in:
API_KEY = "YOUR_API_KEY_HERE"
CHANNEL_ID = "UCxxxxxxxxxxxxxxxxxxxxxx"
Then run:
python youtube_transcript_downloader.py
The script will:
- List every video on the channel (50 per page, auto-paginated).
- Fetch each video's transcript (manual or auto-generated) via youtube-transcript-api.
- Write the transcripts to all_transcripts.txt, with clear title headers.
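The three steps above can be sketched as follows. This is not the script's actual code: `iter_videos`, `combine_transcripts`, and the injected callables `fetch_page` and `fetch_transcript` are hypothetical names. The item shape (`snippet.resourceId.videoId`) assumes responses from the Data API's `playlistItems().list(...)` call on the channel's uploads playlist; search results use `id.videoId` instead. In a real run, `fetch_page` would wrap the API request and `fetch_transcript` would wrap youtube-transcript-api, whose call signature depends on the installed version.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple


def iter_videos(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[Tuple[str, str]]:
    """Yield (video_id, title) pairs, following nextPageToken until it is absent."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        for item in page.get("items", []):
            snippet = item["snippet"]
            yield snippet["resourceId"]["videoId"], snippet["title"]
        token = page.get("nextPageToken")
        if not token:
            break


def combine_transcripts(
    videos: Iterable[Tuple[str, str]],
    fetch_transcript: Callable[[str], str],
) -> str:
    """Build the combined text, skipping videos whose transcript cannot be fetched."""
    sections = []
    for video_id, title in videos:
        try:
            text = fetch_transcript(video_id)
        except Exception:
            continue  # assumption: a failure means no transcript is available
        sections.append(f"========== {title} ==========\n{text}\n")
    return "\n".join(sections)


# Fake data standing in for API responses, so the sketch runs offline.
def make_item(video_id: str, title: str) -> dict:
    return {"snippet": {"title": title, "resourceId": {"videoId": video_id}}}


fake_pages = {
    None: {"items": [make_item("a1", "Intro")], "nextPageToken": "p2"},
    "p2": {"items": [make_item("b2", "Part 2"), make_item("c3", "No captions")]},
}


def fake_transcript(video_id: str) -> str:
    if video_id == "c3":
        raise LookupError("no transcript")
    return f"text of {video_id}"


output = combine_transcripts(iter_videos(lambda token: fake_pages[token]), fake_transcript)
print(output)
```

Passing the fetchers in as callables keeps the pagination and file-layout logic testable without an API key or network access.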
pip install google-api-python-client youtube-transcript-api
(Alternatively, use pip install -r requirements.txt if you add one.)
Feel free to fork and PR:
- Add JSON/CSV export
- Push transcripts to Google Drive or S3
- Rate-limit handling for very large channels
- Language fallback chains
MIT License – see LICENSE if added.
- google-api-python-client – Google's official Python API client, used here for the YouTube Data API.
- youtube-transcript-api – elegant caption fetcher by @jdepoix.
Happy scraping! 🎬📝