This script uses the YouTube Data API to find all videos from a specific channel and then downloads the transcript for each video.
- Fetches all video IDs from a given YouTube channel.
- Downloads transcripts for each video.
- Saves each transcript as a separate `.md` file in a `transcripts` directory.
- Compiles all transcripts into a single `master_transcript.md` file, with video titles and URLs.
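
As an illustration of the first step, here is a minimal sketch of how every video ID for a channel could be collected through the YouTube Data API v3. It assumes the common approach of looking up the channel's uploads playlist with `channels.list` and then paging through `playlistItems.list`. The names `api_get` and `fetch_video_ids` are hypothetical, and the actual `scraper.py` may do this differently, for example through a client library.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://www.googleapis.com/youtube/v3"


def api_get(resource, params):
    """Call one YouTube Data API v3 endpoint and return the decoded JSON."""
    url = f"{API_BASE}/{resource}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def fetch_video_ids(channel_id, api_key, get=api_get):
    """Return every video ID in the channel's uploads playlist.

    `get` is injectable (an assumption of this sketch) so the paging
    logic can be exercised without network access.
    """
    # channels.list exposes the ID of the playlist that holds all uploads.
    channel = get("channels", {"part": "contentDetails", "id": channel_id, "key": api_key})
    uploads = channel["items"][0]["contentDetails"]["relatedPlaylists"]["uploads"]

    video_ids, page_token = [], None
    while True:
        params = {
            "part": "contentDetails",
            "playlistId": uploads,
            "maxResults": 50,  # largest page size the endpoint accepts
            "key": api_key,
        }
        if page_token:
            params["pageToken"] = page_token
        page = get("playlistItems", params)
        video_ids += [item["contentDetails"]["videoId"] for item in page["items"]]
        # Keep requesting pages until the API stops returning a token.
        page_token = page.get("nextPageToken")
        if not page_token:
            return video_ids
```

Calling `fetch_video_ids("<channel-id>", api_key)` would then return a flat list of IDs. Each page of 50 results costs one API request, so large channels need several calls.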
- Clone the repository:

      git clone <repository-url>
      cd youtube-scraper

- Create a virtual environment:

      python3 -m venv venv
      source venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Set up your API key:
  - Create a file named `.env` in the root of the project.
  - Add your YouTube Data API key to the `.env` file like this: `YOUTUBE_API_KEY="YOUR_API_KEY_HERE"`
  - You can get a YouTube API key from the Google Cloud Console.

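
For reference, here is a rough sketch of how a script could read `YOUTUBE_API_KEY` from the `.env` file using only the standard library. The helper names `load_env_file` and `get_api_key` are hypothetical; the actual `scraper.py` may rely on a package from `requirements.txt` instead.

```python
import os


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ."""
    with open(path, encoding="utf-8") as env_file:
        for line in env_file:
            line = line.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Drop surrounding quotes; a value already set in the
            # environment takes precedence over the file.
            os.environ.setdefault(key.strip(), value.strip().strip("\"'"))


def get_api_key():
    """Return YOUTUBE_API_KEY or fail with a clear message."""
    api_key = os.environ.get("YOUTUBE_API_KEY")
    if not api_key:
        raise RuntimeError("YOUTUBE_API_KEY is not set; add it to your .env file")
    return api_key
```

A script would call `load_env_file()` once at startup and then `get_api_key()` wherever the key is needed. Keeping the key in `.env` rather than in the source makes it easier to avoid committing it to the repository.
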
Run the script from your terminal:

    python scraper.py

The script creates a `transcripts` directory and a `master_transcript.md` file in your project folder.
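
To make the output layout concrete, the sketch below shows one way the per-video files and the combined file could be written. It is an illustration only: the function name `save_transcripts`, the shape of the `videos` list, the use of the video ID as the filename, and the heading format are all assumptions, not details confirmed by the project.

```python
from pathlib import Path


def save_transcripts(videos, output_dir="."):
    """Write one .md file per video plus a combined master_transcript.md.

    `videos` is a list of dicts with "id", "title", and "transcript" keys
    (a hypothetical shape chosen for this sketch).
    """
    root = Path(output_dir)
    transcripts_dir = root / "transcripts"
    transcripts_dir.mkdir(parents=True, exist_ok=True)

    sections = []
    for video in videos:
        url = f"https://www.youtube.com/watch?v={video['id']}"
        # Each section carries the title and URL so the master file stays navigable.
        section = f"## {video['title']}\n\n{url}\n\n{video['transcript']}\n"
        # Assumption: files are named after the video ID.
        (transcripts_dir / f"{video['id']}.md").write_text(section, encoding="utf-8")
        sections.append(section)

    master_path = root / "master_transcript.md"
    master_path.write_text("\n".join(sections), encoding="utf-8")
    return master_path
```

Naming files by video ID avoids problems with characters in titles that are not valid in filenames.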