mrdavtan/YT_Transcribe

YT_Transcribe

YouTube Video Transcription CLI Tool

Description

The YouTube Video Transcription CLI Tool is a Python script that transcribes YouTube videos and saves the transcriptions as formatted text files. It uses yt-dlp to download the videos and the OpenAI Whisper library to transcribe them, then processes and formats the resulting text.

The tool supports both single video transcription and batch processing of multiple videos from a list of URLs.
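
The download-then-transcribe flow described above could look roughly like the sketch below. It is not the script's actual code: the function names, the `bestaudio/best` format choice, and the `base` model size are assumptions made for illustration. The yt-dlp and Whisper imports are placed inside the function so the option helper can be used without those packages installed.

```python
import os


def build_ydl_opts(video_dir):
    """Options for yt_dlp.YoutubeDL (hypothetical helper)."""
    return {
        # Assumption: audio alone is enough for transcription.
        "format": "bestaudio/best",
        # Save downloads as <video_dir>/<title>.<ext>.
        "outtmpl": os.path.join(video_dir, "%(title)s.%(ext)s"),
        "quiet": True,
    }


def download_and_transcribe(url, video_dir="videos", model_name="base"):
    """Download one video and return (title, segments)."""
    # Imported lazily: both packages are heavy third-party dependencies.
    import yt_dlp
    import whisper

    os.makedirs(video_dir, exist_ok=True)
    with yt_dlp.YoutubeDL(build_ydl_opts(video_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
        media_path = ydl.prepare_filename(info)

    model = whisper.load_model(model_name)
    result = model.transcribe(media_path)
    # Each segment is a dict with "start", "end" (seconds) and "text".
    return info.get("title", "untitled"), result["segments"]
```

Whisper decodes media through ffmpeg, so ffmpeg needs to be available on the PATH for a sketch like this to run.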

Features

  • Transcribe single or multiple YouTube videos using Whisper's speech recognition
  • Download videos efficiently using yt-dlp
  • Preserve natural speech segments with accurate timestamps
  • Automatically format and capitalize text for improved readability
  • Generate organized output with video title and timestamps
  • Save transcriptions in a structured directory format
  • Process multiple videos from a text file
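
As an illustration of the formatting and capitalization feature, here is a minimal sketch of one way to tidy a transcribed segment. The helper name `tidy_segment` and the exact rules (collapse whitespace, capitalize sentence starts) are assumptions; the script's own text processing may differ.

```python
import re


def tidy_segment(text):
    """Collapse runs of whitespace and capitalize the start of each sentence."""
    text = re.sub(r"\s+", " ", text).strip()
    if not text:
        return text
    # Uppercase a lowercase letter at the start of the text,
    # or one that follows ".", "!" or "?" and a single space.
    return re.sub(
        r"(^|[.!?] )([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )
```

For example, `tidy_segment("  hello   world. this is a test ")` returns `"Hello world. This is a test"`.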

Installation

  1. Clone the repository:
git clone https://github.com/mrdavtan/YT_Transcribe.git
  2. Navigate to the project directory:
cd YT_Transcribe
  3. Create a Python virtual environment:
python -m venv .venv
  4. Activate the virtual environment:
source .venv/bin/activate  # On Unix/macOS
.venv\Scripts\activate     # On Windows
  5. Install the required dependencies:
pip install -r requirements.txt

Usage

Single Video Transcription

To transcribe a single YouTube video:

python whisper_transcribe_videos.py -u "https://youtube.com/watch?v=..."
# or
python whisper_transcribe_videos.py --url "https://youtube.com/watch?v=..."

Multiple Videos Transcription

To transcribe multiple videos, create a text file with one YouTube URL per line, then:

python whisper_transcribe_video.py -f urls.txt
# or
python whisper_transcribe_video.py --file urls.txt
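
The two invocation styles above map naturally onto an `argparse` parser. The sketch below is an assumption about how the flags could be wired, not the script's actual parser. In particular, treating `--url` and `--file` as mutually exclusive is a guess.

```python
import argparse


def parse_args(argv=None):
    """Parse -u/--url or -f/--file (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Transcribe YouTube videos with Whisper."
    )
    # Assumption: exactly one input source must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-u", "--url", help="URL of a single YouTube video")
    source.add_argument("-f", "--file", help="text file with one YouTube URL per line")
    return parser.parse_args(argv)
```

Calling `parse_args(["-f", "urls.txt"])` yields a namespace with `file == "urls.txt"` and `url` set to `None`.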

Example urls.txt format:

https://www.youtube.com/watch?v=video1
https://www.youtube.com/watch?v=video2
https://www.youtube.com/watch?v=video3
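
A file in this format can be read with a few lines of standard-library Python. The helper names below are hypothetical, and skipping blank lines and `#` comments is an assumption rather than documented behavior of the script.

```python
from pathlib import Path


def parse_url_lines(text):
    """Return the non-empty, non-comment lines of a URL list."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with "#" are ignored.
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def read_urls(path):
    """Read a URL list file with one URL per line."""
    return parse_url_lines(Path(path).read_text(encoding="utf-8"))
```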

Output

The script will:

  1. Create a videos directory for downloaded videos
  2. Create a transcriptions directory for output files
  3. Generate transcription files with the format: {video_title}_{date}.txt
  4. Include timestamps and properly formatted text in the transcription
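
The naming scheme in step 3 can be sketched as follows. The README does not state the date format or how unsafe characters in the title are handled, so the ISO date and the underscore substitution here are assumptions, and `transcript_path` is a hypothetical helper.

```python
import re
from datetime import date
from pathlib import Path


def transcript_path(title, out_dir="transcriptions", day=None):
    """Build <out_dir>/<video_title>_<date>.txt for a transcription."""
    day = day or date.today()
    # Assumption: replace characters that are invalid in file names on
    # common platforms with an underscore.
    safe_title = re.sub(r'[\\/:*?"<>|]+', "_", title).strip() or "untitled"
    return Path(out_dir) / f"{safe_title}_{day.isoformat()}.txt"
```

Before writing, the output directory can be created with `path.parent.mkdir(parents=True, exist_ok=True)`.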

Output Format

The transcription file will contain:

Video Title

[00:00:00.000] First segment of speech...

[00:00:04.123] Next segment of speech...

[00:00:08.456] And so on...
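
A layout like the one above can be produced from Whisper-style segments (dicts with a `start` time in seconds and a `text` field). This is a minimal sketch with hypothetical helper names, not the script's own formatter.

```python
def format_timestamp(seconds):
    """Format seconds as [HH:MM:SS.mmm]."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}]"


def render_transcript(title, segments):
    """Render the title followed by one timestamped line per segment."""
    lines = [title, ""]
    for seg in segments:
        lines.append(f"{format_timestamp(seg['start'])} {seg['text'].strip()}")
        lines.append("")  # blank line between segments
    return "\n".join(lines)
```

Working in integer milliseconds avoids floating-point drift when splitting the value into hours, minutes, and seconds.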

License

This project is licensed under the MIT License.
