Clipper

Clipper is a Python-based tool that extracts video clips from .mp4 files based on keywords found in the corresponding .srt subtitle files. It handles complex filenames containing emojis and special characters on both Windows and Linux, and generates clips with embedded subtitles at 720p (configurable) resolution.

Features

Keyword-Based Clipping: Extracts segments where specified keywords appear in subtitles.
NLP-Based Speech Category Clipping: Uses NLP to find and clip one or more specific categories of speech in subtitles.
Multi-Keyword Filenames: Output filenames include all unique keywords in a clip’s subtitle range (e.g., big_copper).
Special Character Support: Safely processes filenames with emojis (e.g., 🔱🐈) and long formats.
Subtitle Embedding: Embeds adjusted subtitles into clips using FFmpeg.
Configurable Buffers: Adds pre- and post-buffers (default 5s) around matched subtitles.
Resolution Control: Scales clips to 720p, preserving aspect ratio.
Parallel Processing: Supports multi-threaded processing via a configurable thread pool.
Flexible Configuration: Uses a JSON config file for settings.

TODO

Add S2T (Speech-to-Text) subtitle creation for videos without subtitles
Add output resolution control to config file
Add output content control flags to config file: video, audio, subtitles, metadata
Add support for multiple subtitle formats
Add support for multiple video formats
Add support for multiple audio formats

Requirements

Python 3.6+
FFmpeg (installed and accessible in your PATH)
Required Python packages (install via: pip install -r requirements.txt):
- srt
- tqdm

Installation

Clone the Repository:

git clone https://github.com/mattladewig/clipper.git
cd clipper

Set Up a Virtual Environment (optional but recommended):

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install Dependencies:
```
pip install -r requirements.txt
```
Install FFmpeg:
- On Ubuntu: sudo apt install ffmpeg
- On macOS: brew install ffmpeg
- On Windows: Download from FFmpeg’s site and add to PATH.
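
To confirm that FFmpeg is reachable before running Clipper, a quick check along these lines may help. This is a minimal sketch and not part of the Clipper code base; the function name find_ffmpeg is hypothetical.

```python
import shutil


def find_ffmpeg():
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    # shutil.which applies the same PATH lookup the shell would use.
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg was not found on PATH; install it before running Clipper.")
    else:
        print("Found ffmpeg at " + path)
```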

Usage

Prepare Your Files:
- Place .mp4 videos and their corresponding .srt subtitle files (named identically, e.g., 1.mp4 and 1.srt) in the videos/ directory (or configure a custom directory in config.json).
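
The pairing rule above (same base name, different extension) can be sketched as follows. This is an illustration only: the helper name find_video_subtitle_pairs is hypothetical, and skipping videos that lack a subtitle file is an assumption about the tool's behaviour.

```python
from pathlib import Path


def find_video_subtitle_pairs(directory):
    """Return (video, subtitle) path pairs for every .mp4 with a same-named .srt."""
    pairs = []
    for video in sorted(Path(directory).glob("*.mp4")):
        # 1.mp4 -> 1.srt; only the final suffix is swapped.
        subtitle = video.with_suffix(".srt")
        if subtitle.is_file():
            pairs.append((video, subtitle))
        # Assumption: a video without a matching .srt is skipped.
    return pairs
```

Calling find_video_subtitle_pairs("videos") on a directory holding 1.mp4, 1.srt, and 2.mp4 would return only the pair for video 1.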

Configure the Tool:

Edit config.json to specify keywords and settings (see Configuration below).

Example config.json:

{
  "directory": "videos",
  "output_dir": "clips",
  "keywords": ["big", "copper"],
  "word_alt_map": {},
  "pre_buffer": 5.0,
  "post_buffer": 5.0,
  "max_workers": 2,
  "use_subdirs": false,
  "logging": "INFO",
  "speech_categories": []
}

Run the Script:
```
python clipper.py --config config.json
```
- Add --verbose for debug-level logging: python clipper.py --config config.json --verbose.

Output:
- Clips are saved in the clips/ directory (or as configured), named like videoID-keywords_clipNumber_startTime.mp4 (e.g., 2-big_copper_001_111.mp4).
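
The naming pattern can be illustrated with a small helper. This is a sketch under stated assumptions rather than Clipper's actual code: the function name build_clip_filename is hypothetical, and the alphabetical keyword order, three-digit clip number, and whole-second start time are inferred from the example 2-big_copper_001_111.mp4.

```python
def build_clip_filename(video_id, keywords, clip_number, start_seconds):
    """Build a name like 2-big_copper_001_111.mp4 from the clip's metadata."""
    # Assumption: unique keywords are joined in alphabetical order.
    keyword_part = "_".join(sorted(set(keywords)))
    # Assumption: clip number is zero-padded to 3 digits, start time is truncated to seconds.
    return f"{video_id}-{keyword_part}_{clip_number:03d}_{int(start_seconds)}.mp4"


print(build_clip_filename("2", ["copper", "big", "big"], 1, 111.4))
# prints: 2-big_copper_001_111.mp4
```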

Configuration

The config.json file supports the following options:

directory: Input directory for .mp4 and .srt files (default: "videos").
output_dir: Output directory for clipped videos (default: "clips").
keywords: List of keywords to search for in subtitles (e.g., ["big", "copper"]).
word_alt_map: Dictionary of keyword aliases (e.g., {"big": ["large", "huge"]}). Optional.
pre_buffer: Seconds added before each matched subtitle (default: 5.0).
post_buffer: Seconds added after each matched subtitle (default: 5.0).
max_workers: Number of threads for parallel processing (default: 1).
use_subdirs: If true, creates subdirectories per video ID in the output directory (default: false).
logging: Logging level ("DEBUG", "INFO", "WARNING", "ERROR", default: "INFO").
speech_categories: List of speech categories to search for in subtitles (e.g., ["narration", "dialogue"]). Optional; enabling it slows processing.
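
One way to picture how these options and their defaults combine is the loader below. It is a hedged sketch, not the contents of the repository's config.py: the names DEFAULTS and load_config are hypothetical, the empty-list defaults for keywords and speech_categories are assumptions, and rejecting unknown keys is a design choice made here for illustration.

```python
import json

# Defaults as documented above; empty lists for keywords and
# speech_categories are assumptions, since no default is stated.
DEFAULTS = {
    "directory": "videos",
    "output_dir": "clips",
    "keywords": [],
    "word_alt_map": {},
    "pre_buffer": 5.0,
    "post_buffer": 5.0,
    "max_workers": 1,
    "use_subdirs": False,
    "logging": "INFO",
    "speech_categories": [],
}

VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


def load_config(path):
    """Read a JSON config file and fill in any missing options with defaults."""
    with open(path, encoding="utf-8") as handle:
        user_settings = json.load(handle)

    unknown = set(user_settings) - set(DEFAULTS)
    if unknown:
        raise ValueError("Unknown config options: " + ", ".join(sorted(unknown)))

    config = dict(DEFAULTS)
    config.update(user_settings)

    if config["logging"] not in VALID_LOG_LEVELS:
        raise ValueError("Invalid logging level: " + str(config["logging"]))
    return config
```

Note that json.load rejects trailing commas, so a stray comma after the last option in config.json raises an error at load time.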

Example Output

Given:

videos/1.mp4 and videos/1.srt
videos/2.mp4 and videos/2.srt
Keywords: ["big", "copper"]

Running python clipper.py --config config.json might produce:

clips/1-big_001_11.mp4
clips/1-big_002_46.mp4
clips/1-big_003_156.mp4
clips/1-big_004_185.mp4
clips/1-copper_005_333.mp4
clips/2-big_copper_001_111.mp4

Each clip:

Is scaled to 720p.
Contains embedded subtitles adjusted to the clip’s timeline.
Has a filename reflecting all keywords in the clip’s subtitle range.
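
The sketch below illustrates what "adjusted to the clip's timeline" and "scaled to 720p" could look like in practice. It is not taken from Clipper's source: shift_subtitles and build_ffmpeg_command are hypothetical names, cues are represented as plain (start, end, text) tuples rather than srt objects, and muxing the subtitles as a mov_text track (instead of burning them into the picture) is an assumption.

```python
from datetime import timedelta


def shift_subtitles(subtitles, clip_start, clip_end):
    """Re-time (start, end, text) cues so that the clip begins at 0:00."""
    shifted = []
    for start, end, text in subtitles:
        if end <= clip_start or start >= clip_end:
            continue  # cue lies entirely outside the clip
        # Trim cues that straddle a clip boundary, then move them to clip time.
        new_start = max(start, clip_start) - clip_start
        new_end = min(end, clip_end) - clip_start
        shifted.append((new_start, new_end, text))
    return shifted


def build_ffmpeg_command(video, subtitle_file, output, clip_start, clip_end, height=720):
    """Return an ffmpeg argument list that cuts, scales, and muxes subtitles."""
    duration = (clip_end - clip_start).total_seconds()
    return [
        "ffmpeg", "-y",
        "-ss", f"{clip_start.total_seconds():.3f}",  # seek in the video input
        "-i", str(video),
        "-i", str(subtitle_file),                    # already shifted to clip time
        "-t", f"{duration:.3f}",                     # limit the output duration
        "-map", "0:v:0", "-map", "0:a:0?", "-map", "1:s:0",
        "-vf", f"scale=-2:{height}",                 # keep aspect ratio, even width
        "-c:v", "libx264", "-c:a", "aac",
        "-c:s", "mov_text",                          # assumption: soft subtitles in MP4
        str(output),
    ]
```

The returned list can be passed to subprocess.run; building it as a list rather than a shell string avoids quoting problems with emoji or spaces in filenames.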

How It Works

Subtitle Parsing: Loads .srt files and searches for keywords (case-insensitive).
Range Merging: Pads each matched subtitle with the pre- and post-buffers, then merges ranges that overlap.
Clip Extraction: Uses FFmpeg to cut video segments and embed subtitles.
Naming: Generates filenames based on all keywords found in the clip’s full subtitle range (clip_subs_filtered).
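
The matching and merging steps above can be sketched in a few lines. This is an illustrative approximation rather than the code in subtitle_parser.py: find_matches and merge_ranges are hypothetical names, cues are plain (start, end, text) tuples, and whole-word matching is an assumption (the source only states that matching is case-insensitive).

```python
import re
from datetime import timedelta


def find_matches(subtitles, keywords, word_alt_map=None):
    """Return (start, end, matched_keywords) for each cue containing a keyword."""
    word_alt_map = word_alt_map or {}
    matches = []
    for start, end, text in subtitles:
        words = set(re.findall(r"\w+", text.lower()))
        found = set()
        for keyword in keywords:
            # A keyword also matches through any of its configured aliases.
            candidates = [keyword] + list(word_alt_map.get(keyword, []))
            if any(candidate.lower() in words for candidate in candidates):
                found.add(keyword)
        if found:
            matches.append((start, end, found))
    return matches


def merge_ranges(matches, pre_buffer=5.0, post_buffer=5.0):
    """Pad each match with the buffers and merge ranges that overlap."""
    pre = timedelta(seconds=pre_buffer)
    post = timedelta(seconds=post_buffer)
    merged = []
    for start, end, found in sorted(matches, key=lambda match: match[0]):
        start = max(start - pre, timedelta(0))  # never seek before the video start
        end = end + post
        if merged and start <= merged[-1][1]:
            prev_start, prev_end, prev_found = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_found | found)
        else:
            merged.append((start, end, set(found)))
    return merged
```

With 5-second buffers, a "big" cue at 10 to 12 s and a "copper" cue at 15 to 18 s merge into a single 5 to 23 s range carrying both keywords, which is how a name such as big_copper can arise.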

Controls

q: Quits processing and empties the tmp/ directory.
p: Pauses processing and logs "Pausing processing...".
r: Resumes processing and logs "Resuming processing...".
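
A common way to implement pause, resume, and quit across worker threads is a pair of threading.Event objects, sketched below. This is an assumption about the general approach, not a description of Clipper's internals: the class name ProcessingControls is hypothetical, and reading keys from the terminal and emptying tmp/ are left out because they are platform-specific.

```python
import logging
import threading

log = logging.getLogger("clipper")


class ProcessingControls:
    """Share pause/resume/quit state between a key listener and worker threads."""

    def __init__(self):
        self._running = threading.Event()
        self._running.set()  # start in the running (not paused) state
        self._quit = threading.Event()

    @property
    def is_paused(self):
        return not self._running.is_set()

    def handle_key(self, key):
        if key == "p":
            self._running.clear()
            log.info("Pausing processing...")
        elif key == "r":
            self._running.set()
            log.info("Resuming processing...")
        elif key == "q":
            self._quit.set()
            self._running.set()  # wake paused workers so they can exit

    def wait_if_paused(self):
        """Block while paused; return False once quit has been requested."""
        self._running.wait()
        return not self._quit.is_set()
```

A worker would call wait_if_paused() before starting each clip and stop when it returns False.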

Debugging

Use --verbose to see detailed logs, or set logging to "DEBUG" in the config file.
```
python clipper.py --config config.json --verbose
```
Logs include search targets, clip ranges, subtitle contents, and FFmpeg commands.
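
The interaction between --verbose and the logging option might be resolved as in the sketch below. The flag names come from the usage shown above; everything else is an assumption made for illustration, including the function names, the default value for --config, and the rule that --verbose takes precedence over the config file.

```python
import argparse
import logging


def parse_args(argv=None):
    """Parse the two command-line flags shown in this README."""
    parser = argparse.ArgumentParser(description="Clip videos by subtitle keyword.")
    # Assumption: config.json is used when --config is omitted.
    parser.add_argument("--config", default="config.json", help="path to the JSON config file")
    parser.add_argument("--verbose", action="store_true", help="enable debug-level logging")
    return parser.parse_args(argv)


def resolve_log_level(config_level, verbose):
    """Return a logging level; --verbose is assumed to override the config file."""
    if verbose:
        return logging.DEBUG
    level = getattr(logging, str(config_level).upper(), None)
    if not isinstance(level, int):
        raise ValueError("Invalid logging level: " + str(config_level))
    return level


if __name__ == "__main__":
    args = parse_args()
    logging.basicConfig(level=resolve_log_level("INFO", args.verbose))
    logging.getLogger("clipper").debug("Debug logging is enabled.")
```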

Contributing

Feel free to fork the repository, submit issues, or create pull requests on GitHub.

License

This project is open-source under the MIT License.

Acknowledgments

Built with Python, FFmpeg, Hugging Face Transformers, and the srt library.
Thanks to contributors and users for feedback!
