This document describes the data processing pipeline in SoccerTrack-V2, from raw input data to final ground truth files.
The complete data processing pipeline consists of these steps:
- Video Preprocessing
  - Trim videos into first and second halves
  - Extract frame timing information
- Coordinate Processing
  - Convert raw XML tracking data to pitch coordinates
  - Project pitch coordinates to image plane
  - Generate both calibrated and distorted coordinates
- Camera Calibration
  - Generate calibration mappings from keypoints
  - Apply calibration to videos
  - Create calibrated panorama videos
- Object Detection
  - Run YOLOv8 detection on videos
  - Process both calibrated and distorted versions
  - Generate detection files in MOT format
- Ground Truth Creation
  - Convert coordinates to bounding boxes
  - Generate visualization plots
  - Create final ground truth MOT files
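
To make the "project pitch coordinates to image plane" step concrete, here is a minimal sketch of applying a 3x3 homography to pitch points. It is pure Python for illustration only; the matrix `H_EXAMPLE` and the function name `project_points` are hypothetical, the direction of the stored homography (pitch to image) is an assumption, and lens distortion is not modelled.

```python
from typing import List, Sequence, Tuple

Matrix3x3 = Sequence[Sequence[float]]


def project_points(
    H: Matrix3x3, points: Sequence[Tuple[float, float]]
) -> List[Tuple[float, float]]:
    """Map (x, y) pitch coordinates to (u, v) image coordinates."""
    projected = []
    for x, y in points:
        # Homogeneous multiplication: [u', v', w]^T = H @ [x, y, 1]^T
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if w == 0:
            raise ValueError(f"point ({x}, {y}) maps to infinity")
        # Divide by w to return from homogeneous to pixel coordinates.
        projected.append((u / w, v / w))
    return projected


# Hypothetical homography: 10 pixels per metre, shifted by (100, 50) pixels.
H_EXAMPLE = [
    [10.0, 0.0, 100.0],
    [0.0, 10.0, 50.0],
    [0.0, 0.0, 1.0],
]

image_points = project_points(H_EXAMPLE, [(0.0, 0.0), (52.5, 34.0)])
print(image_points)
```

A real homography estimated from keypoints would have a non-trivial bottom row, which is why the division by `w` matters.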
Process an entire match with a single command:

    ./scripts/create_ground_truth.sh <match_id>

Example:

    ./scripts/create_ground_truth.sh 117093

Process multiple matches:

    ./scripts/create_ground_truth.sh 117093 117094 117095

If you need to run specific steps:
- Video Preprocessing:

      ./scripts/trim_video_into_halves.sh 117093

- Coordinate Processing:

      # Convert XML to pitch plane
      ./scripts/convert_raw_to_pitch_plane.sh 117093
      # Project to image plane
      ./scripts/convert_pitch_plane_to_image_plane.sh 117093

- Camera Calibration:

      # Generate mappings
      ./scripts/calibration/generate_calibration_mappings.sh 117093
      # Apply calibration
      ./scripts/calibration/calibrate_camera.sh 117093

- Object Detection:

      ./scripts/generate_detections.sh 117093

- Ground Truth Creation:

      ./scripts/convert_coordinates_to_bboxes.sh 117093

- Visualization:

      # Generate videos
      ./scripts/plot_coordinates_on_video.sh 117093
      # Or just first frames
      ./scripts/plot_coordinates_on_video.sh 117093 --first-frame-only
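
If you run the individual steps often, a small driver can chain them and stop at the first failure. The sketch below is not part of the repository: `build_commands` and `run_steps` are hypothetical helpers, and the step order simply mirrors the list above, which is assumed (not confirmed) to be a valid execution order. The optional visualization step is left out.

```python
import subprocess
from typing import List

# Script paths copied from the step list; order follows that list.
STEP_SCRIPTS = [
    "./scripts/trim_video_into_halves.sh",
    "./scripts/convert_raw_to_pitch_plane.sh",
    "./scripts/convert_pitch_plane_to_image_plane.sh",
    "./scripts/calibration/generate_calibration_mappings.sh",
    "./scripts/calibration/calibrate_camera.sh",
    "./scripts/generate_detections.sh",
    "./scripts/convert_coordinates_to_bboxes.sh",
]


def build_commands(match_id: str, start_at: int = 0) -> List[List[str]]:
    """Return one argv list per step, optionally skipping earlier steps."""
    return [[script, match_id] for script in STEP_SCRIPTS[start_at:]]


def run_steps(match_id: str, start_at: int = 0) -> None:
    """Run the steps in order; check=True raises on the first failing script."""
    for offset, command in enumerate(build_commands(match_id, start_at)):
        print(f"[step {start_at + offset}] {' '.join(command)}")
        subprocess.run(command, check=True)


# Example (run from the repository root):
#   run_steps("117093")              # all steps
#   run_steps("117093", start_at=5)  # resume at object detection
```

Passing `start_at` is a simple way to resume after a failed step without repeating the earlier ones.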

The pipeline expects the following input files:

- Raw Data Files:

      data/raw/<match_id>/
      ├── <match_id>_panorama.mp4              # Full match video
      ├── <match_id>_tracker_box_data.xml      # Raw tracking data
      ├── <match_id>_tracker_box_metadata.xml  # Metadata
      └── <match_id>_padding_info.csv          # Frame timing info

- Configuration Files:

      configs/
      ├── default_config.yaml         # Main configuration
      ├── tracker_config.yaml         # YOLOv8 tracker settings
      └── video_trimming_config.yaml  # Video processing settings

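
Before starting a long run, it can help to confirm that every input listed above is present. The following is a minimal sketch of such a check; `missing_inputs` is a hypothetical helper, not a function shipped with the pipeline, and it only tests for existence, not for valid content.

```python
from pathlib import Path
from typing import List

# File name suffixes and config names taken from the input listing.
RAW_SUFFIXES = [
    "_panorama.mp4",
    "_tracker_box_data.xml",
    "_tracker_box_metadata.xml",
    "_padding_info.csv",
]
CONFIG_FILES = [
    "default_config.yaml",
    "tracker_config.yaml",
    "video_trimming_config.yaml",
]


def missing_inputs(match_id: str, root: Path = Path(".")) -> List[Path]:
    """Return the expected input paths that do not exist under root."""
    raw_dir = root / "data" / "raw" / match_id
    expected = [raw_dir / f"{match_id}{suffix}" for suffix in RAW_SUFFIXES]
    expected += [root / "configs" / name for name in CONFIG_FILES]
    return [path for path in expected if not path.is_file()]


# Example (run from the repository root):
#   for path in missing_inputs("117093"):
#       print(f"missing: {path}")
```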
The pipeline generates files in data/interim/<match_id>/:
- Videos:
  - <match_id>_panorama_[1st/2nd]_half.mp4
  - <match_id>_calibrated_panorama_[1st/2nd]_half.mp4
- Coordinates:
  - <match_id>_pitch_plane_coordinates_[1st/2nd]_half.csv
  - <match_id>_image_plane_coordinates_[1st/2nd]_half_[calibrated/distorted].csv
- Calibration:
  - <match_id>_camera_intrinsics.npz
  - <match_id>_homography.npy
- Detections:
  - <match_id>_detections_[1st/2nd]_half_[calibrated/distorted].csv
- Ground Truth:
  - <match_id>_ground_truth_mot_[1st/2nd]_half_[calibrated/distorted].csv
- Analysis:
  - <match_id>_[width/height]_correlation.png
  - <match_id>_[width/height]_regression.png
  - <match_id>_bbox_models.joblib
- Visualizations:
  - <match_id>_plot_coordinates_[1st/2nd]_half_[calibrated/distorted].[jpg/mp4]
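
The bracketed parts of these names, such as `[1st/2nd]` and `[calibrated/distorted]`, are read here as alternatives that each produce a separate file. Under that assumption, the sketch below expands the patterns into concrete file names for one match. `expected_outputs` is a hypothetical helper, and the analysis and visualization files are deliberately omitted.

```python
from itertools import product
from typing import List

HALVES = ("1st", "2nd")
VARIANTS = ("calibrated", "distorted")


def expected_outputs(match_id: str) -> List[str]:
    """Expand the naming patterns for videos, coordinates, calibration,
    detections, and ground truth into concrete file names."""
    names = []
    for half in HALVES:
        names.append(f"{match_id}_panorama_{half}_half.mp4")
        names.append(f"{match_id}_calibrated_panorama_{half}_half.mp4")
        names.append(f"{match_id}_pitch_plane_coordinates_{half}_half.csv")
    for half, variant in product(HALVES, VARIANTS):
        names.append(
            f"{match_id}_image_plane_coordinates_{half}_half_{variant}.csv"
        )
        names.append(f"{match_id}_detections_{half}_half_{variant}.csv")
        names.append(f"{match_id}_ground_truth_mot_{half}_half_{variant}.csv")
    names.append(f"{match_id}_camera_intrinsics.npz")
    names.append(f"{match_id}_homography.npy")
    return names


for name in expected_outputs("117093"):
    print(name)
```

Comparing this list against the contents of data/interim/<match_id>/ gives a quick completeness check after a run.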
Key configuration files and their purposes:
- configs/default_config.yaml:
  - Main configuration file
  - Contains settings for all pipeline components
  - Can be overridden via command line
- configs/tracker_config.yaml:
  - YOLOv8 tracking parameters
  - Detection thresholds
  - Visualization settings
- configs/video_trimming_config.yaml:
  - Video processing settings
  - Frame padding configuration
  - Output video settings
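
This document does not say how command-line overrides are parsed. Purely as an illustration of the idea, the sketch below applies `key.subkey=value` strings on top of a nested configuration dictionary. The override syntax, the helper `apply_overrides`, and the key names in the usage comment are all assumptions, not the pipeline's actual interface.

```python
import copy
from typing import Any, Dict, Iterable


def apply_overrides(
    config: Dict[str, Any], overrides: Iterable[str]
) -> Dict[str, Any]:
    """Return a copy of config with 'a.b.c=value' overrides applied.

    Values are kept as strings; a real implementation would also
    convert them to the types used in the YAML file.
    """
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, separator, raw_value = item.partition("=")
        if not separator:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            # Create intermediate sections that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = raw_value
    return result


# Example with hypothetical keys:
#   base = {"detection": {"conf_threshold": "0.5"}}
#   apply_overrides(base, ["detection.conf_threshold=0.25"])
```

Returning a copy rather than mutating the input keeps the defaults loaded from the YAML file intact for later comparison.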
The pipeline includes comprehensive error checking:
- Input Validation:
  - Checks for required files
  - Validates file formats
  - Verifies data consistency
- Process Monitoring:
  - Tracks progress of each step
  - Provides detailed error messages
  - Allows for process resumption
- Output Verification:
  - Validates generated files
  - Checks file integrity
  - Ensures complete output set
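
As an example of what validating a generated file might involve, the sketch below runs basic sanity checks on MOT-style rows. It assumes the standard MOTChallenge layout of ten comma-separated numeric fields per row (frame, id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z) with no header line; the files produced by this pipeline may differ. `find_invalid_mot_rows` is a hypothetical helper.

```python
import csv
import io
from typing import List

MOT_FIELD_COUNT = 10  # frame, id, left, top, width, height, conf, x, y, z


def find_invalid_mot_rows(text: str) -> List[int]:
    """Return 1-based row numbers that fail basic MOT sanity checks."""
    invalid = []
    for row_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # ignore blank lines
        try:
            values = [float(value) for value in row]
        except ValueError:
            invalid.append(row_number)  # non-numeric field
            continue
        if len(values) != MOT_FIELD_COUNT:
            invalid.append(row_number)  # wrong number of fields
            continue
        width, height = values[4], values[5]
        if width <= 0 or height <= 0:
            invalid.append(row_number)  # degenerate bounding box
    return invalid


sample = (
    "1,7,100.0,200.0,30.0,60.0,1,-1,-1,-1\n"
    "1,8,150.0,210.0,0.0,55.0,1,-1,-1,-1\n"
    "2,7,101.0\n"
)
print(find_invalid_mot_rows(sample))  # → [2, 3]
```

In the sample, row 2 has a zero-width box and row 3 is truncated, so both are reported.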

Recommended practices:

- Data Organization:
  - Keep raw data in data/raw/
  - Store intermediate files in data/interim/
  - Use consistent naming conventions
- Configuration Management:
  - Document config changes
  - Use version control for configs
  - Test config modifications
- Error Recovery:
  - Save intermediate results
  - Use error logs for debugging
  - Maintain backup copies

Common issues and solutions:
- Missing Files:
  - Check that the raw data is present
  - Verify file permissions
  - Ensure paths are correct
- Process Failures:
  - Check error messages
  - Verify configurations
  - Ensure dependencies are installed
- Quality Issues:
  - Validate input data
  - Check calibration quality
  - Verify detection settings