Add tutorial for parallel decoding #778


Merged
merged 5 commits into pytorch:main on Jul 15, 2025

Conversation

NicolasHug
Copy link
Member

No description provided.

@facebook-github-bot added the "CLA Signed" label (managed by the Meta Open Source bot) on Jul 13, 2025
# Frame sampling strategy
# -----------------------
#
# For this tutorial, we'll sample frames at a target rate of 2 FPS from our long
Contributor

This might be a me-thing, but I was tripped up by the "2 FPS from our long video" part. That is, I initially thought that meant we were targeting the decoding itself to happen at the observable speed of 2 FPS. This might be because I know our benchmarks report FPS as their metric. Rather, we mean that we want to sample 2 FPS in the reference frame of the video's time. Phrasing that I think would have helped me understand quicker:

For this tutorial, we'll sample a frame every 2 seconds from our long video.

Also, I just realized that in the existing text, "inference" is misspelled.

Member Author

@NicolasHug NicolasHug Jul 14, 2025

Fair, "fps" is overloaded and that confuses me too sometimes. I'll add comments to clarify the context.
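To make the sampling semantics concrete, here is a small sketch in the video's own timeline; the numbers (a 60 fps, 10-minute video) are made up for illustration and are not from the tutorial:

```python
# Made-up numbers for illustration: a 60 fps video that is 10 minutes long.
video_fps = 60    # frame rate of the *video's* timeline, not decoding throughput
duration_s = 600
target_fps = 2    # we want 2 sampled frames per second of video time

step = video_fps // target_fps                       # keep every 30th frame
all_indices = list(range(0, video_fps * duration_s, step))

print(step, len(all_indices))  # → 30 1200
```

So "2 FPS" here means 1200 frames sampled across the 10-minute video, regardless of how fast the decoder itself runs.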

#
# Process-based parallelism distributes work across multiple Python processes.

def decode_with_multiprocessing(indices: List[int], num_processes: int, video_path=long_video_path):
Contributor

Let's put each of these parameters on a separate line - on my system, the rendering for this line wraps.

"""Decode frames using multiple processes with joblib."""
chunks = split_indices(indices, num_chunks=num_processes)

results = Parallel(n_jobs=num_processes, backend="loky", verbose=0)(
Contributor

What's the "loky" backend?

Member Author

It's a multi-processing backend for joblib: https://github.com/joblib/loky

I added a comment. In general, I don't want to go too deep into the details of joblib in this tutorial because, as I mentioned at the top, the concepts covered here are joblib-agnostic. The reader should just trust that the decode_with* functions are doing the right thing.
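For readers following along: the split_indices helper is called in the diff above but its body isn't shown there. A plausible sketch (hypothetical, not the tutorial's actual implementation) splits the frame indices into contiguous chunks, one per worker, so each worker seeks forward through its own region of the video:

```python
from typing import List

def split_indices(indices: List[int], num_chunks: int) -> List[List[int]]:
    """Split frame indices into contiguous chunks, one per worker.

    Hypothetical sketch: contiguous chunks are preferable for decoding,
    since each worker then only seeks forward within its own region.
    """
    chunk_size = -(-len(indices) // num_chunks)  # ceiling division
    return [indices[i:i + chunk_size] for i in range(0, len(indices), chunk_size)]

print(split_indices(list(range(10)), num_chunks=3))
# → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```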

# Thread-based parallelism uses multiple threads within a single process.
# TorchCodec releases the GIL, so this can be very effective.

def decode_with_multithreading(indices: List[int], num_threads: int, video_path=long_video_path):
Contributor

Same as above - let's use line breaks for parameters.

"""Decode frames using multiple threads with joblib."""
chunks = split_indices(indices, num_chunks=num_threads)

results = Parallel(n_jobs=num_threads, prefer="threads", verbose=0)(
Contributor

Does "prefer" mean that in some situations it might end up using processes? Does it default to processes, since we didn't say this above?

Member Author

It basically means "use threads, unless this gets overridden by something with higher priority, like a context manager". There is more about this in the docstring for "backend": https://joblib.readthedocs.io/en/latest/generated/joblib.Parallel.html
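As a joblib-free illustration of the same thread-based pattern (TorchCodec releases the GIL during decoding, so threads can genuinely overlap), here is a standard-library sketch; decode_chunk is a hypothetical stand-in for the real per-chunk decoding work:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

def decode_chunk(chunk: List[int]) -> List[str]:
    # Hypothetical stand-in: a real version would call something like
    # VideoDecoder.get_frames_at(chunk), which releases the GIL in C++.
    return [f"frame_{i}" for i in chunk]

chunks = [[0, 1], [2, 3], [4]]
with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
    results = list(pool.map(decode_chunk, chunks))

# Flatten the per-chunk results back into a single ordered list.
frames = [f for r in results for f in r]
print(frames)  # → ['frame_0', 'frame_1', 'frame_2', 'frame_3', 'frame_4']
```

pool.map preserves chunk order, so the flattened result matches the original index order.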

Contributor

@scotts scotts left a comment

Excellent tutorial! I think this will be extremely helpful for our users!

times, result_ffmpeg = bench(decode_with_ffmpeg_parallelism, all_indices, num_threads=NUM_CPUS)
ffmpeg_time = report_stats(times, unit="s")
speedup = sequential_time / ffmpeg_time
print(f"Speedup vs sequential: {speedup:.2f}x with {NUM_CPUS} FFmpeg threads.")
Contributor

nit: Perhaps for additional clarity, we could write out the comparison instead of using "vs":

print(f"Speedup compared to sequential: {speedup:.2f}x using {NUM_CPUS} FFmpeg threads.")
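For context, the speedup printed above is just sequential wall time divided by parallel wall time. A minimal sketch with made-up timings, assuming (hypothetically) that report_stats returns the median runtime:

```python
from statistics import median

def report_stats(times, unit="s"):
    # Hypothetical sketch of the tutorial's helper: print and return
    # the median of the measured runtimes.
    med = median(times)
    print(f"median = {med:.2f}{unit}")
    return med

sequential_time = 8.0                         # made-up numbers for illustration
ffmpeg_time = report_stats([2.1, 2.0, 1.9])   # median is 2.0
speedup = sequential_time / ffmpeg_time
print(f"Speedup compared to sequential: {speedup:.2f}x")  # → 4.00x
```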

@NicolasHug NicolasHug merged commit b5995d6 into pytorch:main Jul 15, 2025
44 checks passed
Labels
CLA Signed This label is managed by the Meta Open Source bot.
4 participants