Add F5 TTS pipeline #11958

ayushtues · 2025-07-19T09:46:08Z

What does this PR do?

Add F5 TTS #10043

ayushtues · 2025-07-19T15:42:45Z

Okay, I now have all the required code in two files and have swapped in existing diffusers primitives in the obvious places. Next, I will work on integrating it into the diffusers class structure.

src/diffusers/models/transformers/f5tts_transformer.py

ayushtues · 2025-07-21T16:32:32Z

Attention!

It seems we can use the diffusers Attention class directly, but we need to add a new processor to support applying RoPE embeddings to only selected heads, as F5 does.
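To make the "RoPE on selected heads" idea concrete, here is a minimal pure-Python sketch. It is not the diffusers attention processor API and not F5's actual code: the function names, the adjacent-pair rotation convention, the base of 10000, and the choice to rotate the first `num_rope_heads` heads are all assumptions made for illustration.

```python
import math


def rotate_pairs(vec, position, base=10000.0):
    """Apply a rotary position embedding to one head vector.

    Assumes an even dimension and rotates adjacent pairs (x[k], x[k+1]).
    """
    dim = len(vec)
    out = list(vec)
    for k in range(0, dim, 2):
        # Standard RoPE frequency for the pair starting at index k.
        theta = position / (base ** (k / dim))
        c, s = math.cos(theta), math.sin(theta)
        x0, x1 = vec[k], vec[k + 1]
        out[k] = x0 * c - x1 * s
        out[k + 1] = x0 * s + x1 * c
    return out


def apply_rope_to_selected_heads(heads, position, num_rope_heads):
    """Rotate only the first `num_rope_heads` heads; copy the rest unchanged."""
    return [
        rotate_pairs(h, position) if i < num_rope_heads else list(h)
        for i, h in enumerate(heads)
    ]
```

A real processor would do the same thing on batched query and key tensors before the attention product. The point of the sketch is only that the remaining heads pass through without any positional rotation.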

ayushtues · 2025-07-26T09:32:06Z

Tokenization

F5 uses a character-level tokenizer for the text, so we might want to write a simple tokenizer class for it.

It might be fine to keep it as a simple function for now, since it's very straightforward.
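As a rough illustration of the "simple function" option, a character-level encoder could look like the sketch below. The names `build_vocab` and `encode_text` are hypothetical, and the choices of `0` for unknown characters and `-1` for padding are assumptions that should be checked against the reference F5 code.

```python
def build_vocab(chars):
    """Map each character to its index in the given sequence."""
    return {ch: idx for idx, ch in enumerate(chars)}


def encode_text(texts, vocab, unk_id=0, pad_id=-1):
    """Encode a batch of strings as equal-length lists of character ids.

    Characters missing from `vocab` map to `unk_id`; shorter sequences
    are right-padded with `pad_id`.
    """
    ids = [[vocab.get(ch, unk_id) for ch in text] for text in texts]
    max_len = max((len(seq) for seq in ids), default=0)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in ids]
```

For example, with `vocab = build_vocab(" abc")`, calling `encode_text(["ab", "c"], vocab)` pads the second sequence to the length of the first.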

ayushtues · 2025-07-29T02:46:30Z

Tests

The basic structure looks good now, so let's add some tests and then make it more diffusers-friendly! Adding tests will also force me to follow the structure more closely and help ensure that the code is not buggy.

ayushtues · 2025-07-29T02:51:55Z

Flow matching/Schedulers

We will also need to use one of the schedulers from diffusers. I think they use only a simple Euler method, but the sway sampling step needs to be accounted for somehow. It is just a change in the discretisation schedule, so it should be straightforward.
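To show why sway sampling only changes the discretisation, here is a small sketch that warps a uniform time grid. The warp `t + s * (cos(pi/2 * t) - 1 + t)` and the default coefficient of `-1` are written from my reading of the F5-TTS paper and should be verified against the reference implementation; `sway_timesteps` is a hypothetical helper name.

```python
import math


def sway_timesteps(num_steps, sway_coef=-1.0):
    """Return num_steps + 1 time points running from 0 to 1.

    A uniform grid is warped by t + s * (cos(pi/2 * t) - 1 + t).
    The endpoints stay fixed (up to rounding); a negative coefficient
    places more points near t = 0.
    """
    uniform = [i / num_steps for i in range(num_steps + 1)]
    return [
        t + sway_coef * (math.cos(math.pi / 2 * t) - 1 + t)
        for t in uniform
    ]
```

With `sway_coef=0.0` the grid is unchanged, so the Euler update itself needs no modification; only the list of time points handed to the scheduler differs.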

ayushtues · 2025-07-29T02:58:12Z

Future work

Support streaming (already there in the OG F5 repo), although this is really chunk-based inference. The current model is non-causal, so only chunk-based streaming makes sense anyway.
Triton server inference, which is also already there in the F5 repo.

ayushtues · 2025-08-03T15:14:07Z

Current status

Pipeline forward pass working
Checkpoint converted to hf format
Forward passes match between the OG F5 and the pipeline
Scheduler

To do

Tests

ayushtues · 2025-08-10T11:55:47Z

Got the same forward passes as the OG F5! Next up: writing some tests.

ayushtues · 2025-08-16T11:56:09Z

Scheduler done! FlowMatchEulerDiscreteScheduler is what we want to use, with slight modifications for sway sampling
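For intuition about what an Euler flow-matching step does on a non-uniform time grid, here is a plain-Python sketch. It does not use or reproduce the FlowMatchEulerDiscreteScheduler API; `euler_integrate` and `velocity_fn` are hypothetical names, and the time grid in the usage note is an arbitrary example.

```python
def euler_integrate(velocity_fn, x0, timesteps):
    """Integrate dx/dt = velocity_fn(x, t) with explicit Euler steps.

    `timesteps` is an increasing list of time points; step sizes are
    taken from consecutive differences, so the grid may be non-uniform.
    """
    x = list(x0)
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        dt = t_next - t_cur
        v = velocity_fn(x, t_cur)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

As a sanity check, a constant velocity field such as `lambda x, t: [2.0, -1.0]` integrated from `[0.0, 0.0]` over the grid `[0.0, 0.1, 0.4, 1.0]` lands on `[2.0, -1.0]` up to floating-point rounding, whatever the spacing of the grid.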

ayushtues · 2025-08-21T15:58:48Z

@asomoza I was writing some tests for this and was confused about why, in the common test _test_attention_slicing_forward_pass, the generator_device is set to cpu while torch_device can be anything. This is currently breaking things for me when my device is cuda, or mps on a Mac.

Ref: https://github.com/ayushtues/diffusers/blob/cde02b061b6f13012dfefe76bc8abf5e6ec6d3f3/tests/pipelines/test_pipelines_common.py#L1551

The same is true for some other tests that also set generator_device to cpu.

ayushtues · 2025-08-22T05:16:28Z

Also, any suggestions on how to add the character-level tokenisation of F5? It's just a simple character-to-index lookup, but I'm not sure whether to make a new tokeniser class for it or just save it as a dict and load it somehow.
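One possible shape for the "save it as a dict" option is a JSON round-trip, sketched below. The helper names and the `vocab.json` file name are hypothetical and do not reflect an existing diffusers convention.

```python
import json
import os
import tempfile


def save_vocab(vocab, path):
    """Write the character-to-index mapping as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(vocab, f, ensure_ascii=False, indent=2)


def load_vocab(path):
    """Read the character-to-index mapping back from JSON."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Round-trip a tiny example vocabulary through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocab.json")
    save_vocab({" ": 0, "a": 1, "b": 2}, vocab_path)
    restored = load_vocab(vocab_path)
```

JSON keeps string keys and integer values intact, and `ensure_ascii=False` leaves non-ASCII characters readable in the file. A dedicated tokeniser class could wrap the same two helpers later if needed.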

Add placeholder files

ef97fc7

ayushtues mentioned this pull request Jul 19, 2025

F5-TTS Integration #10043

Open

2 tasks

ayushtues added 2 commits July 19, 2025 18:33

Add first version of self-contained F5 DiT

0156dcb

Add F5 pipeline code in a single file

1450cf2

Use diffusers attention

c7ee594

Remove conditioning encoding from DiT

380963d

Integrate CFM into the pipeline class

ce4237f

ayushtues added 3 commits August 2, 2025 18:44

F5 pipeline definition working

f127e45

Make forwrad pass work

7fe21e6

Add ckpt conversion script

146df74

Make forward passes the same

16e15b0

ayushtues added 2 commits August 15, 2025 17:50

check inputs and prepare latents

3cde114

Add scheduler

22caf6a

Use diffusers GRN

e8fdac4

ayushtues marked this pull request as ready for review August 25, 2025 02:14
