[New Pipeline]: Audio-Journey: Visual+LLM-aided Audio Encodec Diffusion

### Model/Pipeline/Scheduler description

We efficiently trained an audio diffusion model with the aid of Alpaca-augmented audio captions based on AudioSet labels.

- [Website](https://audiojourney.github.io/)
- [Preprint](https://github.com/audiojourney/audiojourney.github.io/blob/main/neurIPS_2023_v1.2.pdf)
- [Appendix](https://github.com/audiojourney/audiojourney.github.io/blob/main/neurIPS_2023_appendix_v1.3.pdf)
- [Implementation](https://github.com/jacksonmichaels/diffusers_with_dataloader)

Weights will be released soon!
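
To make the caption-augmentation idea concrete, here is a minimal sketch of how a set of AudioSet-style labels might be turned into an instruction prompt for an LLM such as Alpaca. The helper name `build_caption_prompt` and the prompt wording are assumptions made for illustration; they are not taken from the Audio-Journey code or paper, and the actual LLM call is left out.

```python
from typing import Sequence


def build_caption_prompt(labels: Sequence[str]) -> str:
    """Format AudioSet-style labels into an instruction prompt for an LLM.

    Hypothetical helper: the prompt wording is an assumption, not the
    template used by the Audio-Journey authors.
    """
    # Drop empty entries and surrounding whitespace so the prompt stays clean.
    cleaned = [label.strip() for label in labels if label.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty label is required")
    label_list = ", ".join(cleaned)
    return (
        "Write one short, natural sentence describing an audio clip "
        f"that contains the following sounds: {label_list}."
    )


if __name__ == "__main__":
    # The resulting string would be sent to an instruction-following LLM,
    # and its reply used as the text condition for the diffusion model.
    print(build_caption_prompt(["Dog", "Bark", "Rain"]))
```

The prompt is built as a plain string so that it can be passed to whichever instruction-following model is available.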

### Open source status

- [X] The model implementation is available
- [ ] The model weights are available (Only relevant if addition is not a scheduler).

### Provide useful links for the implementation
@jacksonmichaels

_No response_

[New Pipeline]: Audio-Journey: Visual+LLM-aided Audio Encodec Diffusion #3826
