Retrain the models with more data (from festcat) #8

@gullabi

The training data of Catotron comes from festcat, but not all of the data has been used: the very long segments in the festcat data were simply omitted. This might be causing the following problems:

  • Occasional attention failures, which leave segments unsynthesized or only partly synthesized.
  • The lack of prosodic difference between questions and declarative sentences.

With some smart parsing, it should be possible to add approximately 4 hours of data per speaker (up to 10 hours per speaker).
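As a rough illustration of what such parsing could look like on the text side, here is a minimal sketch that splits a long transcript at sentence-final punctuation and regroups the sentences into shorter chunks. The function name `split_long_transcript` and the `max_chars` threshold are hypothetical and not part of Catotron or festcat, and the matching audio would still have to be cut at the same boundaries (for example with forced alignment), which is not shown here.

```python
import re


def split_long_transcript(text, max_chars=150):
    """Split a transcript into chunks of whole sentences.

    Hypothetical helper: chunks stay under max_chars where possible.
    A single sentence longer than max_chars is kept as one chunk.
    """
    # Split on whitespace that follows ., ? or !, keeping the
    # punctuation attached to the sentence it ends.
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    example = "Com estàs? Estic bé. Avui fa sol!"
    for chunk in split_long_transcript(example, max_chars=25):
        print(chunk)
```

Splitting only at sentence-final punctuation keeps question marks at the end of their own sentences, so questions in the long segments are not cut mid-sentence.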

This task was already mentioned in the large roadmap issue; I am opening this specific issue to track its progress.
