Parse tags that span cues #1

Open
kmahelona opened this issue Sep 17, 2017 · 3 comments

@kmahelona
Contributor

We need to be able to parse tags that span cues. Judging by the tests, I don't believe this is implemented yet.

20
00:01:53.870 --> 00:01:59.164
He kōrero nui hoki tērā

21
00:01:59.164 --> 00:02:06.966
taua wāhi rā, nā ōku mātua, ā, <c.ingoatupuna>Ngārama Te Maru</c> me tana wahine <c.ingoatupuna>Matire

22
00:02:06.966 --> 00:02:12.353
Rapihana</c>, i tuku mai hei tūnga whare mō te kāinga

Here, <c.ingoatupuna>Matire Rapihana</c> spans cues 21 and 22.
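One possible way to handle this, sketched here purely as an illustration and not as what this project or pyvtt actually does, is to balance the tags cue by cue: a tag still open at the end of a cue is closed there and reopened at the start of the next one. The function name `balance_tags_across_cues` and the regular expression are hypothetical, and the sketch only recognises simple tags such as `<c.classname>` and `</c>` (no annotated tags like `<v Speaker>`).

```python
import re

# Matches an opening tag such as <c.ingoatupuna> or a closing tag such as </c>.
# Assumption: only simple tags with optional dot-separated classes are handled.
TAG_RE = re.compile(r"<(/?)([a-zA-Z]+)((?:\.[^\s.>]+)*)>")


def balance_tags_across_cues(cue_texts):
    """Return cue texts in which every tag opens and closes within its own cue.

    A tag left open at the end of one cue is closed there and reopened at the
    start of the next cue, so each cue can then be parsed independently.
    """
    open_tags = []  # stack of (name, classes) still open from earlier cues
    balanced = []
    for text in cue_texts:
        # Reopen whatever was still open when the previous cue ended.
        prefix = "".join(f"<{name}{classes}>" for name, classes in open_tags)
        for match in TAG_RE.finditer(text):
            closing, name, classes = match.groups()
            if not closing:
                open_tags.append((name, classes))
            elif open_tags and open_tags[-1][0] == name:
                open_tags.pop()
        # Close, innermost first, whatever is still open at the end of this cue.
        suffix = "".join(f"</{name}>" for name, _ in reversed(open_tags))
        balanced.append(prefix + text + suffix)
    return balanced


cues = [
    "me tana wahine <c.ingoatupuna>Matire",
    "Rapihana</c>, i tuku mai hei tūnga whare mō te kāinga",
]
for cue in balance_tags_across_cues(cues):
    print(cue)
```

With this approach the name ends up as two tagged fragments, one per cue. Whether those fragments should later be merged back into a single "Matire Rapihana" entry is a separate design decision that the sketch leaves open.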

@kmahelona
Contributor Author

I'd write a test, but I'm not sure how, since the parser seems to convert the WebVTT into a list of cues. Does a line break in the test strings represent a joined list? E.g., could I write a test like this:

test_string = "this is a <c.tag>wonderful" \
              "tag</c> that spans a line break."
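For reference, Python joins adjacent string literals into a single string, so a test string written across two source lines does not by itself model a boundary between cues. The short sketch below contrasts that with a list holding one entry per cue; the name `cue_texts` is hypothetical and assumes the parser ends up with one text per cue.

```python
# Adjacent literals become ONE string; no newline or cue boundary is inserted.
joined = ("this is a <c.tag>wonderful "
          "tag</c> that spans a line break.")

# A list keeps the two halves separate, one entry per (hypothetical) cue.
cue_texts = [
    "this is a <c.tag>wonderful",
    "tag</c> that spans a line break.",
]

print(joined)
print(len(cue_texts))
```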

@gregplaysguitar
Contributor

Hmm. This is going to be pretty tricky because I'm using a 3rd party lib to parse the webvtt file (pyvtt) and that's what creates the list of cues. I'll have to give it some thought.

To test it, I think you could call the parse_transcript function directly, possibly passing a file-like object containing the sample WebVTT cues rather than an actual file reference. I already do something like that here: https://github.com/TeHikuMedia/django-colloquial/blob/master/colloquial/colloquialisms/tests/test_parser.py#L224
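As a rough illustration of the file-like-object idea, the sketch below wraps sample WebVTT text in `io.StringIO`, which behaves like an open text file. The helper `read_cue_texts` is a hypothetical stand-in written only for this example; it is not `parse_transcript` or pyvtt, whose exact signatures are not shown in this thread.

```python
import io

SAMPLE_VTT = """WEBVTT

21
00:01:59.164 --> 00:02:06.966
me tana wahine <c.ingoatupuna>Matire

22
00:02:06.966 --> 00:02:12.353
Rapihana</c>, i tuku mai hei tūnga whare mō te kāinga
"""


def read_cue_texts(fileobj):
    """Hypothetical stand-in for the real parser: return the text of each cue."""
    blocks = fileobj.read().strip().split("\n\n")
    texts = []
    for block in blocks:
        lines = block.splitlines()
        # Keep only blocks with a timing line; the cue text is what follows it.
        for i, line in enumerate(lines):
            if "-->" in line:
                texts.append("\n".join(lines[i + 1:]))
                break
    return texts


# No file on disk is needed: StringIO supplies the .read() a file would.
cue_texts = read_cue_texts(io.StringIO(SAMPLE_VTT))
for text in cue_texts:
    print(text)
```

A real test would pass the same `io.StringIO` object to the project's own parsing function, provided that function accepts file-like input, and then assert on the tags it reports.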

@gregplaysguitar
Contributor

This should be fixed now in 4356c77
