Add use_tokenizer to JsonlDataset #3486

Merged on Aug 9, 2024 (5 commits)

Conversation

david-waterworth
Contributor

This PR adds a use_tokenizer argument to the SequenceTagger JSONL datasets. This allows taggers trained on JSONL datasets to use non-default tokenization.

Closes #3476
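
To illustrate why a configurable tokenizer matters for JSONL data, here is a minimal pure-Python sketch. It is not Flair's implementation: `TinyJsonlDataset`, `whitespace_tokenizer`, and `punct_tokenizer` are hypothetical stand-ins, and the record layout (`"data"` for the text, `"label"` for `[start, end, tag]` character spans) is an assumption made for the example. It shows that a span label only lands on a token when the tokenizer's boundaries line up with the annotated offsets.

```python
import json
import re
from typing import Callable, Iterable, List, Tuple

Token = Tuple[str, int, int]  # (text, start offset, end offset)


def whitespace_tokenizer(text: str) -> List[Token]:
    # Hypothetical default: split on whitespace only.
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def punct_tokenizer(text: str) -> List[Token]:
    # Hypothetical alternative: also split punctuation into separate tokens.
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


class TinyJsonlDataset:
    """Illustrative stand-in for a JSONL tagging dataset (not Flair's class)."""

    def __init__(
        self,
        lines: Iterable[str],
        use_tokenizer: Callable[[str], List[Token]] = whitespace_tokenizer,
    ) -> None:
        self.sentences: List[List[Tuple[str, str]]] = []
        for line in lines:
            record = json.loads(line)
            tokens = use_tokenizer(record["data"])
            tags = ["O"] * len(tokens)
            for start, end, label in record["label"]:
                # Tag only tokens that lie fully inside the annotated span.
                for i, (_, t_start, t_end) in enumerate(tokens):
                    if t_start >= start and t_end <= end:
                        tags[i] = label
            self.sentences.append(list(zip((t[0] for t in tokens), tags)))


# The span [6, 12) covers "Berlin" but not the comma that follows it.
lines = [json.dumps({"data": "Visit Berlin, Germany.", "label": [[6, 12, "LOC"]]})]

default_ds = TinyJsonlDataset(lines)
custom_ds = TinyJsonlDataset(lines, use_tokenizer=punct_tokenizer)

print(default_ds.sentences[0])  # "Berlin," overshoots the span, so no token is tagged
print(custom_ds.sentences[0])   # "Berlin" is its own token and receives the LOC tag
```

With the whitespace-only default, the label is silently lost because "Berlin," does not fit inside the annotated span; passing a different tokenizer recovers it. In Flair itself the analogous step would be to pass a tokenizer object as use_tokenizer when constructing the JSONL dataset; the exact signature should be checked against the merged diff.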

@alanakbik
Collaborator

Hello @david-waterworth, thanks for contributing! One of our checks is failing because the code was not formatted with black.

(See https://github.com/flairNLP/flair/blob/master/CONTRIBUTING.md#code-formatting)

To fix it, could you run black to auto-format your code and push again?

@david-waterworth
Contributor Author

@alanakbik I've fixed the formatting.

@alanakbik alanakbik merged commit 2d27050 into flairNLP:master Aug 9, 2024
1 check passed
@alanakbik
Collaborator

@david-waterworth thanks for adding this!

Development

Successfully merging this pull request may close these issues.

[Bug]: JsonlDataset cannot pass tokenizer