How is the tokenizer supposed to be used? #733

Open
mark-hahn opened this issue Feb 24, 2025 · 0 comments

Comments

@mark-hahn

I am developing an app that could probably use jscpd, and I want to find the most efficient way to use it. I see that a tokenizer is available. What is it meant to be used for? Would the search go faster with tokens generated ahead of time? If so, is that because the token sequence is shorter than the original text when doing the Rabin-Karp search?
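
To make the question concrete, here is a minimal sketch of what a Rabin-Karp style search over tokens rather than characters could look like. Everything in it is an assumption for illustration: `tokenize`, `sameWindow`, and `findDuplicateWindows` are hypothetical names, the regex tokenizer is a toy, and none of this is taken from jscpd's actual API or internals.

```typescript
// Hypothetical sketch: NOT jscpd's tokenizer or detection code.
const BASE = 257;
const MOD = 1_000_003;

// Toy tokenizer: identifiers, integer literals, and single punctuation characters.
function tokenize(source: string): string[] {
  return source.match(/[A-Za-z_$][\w$]*|\d+|[^\s\w]/g) ?? [];
}

// Reduce one token to a small integer code.
function hashToken(token: string): number {
  let h = 0;
  for (let i = 0; i < token.length; i++) {
    h = (h * 31 + token.charCodeAt(i)) % MOD;
  }
  return h;
}

// Compare two windows token by token to rule out hash collisions.
function sameWindow(tokens: string[], a: number, b: number, size: number): boolean {
  for (let k = 0; k < size; k++) {
    if (tokens[a + k] !== tokens[b + k]) return false;
  }
  return true;
}

// Slide a rolling hash over windows of `windowSize` tokens and
// return [earlierStart, laterStart] pairs for identical windows.
function findDuplicateWindows(tokens: string[], windowSize: number): number[][] {
  const pairs: number[][] = [];
  if (windowSize < 1 || tokens.length < windowSize) return pairs;

  const codes = tokens.map(hashToken);

  // BASE^(windowSize - 1) mod MOD, the weight of the oldest token in a window.
  let highPow = 1;
  for (let i = 1; i < windowSize; i++) highPow = (highPow * BASE) % MOD;

  const seen = new Map<number, number[]>();
  let hash = 0;
  for (let i = 0; i < codes.length; i++) {
    if (i >= windowSize) {
      // Drop the token that just left the window.
      hash = (hash - ((codes[i - windowSize] * highPow) % MOD) + MOD) % MOD;
    }
    hash = (hash * BASE + codes[i]) % MOD;

    if (i >= windowSize - 1) {
      const start = i - windowSize + 1;
      const bucket = seen.get(hash) ?? [];
      for (const prev of bucket) {
        if (sameWindow(tokens, prev, start, windowSize)) pairs.push([prev, start]);
      }
      bucket.push(start);
      seen.set(hash, bucket);
    }
  }
  return pairs;
}

const tokens = tokenize("a = b + c; x = 1; a = b + c;");
console.log(findDuplicateWindows(tokens, 5)); // [[0, 10], [1, 11]]
```

In this sketch the number of hashed windows scales with the token count rather than the character count, and whitespace differences disappear before hashing. Whether jscpd gets its speed the same way is exactly what I am asking.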
