
Plagiarism detection (external) #24

Open
riqkum wants to merge 6 commits into vseloved:master from riqkum:plagiarism-detection


@riqkum riqkum commented May 11, 2017

No description provided.

riqkum added 6 commits May 11, 2017 18:44
- consider all words (the min. word length is used only to fill the
  hash table);
- a mismatch count credit parameter is introduced;
- skip already processed chunks, which should decrease algorithmic
  complexity.
- do not collect the matching words, only their bounds;
- added type hints to get optimized code;
- parallel scanning support.
- the source index record is shrunk: it does not store word lists,
  instead they are reloaded on request;
- traversal of the suspicious text's word list is not sequential, for
  performance reasons: it jumps across words found in the
  currently processed source.
- parameter values are increased to get faster processing.
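
To make the commit notes above easier to follow, here is a minimal Python sketch of the general idea. It is not the code from this pull request, and Python is used only for illustration. Every name in it (`build_index`, `extend_match`, `find_chunks`, `min_chunk_len`, and the default values) is hypothetical. It assumes that only words of at least `min_word_len` characters are put into the hash table, that all words are compared when a match is extended, that up to `mismatch_credit` mismatching words are tolerated inside a chunk, that only chunk bounds are recorded, and that words covered by a found chunk are skipped.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (susp_start, susp_end, src_start, src_end); end positions are exclusive.
Bounds = Tuple[int, int, int, int]


def build_index(words: List[str], min_word_len: int) -> Dict[str, List[int]]:
    """Map each sufficiently long source word to its positions."""
    index: Dict[str, List[int]] = defaultdict(list)
    for pos, word in enumerate(words):
        if len(word) >= min_word_len:  # short words are not used as entry points
            index[word].append(pos)
    return index


def extend_match(susp: List[str], src: List[str], i: int, j: int,
                 mismatch_credit: int) -> int:
    """Length of the run starting at susp[i] / src[j], tolerating mismatches."""
    credit = mismatch_credit
    length = 0      # words consumed so far, including tolerated mismatches
    last_good = 0   # run length up to and including the last matching word
    while i + length < len(susp) and j + length < len(src):
        if susp[i + length] == src[j + length]:  # all words are compared here
            length += 1
            last_good = length
        elif credit > 0:
            credit -= 1
            length += 1
        else:
            break
    return last_good


def find_chunks(susp: List[str], src: List[str], min_word_len: int = 5,
                mismatch_credit: int = 1, min_chunk_len: int = 4) -> List[Bounds]:
    """Return bounds of chunks shared by the suspicious and source word lists."""
    index = build_index(src, min_word_len)
    chunks: List[Bounds] = []
    i = 0
    while i < len(susp):
        best, best_j = 0, -1
        for j in index.get(susp[i], ()):
            n = extend_match(susp, src, i, j, mismatch_credit)
            if n > best:
                best, best_j = n, j
        if best >= min_chunk_len:
            chunks.append((i, i + best, best_j, best_j + best))  # bounds only
            i += best  # skip words already covered by this chunk
        else:
            i += 1
    return chunks
```

Calling `find_chunks(suspicious_words, source_words)` on two tokenized texts returns a list of bounds tuples. The parallel scanning and on-demand reloading of word lists mentioned in the commits are not covered by this sketch.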