Releases: omrilahav/MinHash

Fixed Basic Version

16 Jul 21:12

Min-Hash Simple Example

This is the first release of this Python min-Hashing module, which can be used for both simple and complex similarity comparisons between objects.

What it contains

  1. Example of simple execution of the module, from generating min-Hash signatures to similarity calculations.
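To illustrate the flow mentioned above, from generating min-Hash signatures to estimating similarity, here is a minimal sketch. It is not the module's own API: every name in it (`shingles`, `make_hash_params`, `minhash_signature`, `estimate_similarity`) is hypothetical, and the choice of 3-word shingles, 128 hash functions, and a linear hash family modulo a Mersenne prime are assumptions made for the example.

```python
import hashlib
import random

PRIME = (1 << 61) - 1  # Mersenne prime used as the modulus (an assumption)


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (hypothetical helper)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def stable_hash(item):
    """64-bit hash that is stable across runs, unlike built-in hash() on str."""
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def make_hash_params(num_hashes=128, seed=42):
    """Draw (a, b) pairs defining the hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(items, params):
    """For each hash function, keep the minimum hash value over all items."""
    hashed = [stable_hash(x) for x in items]
    return [min((a * h + b) % PRIME for h in hashed) for a, b in params]


def estimate_similarity(sig_a, sig_b):
    """Fraction of matching positions, which estimates the Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


params = make_hash_params()
doc_a = shingles("the quick brown fox jumps over the lazy dog near the river bank")
doc_b = shingles("the quick brown fox jumps over the lazy cat near the river bank")
sig_a = minhash_signature(doc_a, params)
sig_b = minhash_signature(doc_b, params)
print("estimated similarity:", estimate_similarity(sig_a, sig_b))
print("exact Jaccard:       ", len(doc_a & doc_b) / len(doc_a | doc_b))
```

Both signatures must be built with the same `params`, otherwise their positions are not comparable. More hash functions give a tighter estimate at the cost of longer signatures.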

Supported Data Types

  1. Text documents
  2. Raw files (bytes)
    (Support for additional data types is easy to add yourself; the details are documented in the code.)

Future Release

  1. An easy-to-install .whl file for use in your Python environment
  2. Support for additional data types

ICSML Solution

09 Jul 19:03

International Cyber Security and Machine Learning Program

Ben-Gurion University of the Negev, Israel

This is the solution to the task (https://github.com/omrilahav/MinHash/releases/tag/v1.0.0-1) written for the students of the ICSML program @ BGU.

In this task you were required to support the min-Hashing of files (raw data, i.e. the bytes of the files).
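One possible way to min-Hash raw file bytes is sketched below. It is not the released solution: the function names, the use of overlapping 4-byte n-grams as the set elements, and the use of salted BLAKE2b digests as the hash family are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def byte_shingles(data, n=4):
    """Return the set of overlapping n-byte windows of the data."""
    if len(data) <= n:
        return {data}
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def minhash_bytes(data, num_hashes=64, n=4):
    """Build a signature: one minimum digest per salted hash function."""
    grams = byte_shingles(data, n)
    signature = []
    for i in range(num_hashes):
        salt = i.to_bytes(8, "big")  # a different salt gives a different hash function
        signature.append(
            min(hashlib.blake2b(g, digest_size=8, salt=salt).digest() for g in grams)
        )
    return signature


def minhash_file(path, num_hashes=64, n=4):
    """Read a file as raw bytes and min-Hash it (hypothetical convenience wrapper)."""
    return minhash_bytes(Path(path).read_bytes(), num_hashes, n)


def similarity(sig_a, sig_b):
    """Fraction of matching signature positions."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


original = bytes(range(256)) * 4
patched = original[:-16] + b"\x00" * 16  # same content with the last 16 bytes changed
print("similarity:", similarity(minhash_bytes(original), minhash_bytes(patched)))
```

Because the input is treated as an opaque byte string, the same code applies to any file type; the n-gram length controls how sensitive the comparison is to small edits.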

ICSML Task

09 Jul 18:59

International Cyber Security and Machine Learning Program

Ben-Gurion University of the Negev, Israel

This is a task written for the students of the ICSML program @ BGU.

In this task you are required to support the min-Hashing of files (raw data, i.e. the bytes of the files).
Reading the documentation and examples in the code will help you succeed in the task.

Good Luck :)

Min-Hash Simple Example

09 Jul 18:52

Min-Hash Simple Example

This is the first release of this Python min-Hashing module, which can be used for both simple and complex similarity comparisons between objects.

This simple example was written for use in the International Cyber Security & Machine Learning program @ Ben-Gurion University of the Negev, Israel.

What it contains

  1. Example of simple execution of the module, from generating min-Hash signatures to similarity calculations.

Supported Data Types

  1. Text documents
  2. Raw files (bytes)
    (Support for additional data types is easy to add yourself; the details are documented in the code.)

Future Release

  1. An easy-to-install .whl file for use in your Python environment
  2. Support for additional data types