Some problems. #7

@1flei

May I import this module as part of a course project?

In addition, when I tried to test it, I found what seem to be some problems in this module, so I rewrote the _get_sig method as follows:
    def _get_sig(self, shingle_vec, num_perms):
        """
        recoded version of _get_sig
        """
        sig = [self._sbucket_size] * num_perms
        keys = sorted(shingle_vec.keys())
        for r in keys:
            #logging.debug('r=%d', r)
            h = np.array([hash((r, mask)) % self._sbucket_size for mask in self._memomask])
            #logging.debug('h=%s',h)
            for i in range(num_perms):
                if h[i] < sig[i]:
                    sig[i] = h[i]
        #logging.debug('mhash=%s',sig)
        return sig

Also, I do not think that naming shingles in increasing order, rather than in random order, is a good idea.
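
For illustration, here is a minimal standalone sketch of the same min-hash idea outside the class. The names `minhash_signature`, `estimate_jaccard`, `masks`, and `bucket_size` are placeholders assumed for this example, not part of the module, and the masks are just random integers standing in for `self._memomask`.

```python
import random


def minhash_signature(shingle_ids, masks, bucket_size):
    """Return one minimum hash value per mask over all shingle ids.

    Hypothetical helper: `masks` plays the role of self._memomask and
    `bucket_size` the role of self._sbucket_size in the method above.
    """
    # Every hash below lies in [0, bucket_size), so bucket_size is a safe
    # "larger than anything" starting value for each slot.
    sig = [bucket_size] * len(masks)
    for r in shingle_ids:
        for i, mask in enumerate(masks):
            h = hash((r, mask)) % bucket_size
            if h < sig[i]:
                sig[i] = h
    return sig


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature slots on which the two signatures agree."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Assumed setup: 64 random 32-bit masks and a large bucket size.
rng = random.Random(0)
masks = [rng.getrandbits(32) for _ in range(64)]
bucket_size = 2**31 - 1

sig_a = minhash_signature({1, 2, 3, 4, 5}, masks, bucket_size)
sig_b = minhash_signature([5, 4, 3, 2, 1], masks, bucket_size)
```

Because each slot keeps only the minimum, the result does not depend on the order in which the shingle ids are visited, so `sig_a` and `sig_b` come out identical here.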
