Enhancement: support for pretrained word embeddings #136

@simonmandlik

Description

Implement a new Extractor subtype, WordEmbeddingExtractor, that tokenizes natural-language strings and represents each word by its pretrained embedding (using Embeddings.jl and WordTokenizers.jl?).

A rough sketch of a possible implementation can be found here, but it targets an old version of JsonGrinder.

A good starting point is the NGramExtractor implementation; the design should be very similar.
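
To make the idea concrete, here is a minimal sketch of what such an extractor could do, written in plain Julia with no package dependencies. Everything in it is an assumption for illustration: `ToyWordEmbeddingExtractor`, `toy_tokenize`, the callable-struct interface, and the zero vector for out-of-vocabulary words are hypothetical and are not taken from JsonGrinder, Embeddings.jl, or WordTokenizers.jl. A real implementation would presumably load the vocabulary and vectors from a pretrained table and follow the interface NGramExtractor uses.

```julia
# Hypothetical sketch: none of these names are JsonGrinder, Embeddings.jl,
# or WordTokenizers.jl API.
struct ToyWordEmbeddingExtractor
    vocab::Dict{String,Int}     # word -> column index in `vectors`
    vectors::Matrix{Float32}    # embedding dimension x vocabulary size
end

# Stand-in tokenizer (assumption): lowercase and split on whitespace.
toy_tokenize(s::AbstractString) = String.(split(lowercase(s)))

# Calling the extractor returns a (dimension x number of tokens) matrix,
# one column per token. Out-of-vocabulary words become zero vectors,
# which is an arbitrary choice made for this sketch.
function (e::ToyWordEmbeddingExtractor)(s::AbstractString)
    tokens = toy_tokenize(s)
    out = zeros(Float32, size(e.vectors, 1), length(tokens))
    for (j, t) in enumerate(tokens)
        i = get(e.vocab, t, 0)
        i == 0 || (out[:, j] .= e.vectors[:, i])
    end
    return out
end

# Toy data: 3-dimensional embeddings for a 2-word vocabulary.
vocab = Dict("hello" => 1, "world" => 2)
vectors = Float32[1 0; 0 1; 1 1]
e = ToyWordEmbeddingExtractor(vocab, vectors)
x = e("Hello unknown world")
```

Returning one column per token keeps the output variable-length, so each string would naturally map to a bag of word vectors; whether that matches how the extractor should integrate with the rest of the pipeline is an open design question.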

We might also want to update suggestextractor with a new kwarg governing when Strings are extracted as n-grams and when they are tokenized.
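
One possible shape for such a kwarg is a predicate over the observed strings. The sketch below is purely illustrative: `toy_suggest`, the `tokenize_if` keyword, and the whitespace-based default heuristic are all hypothetical and do not reflect the actual suggestextractor signature.

```julia
# Hypothetical helper, not JsonGrinder's suggestextractor.
# `tokenize_if` is an assumed keyword: a predicate over the observed strings
# that decides between word embeddings and n-grams.
function toy_suggest(strings::Vector{String};
                     tokenize_if = ss -> count(s -> occursin(" ", s), ss) > length(ss) / 2)
    return tokenize_if(strings) ? :word_embeddings : :ngrams
end

a = toy_suggest(["the quick brown fox", "lorem ipsum dolor"])  # mostly multi-word strings
b = toy_suggest(["a1b2c3", "deadbeef"])                        # no whitespace at all
c = toy_suggest(["a1b2c3"]; tokenize_if = _ -> true)           # caller forces tokenization
```

Passing a predicate rather than a Boolean flag would let callers plug in their own rule, at the cost of a slightly larger API surface.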

Labels: enhancement (New feature or request)