Comparison against TF-IDF Vectorizer (using sklearn) #7

@BALaka-18

Description

TF-IDF is one of the best-known algorithms for extracting keywords from text. Your task is to create a function that extracts keywords from text using the TF-IDF algorithm and to compare its results against those produced by this library. How similar or different are the results?
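One possible way to put a number on "how similar / different" is the Jaccard overlap between the two keyword lists. The sketch below is only an illustration: the function name keyword_overlap is hypothetical, and the issue does not prescribe any particular similarity measure.

```python
def keyword_overlap(keywords_a, keywords_b):
    """Jaccard similarity between two keyword lists (hypothetical helper).

    Returns a value in [0, 1]: 1.0 means the same set of keywords,
    0.0 means no keyword in common. Comparison is case-insensitive.
    """
    set_a = {keyword.lower() for keyword in keywords_a}
    set_b = {keyword.lower() for keyword in keywords_b}
    # Two empty lists are treated as identical (an assumption of this sketch).
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Placeholder keyword lists, not real output from either extractor.
    tfidf_keywords = ["python", "tfidf", "keyword"]
    library_keywords = ["Python", "keyword", "extraction"]
    print(keyword_overlap(tfidf_keywords, library_keywords))
```

Jaccard ignores ranking, so two extractors that return the same keywords in a different order still score 1.0. A rank-aware measure could be swapped in if ordering matters.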

For your reference, you may read these :

  1. Keyword extraction
  2. TF-IDF Vectorizer - Sklearn docs

Folder Structure, Function details

Create a folder named tfidf_vectorizer in the root directory. It must contain a .py file with the function that extracts keywords from text using sklearn's TfidfVectorizer.

Structure : tfidf_vectorizer/extract_keywords_tfidf_sklearn.py

Acceptance Criteria

  • Code must be properly formatted.
  • Code must be accompanied by appropriate comments.
  • File structure must be strictly maintained.
  • Test cases must be present at the end of the code.
  • Variables and functions must be properly named.
  • IMPORTANT : Make sure requirements.txt file is updated if you are including any new library.
  • All instructions provided in the Description must be strictly followed.

Definition of Done

  • All of the required items are completed.
  • Approval by 1 mentor.

Time Estimation

1.5 hours

Metadata

Assignees

No one assigned

Labels

  • Hacktoberfest: This issue is under Hacktoberfest 2020
  • code: code based issue
  • medium: intermediate level issues
