forked from anantn/hn-chatgpt-plugin
Open
Description
Feature Summary
- Category: Data Collection
- Priority: High
Background and Objective
- Problem Statement: HN posts need a text representation that works well for later unsupervised segmentation (clustering)
- Objective: Determine what additional information besides post title and URL, if any, should be encoded
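One way to frame the objective is to build several candidate text representations per post and compare them later. The sketch below is only an illustration: the field names `title`, `url`, and `text`, the helper `build_variants`, and the variant names are assumptions, not part of the existing codebase.

```python
from urllib.parse import urlparse


def build_variants(post: dict, max_body_chars: int = 500) -> dict:
    """Return candidate text representations for one post.

    The keys "title", "url" and "text" are assumed field names.
    """
    title = (post.get("title") or "").strip()
    url = post.get("url") or ""
    body = (post.get("text") or "").strip()

    # Reduce the URL to its domain, dropping a leading "www."
    domain = urlparse(url).netloc
    if domain.startswith("www."):
        domain = domain[4:]

    variants = {
        "title": title,
        "title_url": f"{title} {url}".strip(),
        "title_domain": f"{title} ({domain})" if domain else title,
    }

    # Optionally append a truncated body (e.g. Ask HN text) when present
    parts = [variants["title_domain"]]
    if body:
        parts.append(body[:max_body_chars])
    variants["title_domain_text"] = "\n".join(parts)
    return variants


example = build_variants(
    {"title": "Show HN: Foo", "url": "https://www.example.com/foo", "text": None}
)
```

Each variant can then be passed through the selected embedding model, so the only thing that changes between experiments is the input text.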
Technical Requirements
- Languages and Frameworks: Python - sklearn, PyTorch
- Dependencies:
  - Representative extract of data to test
  - Embedding model selected
Tests and Evaluation Metrics
Evaluation Metrics (e.g., Inertia, Silhouette Score):
- Subjective look
- Downstream performance of the resulting clusters in "Find optimal k for k-means clustering" (#12)
- Potentially try simple classification tasks, e.g., zero-shot classification
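The clustering metrics mentioned above could be computed per candidate representation roughly as follows. This is a minimal sketch under assumptions: the embeddings are synthetic stand-ins generated with `make_blobs` (real ones would come from the selected embedding model), and `score_representation`, `variant_a`, and `variant_b` are hypothetical names.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def score_representation(embeddings, k, seed=0):
    """Fit k-means and return inertia and silhouette score for one embedding matrix."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(embeddings)
    return {
        "inertia": float(km.inertia_),
        "silhouette": float(silhouette_score(embeddings, km.labels_)),
    }


# Synthetic stand-ins for two encoded variants of the same posts (assumption):
# one with well-separated groups, one with heavily overlapping groups.
tight, _ = make_blobs(n_samples=300, centers=4, n_features=16,
                      cluster_std=0.5, random_state=0)
noisy, _ = make_blobs(n_samples=300, centers=4, n_features=16,
                      cluster_std=5.0, random_state=0)

results = {
    name: score_representation(X, k=4)
    for name, X in {"variant_a": tight, "variant_b": noisy}.items()
}

for name, scores in results.items():
    print(name, scores)
```

Inertia depends on the scale and dimensionality of the embedding space, so it is mainly useful for comparing values of k within one representation. The silhouette score is bounded in [-1, 1] and is the more natural choice for comparing different representations.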