Commit 6eafa6e

Use a smaller dataset
1 parent 1a4822c commit 6eafa6e

4 files changed: 16 additions, 36,111 deletions

examples/quotes/README.md

Lines changed: 4 additions & 5 deletions
@@ -16,8 +16,8 @@ generates an embedding using the
 [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
 Sentence Transformers model.
 
-The dataset has about 37,000 famous quotes, each with their author and tags. The
-data originates from a
+The dataset has about 850 famous quotes, each with their author and tags. The
+data is a subset of a
 [Kaggle dataset](https://www.kaggle.com/datasets/akmittal/quotes-dataset) that
 appears to have been generated from quotes that were scraped from the Goodreads
 [popular quotes](https://www.goodreads.com/quotes) page.
@@ -81,9 +81,8 @@ npm run ingest
 Note that the `ELASTICSEARCH_URL` variable must be defined in the terminal
 session in which you run this command.
 
-This task may take a few minutes. How long it takes depends on your computer
-speed and wether you have a GPU, which is used to generate the embeddings if
-available.
+This task should take a minute or less. The GPU, if available, is used optimize
+the generation of the embeddings.
 
 ### Start the back end
 
0 commit comments

Comments
 (0)