error when running demo - add_analogies #10

Open
SolveigHelene opened this issue Oct 16, 2024 · 9 comments

@SolveigHelene

SolveigHelene commented Oct 16, 2024

I came across an error message when evaluating on sound analogies in the demo script, in add_analogies.py.

The error message is:
/pwesuite/create_dataset/add_analogies.py", line 154, in find_single_perturbation_pairs
feature_vectors[tuple(ft.fts(x).numeric())].append(x)
^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'numeric'

The problem seems to occur when it runs feature_vectors[tuple(ft.fts(x).numeric())].append(x), since ft.fts(x) returns an empty dictionary.
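
The failure can be reproduced without panphon. Below is a minimal sketch, assuming only what the traceback shows: the lookup returns a plain dict for a segment it does not know, and a dict has no numeric() method. FeatureVector, KNOWN, lookup and group_by_features are hypothetical stand-ins, not names from panphon or pwesuite, and the feature values are made up. Skipping unknown segments is one possible guard, not necessarily how the repository handles it.

```python
from collections import defaultdict


class FeatureVector:
    """Hypothetical stand-in for the object returned for a known segment."""

    def __init__(self, values):
        self._values = list(values)

    def numeric(self):
        return list(self._values)


# Hypothetical feature table; the values are invented for illustration.
KNOWN = {
    "p": FeatureVector([-1, 1, -1]),
    "b": FeatureVector([1, 1, -1]),
    "t": FeatureVector([-1, 1, -1]),
}


def lookup(segment):
    # Mimics the reported behaviour: an unknown segment yields an empty dict.
    return KNOWN.get(segment, {})


def group_by_features(segments):
    """Group segments by feature vector, collecting the ones that cannot be looked up."""
    groups = defaultdict(list)
    skipped = []
    for x in segments:
        fts = lookup(x)
        if not hasattr(fts, "numeric"):
            # A plain dict came back, so calling .numeric() would raise AttributeError.
            skipped.append(x)
            continue
        groups[tuple(fts.numeric())].append(x)
    return dict(groups), skipped
```

Calling group_by_features(["p", "b", "t", "?"]) returns the grouped segments together with ["?"] as the skipped list, which makes it easy to see which symbols the feature table does not cover.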

@zouharvi
Owner

Hi, sorry for the late response, I didn't see the notification before.

Did you get some further output? I got no errors running the following:

python3 ./create_dataset/preprocess.py
python3 ./create_dataset/add_analogies.py

Could there be some panphon version mismatch? I have panphon==0.2.0 and panphon2==0.3.2.
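
To compare versions, something like the following could be run in the same environment as the demo. It is a small sketch that uses only the standard library; the helper name installed_versions is hypothetical, and the package names are the two mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("panphon", "panphon2")):
    """Return {package name: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}=={ver}" if ver else f"{name} is not installed")
```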

@zouharvi
Owner

Also, note that you do not have to recreate the dataset locally. It's much easier to just use the prepared one online, as shown in the demo.

import datasets
data = datasets.load_dataset("zouharvi/pwesuite-eval", split="train")

@SolveigHelene
Author

This occurs when running demo.ipynb, so I am using the online dataset. I ran all the cells, and the error appears when running !python3 ./suite_evaluation/eval_all.py --embd embd.npy

This is the whole output I get:
Loading data
Human similarity
Correlations
100%|█████████████████████████████████████████████| 9/9 [01:24<00:00, 9.41s/it]
Retrieval
100%|█████████████████████████████████████████████| 9/9 [01:14<00:00, 8.26s/it]
Sound analogies
Traceback (most recent call last):
File "/pwesuite/./suite_evaluation/eval_all.py", line 90, in <module>
score, scores_all = evaluate_all(data_multi, data_embd, args.lang)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/pwesuite/./suite_evaluation/eval_all.py", line 59, in evaluate_all
output = evaluate_analogy(
^^^^^^^^^^^^^^^^^
File "/pwesuite/suite_evaluation/eval_analogy.py", line 69, in evaluate_analogy
output[lang] = evaluate_analogy_single_lang(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/pwesuite/suite_evaluation/eval_analogy.py", line 11, in evaluate_analogy_single_lang
analogies = get_analogies(data_multi, lang)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/pwesuite/create_dataset/add_analogies.py", line 187, in get_analogies
analogy_model = PhonemeAnalogy(
^^^^^^^^^^^^^^^
File "/pwesuite/create_dataset/add_analogies.py", line 28, in __init__
self.single_perturbation_pairs = self.find_single_perturbation_pairs(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/pwesuite/create_dataset/add_analogies.py", line 159, in find_single_perturbation_pairs
feature_vectors[tuple(ft.fts(x).numeric())].append(x)
^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'numeric'

@zouharvi
Owner

Working on a solution now. 🙂

@zouharvi
Owner

I removed the need to create the analogy and cognate data locally. Instead, it's now all in the HF dataset. This should remove a bunch of friction points, hopefully. 🙂

Could you update the repository and run the demo again?

@SolveigHelene
Author

It runs without issue now!😁

@zouharvi
Owner

Great to hear. Let me know if you discover any other issues. 🙂

Btw, for now I made it so that the dataset gets downloaded every time, to make sure that you get the latest version when running it. Simply remove download_mode="force_redownload" to use the already downloaded version.
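
The two loading modes could look roughly like this, assuming the Hugging Face datasets package is installed. download_mode is an existing load_dataset parameter; the helper names load_kwargs and load_eval_data are hypothetical and not part of pwesuite.

```python
def load_kwargs(force_redownload=False):
    """Build the keyword arguments for datasets.load_dataset."""
    kwargs = {"path": "zouharvi/pwesuite-eval", "split": "train"}
    if force_redownload:
        # Ignore the local cache and fetch the dataset again.
        kwargs["download_mode"] = "force_redownload"
    return kwargs


def load_eval_data(force_redownload=False):
    """Load the evaluation data, optionally forcing a fresh download."""
    import datasets  # imported lazily so load_kwargs works without the package

    return datasets.load_dataset(**load_kwargs(force_redownload))
```

With force_redownload=False the cached copy is reused when one exists, which avoids the repeated download but may leave you on an older version of the data.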

@JeffBezos64
Contributor

Hi @zouharvi, the demo runs without issue, but it currently fails when used for replication. I believe it is the same issue.

zouharvi reopened this Feb 18, 2025
@zouharvi
Owner

zouharvi commented Feb 18, 2025

Hi @JeffBezos64, could you specify what exactly is failing? The values are indeed not the same, but that's because lots of the infra has changed since the paper.

I see now. Getting low scores also for the pre-trained models. Might take me a few weeks to circle back to this to investigate why though.
