Small bugs in the current code, possible fixes attached  #1

@shrango

Description

  1. The "phrase" and "mapping" files are named incorrectly. In preprocess_group.sh, you could change
        rename tree-None.tree $src-$tgt.mapping.$src tmp/intra-data/*
    to
        rename tree-None.tree mapping.$src-$tgt.$src tmp/intra-data/*
    and change
        rename tree-None.tree $src-$tgt.phrase.$src tmp/intra-data/*
    to
        rename tree-None.tree phrase.$src-$tgt.$src tmp/intra-data/*

  2. Also, the "rename" command may not be permitted everywhere; it is not on my system. Here is the workaround I used, FYI:

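# Assumes $src, $tgt and $tag are already set, as in preprocess_group.sh.
cp tmp/translation-data/* "$tag"/

# Rename the binarized tree files to the "mapping" names, using plain mv.
cd tmp/intra-data
for split in train valid test
do
    for suffix in idx bin
    do
        mv "$split.tree-None.tree.$suffix" "$split.mapping.$src-$tgt.$src.$suffix"
    done
done
mv dict.tree.txt "dict.tree.$src.txt"
cp * "../../$tag"

# Rename the binarized source files to the "phrase" names.
cd ../inter-data

for split in train valid test
do
    for suffix in idx bin
    do
        mv "$split.$src-None.$src.$suffix" "$split.phrase.$src-$tgt.$src.$suffix"
    done
done
mv "dict.$src.txt" "dict.phrase.$src.txt"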
  3. I believe you ran all your experiments with --fp16, because without it I get the following error:
  File "/data/UMST/fairseq/models/PhraseTransformer.py", line 1138, in forward
    x_word = self.downsampling(x, bpe2word_scale)
  File "/data/UMST/fairseq/models/PhraseTransformer.py", line 1057, in downsampling
    return ((mapping.transpose(1,2)@sub_word_representation.transpose(0,1)) / (torch.sqrt(num_matrix.unsqueeze(2)))).transpose(0,1)
RuntimeError: expected scalar type Float but found Half

The current code runs smoothly with --fp16, but you may want to fix this bug so that it also runs without the flag.
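
One possible fix, as a rough sketch only: cast the helper tensors to the dtype of the sub-word representation before the matrix product, so the same line works in both half and full precision. I have not checked how mapping and num_matrix are built inside PhraseTransformer.py, so the standalone function signature, the tensor shapes, and the toy inputs below are my assumptions, not the repository's actual code.

```python
import torch


def downsampling(sub_word_representation, mapping, num_matrix):
    """Pool sub-word vectors into word vectors (hypothetical standalone version).

    Assumed shapes (not confirmed against the repository):
      sub_word_representation: (T, B, C)  sub-word states, time-first
      mapping:                 (B, T, W)  1 where sub-word t belongs to word w
      num_matrix:              (B, W)     number of sub-words in each word
    """
    # Follow the dtype of the activations instead of assuming half precision.
    dtype = sub_word_representation.dtype
    mapping = mapping.to(dtype)
    num_matrix = num_matrix.to(dtype)

    # Same computation as the line in the traceback, split in two for readability.
    pooled = mapping.transpose(1, 2) @ sub_word_representation.transpose(0, 1)
    return (pooled / torch.sqrt(num_matrix.unsqueeze(2))).transpose(0, 1)


# Toy check: float32 activations combined with half-precision helper tensors.
x = torch.tensor([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]])  # (T=3, B=1, C=2)
mapping = torch.zeros(1, 3, 2, dtype=torch.float16)           # (B=1, T=3, W=2)
mapping[0, 0, 0] = 1  # sub-words 0 and 1 form word 0
mapping[0, 1, 0] = 1
mapping[0, 2, 1] = 1  # sub-word 2 forms word 1
num_matrix = torch.tensor([[2.0, 1.0]], dtype=torch.float16)  # (B=1, W=2)

out = downsampling(x, mapping, num_matrix)
print(out.dtype, tuple(out.shape))
```

If the Half tensor actually comes from somewhere else (for example an explicit .half() call when the mapping is created), the cast would belong there instead; the idea of following the input dtype stays the same.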
