
Attention mask unused? #5

Open
codeninja opened this issue Nov 14, 2020 · 1 comment

Comments

@codeninja

        ass_mask=torch.ones(q_size2*q_size1,1,1,q_size0).cuda()  #[31*128,1,1,11]
        x, self.attn_asset = attention(ass_query, ass_key, ass_value, mask=None, 
                             dropout=self.dropout)   

Within MultiHeadedAttention, ass_mask is built but never passed into the attention call here, so it appears to be unused. IIUC, the attention mask is needed to prevent look-ahead bias in the attention mechanism and should mask off future positions when the attention weights are computed.

If this mask is unused, what was its intent? Where is attention being masked, and how should the mask be applied?
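
For reference, a minimal sketch of how a causal (look-ahead) mask is typically built and applied, assuming the attention function follows the Annotated-Transformer-style signature used above (query, key, value, mask, dropout). The subsequent_mask helper and the commented usage lines are illustrative assumptions, not code from this repo:

    import math
    import torch

    def attention(query, key, value, mask=None, dropout=None):
        # Scaled dot-product attention (Annotated Transformer style).
        d_k = query.size(-1)
        scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(d_k)
        if mask is not None:
            # Positions where mask == 0 get a large negative score, so softmax
            # assigns them (near-)zero weight -- this is what blocks look-ahead.
            scores = scores.masked_fill(mask == 0, -1e9)
        p_attn = torch.softmax(scores, dim=-1)
        if dropout is not None:
            p_attn = dropout(p_attn)
        return torch.matmul(p_attn, value), p_attn

    def subsequent_mask(size):
        # Lower-triangular causal mask: position i may attend only to j <= i.
        return torch.tril(torch.ones(1, size, size)).bool()

    # Hypothetical usage inside MultiHeadedAttention: pass the mask instead of None.
    # x, attn = attention(ass_query, ass_key, ass_value,
    #                     mask=subsequent_mask(ass_query.size(-2)).to(ass_query.device),
    #                     dropout=self.dropout)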

@Ivsxk
Owner

Ivsxk commented Nov 25, 2020

This mask is indeed not used, since we make the portfolio decision 15 minutes ahead; it is not a long-sequence prediction task like translation. The mask was just an immature attempt at providing a long-term strategy, so you can simply ignore it.
