@hyli666 hyli666 commented Apr 12, 2022

Hi, thanks for sharing the code. I have revised the model to follow your paper:
1) Fixed a bug in the KL divergence calculation.
2) Added support for batch processing (input shape: [batch, N, d]).
3) Fixed a failure when the input dimension differs from the Transformer's model dimension (d != d_model).
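
For readers following along, here is a minimal, dependency-free sketch of the idea behind items 2 and 3: project each time step from d to d_model features before it reaches the Transformer, and do so for every series in the batch. This is not the code from this pull request. The helper names `make_projection` and `project` are hypothetical, and the random initialization is only a placeholder. In a PyTorch model the same step would normally be a single `nn.Linear(d, d_model)` layer, which acts on the last dimension of a [batch, N, d] tensor.

```python
import random


def make_projection(d, d_model, seed=0):
    """Build a (weight, bias) pair mapping d input features to d_model features.

    Hypothetical helper for illustration; the values are random placeholders.
    """
    rng = random.Random(seed)
    scale = 1.0 / d ** 0.5
    weight = [[rng.uniform(-scale, scale) for _ in range(d_model)] for _ in range(d)]
    bias = [0.0] * d_model
    return weight, bias


def project(x, weight, bias):
    """Apply the projection to a batched input.

    x has shape [batch][N][d]; the result has shape [batch][N][d_model].
    """
    d = len(weight)
    d_model = len(bias)
    out = []
    for series in x:                      # loop over the batch dimension
        rows = []
        for step in series:               # loop over the N time steps
            if len(step) != d:
                raise ValueError(f"expected {d} features, got {len(step)}")
            rows.append([
                sum(step[i] * weight[i][j] for i in range(d)) + bias[j]
                for j in range(d_model)
            ])
        out.append(rows)
    return out


# Example: batch=2, N=3, d=4, projected to d_model=8.
batch, n_steps, d, d_model = 2, 3, 4, 8
x = [[[float(b + t + k) for k in range(d)] for t in range(n_steps)] for b in range(batch)]
weight, bias = make_projection(d, d_model)
y = project(x, weight, bias)
print(len(y), len(y[0]), len(y[0][0]))
```

Because the projection is applied independently to each time step, the batch and sequence dimensions pass through unchanged and only the feature dimension changes from d to d_model.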
