Covers character embeddings, word embeddings, and baseline systems.
- [2016 AAAI] Char2Vec: Character-Aware Neural Language Models, [paper], sources: [carpedm20/lstm-char-cnn-tensorflow], [yoonkim/lstm-char-cnn].
- [2008 NIPS] HLBL: A Scalable Hierarchical Distributed Language Model, [paper], sources: [wenjieguan/Log-bilinear-language-models].
- [2010 INTERSPEECH] RNNLM: Recurrent neural network based language model, [paper], [Ph.D. Thesis], [slides], sources: [mspandit/rnnlm].
- [2013 NIPS] Word2Vec: Distributed Representations of Words and Phrases and their Compositionality, [paper], [word2vec explained], [params explained], [blog], sources: [word2vec], [dav/word2vec], [yandex/faster-rnnlm], [tf-word2vec], [zake7749/word2vec-tutorial].
- [2013 CoNLL] Better Word Representations with Recursive Neural Networks for Morphology, [paper].
- [2014 ACL] Word2Vecf: Dependency-Based Word Embeddings, [paper], [blog], sources: [Yoav Goldberg/word2vecf], [IsaacChanghau/Word2VecfJava].
- [2014 EMNLP] GloVe: Global Vectors for Word Representation, [paper], [homepage], sources: [stanfordnlp/GloVe].
- [2014 ICML] Compositional Morphology for Word Representations and Language Modelling, [paper], sources: [thompsonb/comp-morph], [claravania/subword-lstm-lm].
- [2015 ACL] Hyperword: Improving Distributional Similarity with Lessons Learned from Word Embeddings, [paper], sources: [Omer Levy/hyperwords].
- [2016 ICLR] Exploring the Limits of Language Modeling, [paper], [slides], sources: [tensorflow/models/lm_1b].
- [2016 CoNLL] Context2Vec: Learning Generic Context Embedding with Bidirectional LSTM, [paper], sources: [orenmel/context2vec].
- [2016 IEEE Intelligent Systems] How to Generate a Good Word Embedding?, [paper], [Ph.D. thesis: Research on Neural-Network-Based Semantic Vector Representations of Words and Documents], [blog], sources: [licstar/compare].
- [2016 ArXiv] Linear Algebraic Structure of Word Senses, with Applications to Polysemy, [paper], [slides], sources: [YingyuLiang/SemanticVector].
- [2017 ACL] FastText: Enriching Word Vectors with Subword Information, [paper], sources: [facebookresearch/fastText], [salestock/fastText.py].
- [2017 ArXiv] Implicitly Incorporating Morphological Information into Word Embedding, [paper].
- [2017 AAAI] Improving Word Embeddings with Convolutional Feature Learning and Subword Information, [paper], sources: [ShelsonCao/IWE].
- [2018 ICML] Learning K-way D-dimensional Discrete Codes for Compact Embedding Representations, [paper], [supplementary], sources: [chentingpc/kdcode-lm].
- [2018 ICLR] Compressing Word Embeddings via Deep Compositional Code Learning, [paper], [bibtex], sources: [msobroza/compositional_code_learning].
- [2017 NIPS] Learned in Translation: Contextualized Word Vectors, [paper], sources: [salesforce/cove].
- [2018 NAACL] Deep contextualized word representations, [paper], [homepage], sources: [allenai/bilm-tf], [HIT-SCIR/ELMoForManyLangs]. An extended application: [UKPLab/elmo-bilstm-cnn-crf].
- [2018 ArXiv] GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations, [paper], [bibtex].
- [2018 ArXiv] Improving Language Understanding by Generative Pre-Training, [paper], [bibtex], [homepage], sources: [openai/finetune-transformer-lm].
- [2018 ArXiv] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, [paper], [bibtex], sources: [google-research/bert], [huggingface/pytorch-pretrained-BERT]. An extended application: [macanv/BERT-BiLSTM-CRF-NER].
- [2019 ArXiv] Language Models are Unsupervised Multitask Learners, [paper], [homepage], sources: [openai/gpt-2].
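As a practical note on using the static embeddings listed above: GloVe, Word2Vec (in text mode), and FastText all ship pretrained vectors as plain text, one line per token with the word followed by its space-separated float components. A minimal, self-contained sketch of loading that format and comparing words by cosine similarity; the vectors below are toy values for illustration, not real pretrained weights:

```python
import math

# Toy data in the GloVe/Word2Vec plain-text format:
# one line per token, word followed by its vector components.
SAMPLE = """\
king 0.5 0.7 0.1
queen 0.5 0.6 0.2
apple -0.3 0.1 0.9
"""

def load_vectors(text):
    """Parse word-per-line embedding text into {word: [floats]}."""
    vectors = {}
    for line in text.strip().splitlines():
        parts = line.split()
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

vecs = load_vectors(SAMPLE)
# Related words should score higher than unrelated ones.
print(round(cosine(vecs["king"], vecs["queen"]), 3))  # → 0.988
```

The same loader works on real downloads (e.g. `glove.6B.100d.txt`) by reading the file line by line instead of the sample string; contextual models like ELMo, GPT, and BERT do not use this static format and require their respective codebases.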