As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021)

wietsedv/gpt2-recycle

GPT-2 Recycled for Italian and Dutch

Wietse de Vries, Malvina Nissim

📝 As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages [Findings of ACL 2021]

Model description

In our paper, we describe a multi-stage adaptation method for transferring GPT-2 to Italian and Dutch without unnecessary retraining. This repository contains the source code; the final models are available on the Hugging Face model hub (see below).

We publish two types of models:

  • Models where only the lexical layer is retrained for the new language, while the Transformer layers are kept identical to the English model. In practice, the lexical layers of these models are automatically aligned with those of the equivalent English model. Use these if you are interested in alignment properties.
  • Models where the lexical embeddings are retrained first and the full model is then trained further. Use these if you want to generate more realistic text.

For details, check out our Findings of ACL paper and the models on the 🤗 Hugging Face model hub (see links for specific models below).
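
The first model type above (retrain only the lexical layer, keep the Transformer layers fixed) can be illustrated with a minimal sketch. This is not the training code from this repository. It uses a tiny, randomly initialised GPT-2 so that nothing needs to be downloaded. The sizes, the new vocabulary size, and the choice to keep the position embeddings frozen are all assumptions made for illustration.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny, randomly initialised stand-in for English GPT-2 (illustrative sizes only).
config = GPT2Config(vocab_size=100, n_positions=32, n_embd=16, n_layer=2, n_head=2)
model = GPT2LMHeadModel(config)

# Hypothetical vocabulary size of the new-language tokenizer.
new_vocab_size = 120
model.resize_token_embeddings(new_vocab_size)

# Freeze every parameter, then unfreeze only the token embedding matrix.
# In GPT-2 the output layer shares this matrix, so it is trained along with it.
for param in model.parameters():
    param.requires_grad = False
model.transformer.wte.weight.requires_grad = True

trainable = [name for name, p in model.named_parameters() if p.requires_grad]

# Snapshots to check which weights a training step actually changes.
emb_before = model.transformer.wte.weight.detach().clone()
block_before = model.transformer.h[0].mlp.c_fc.weight.detach().clone()

# One toy optimisation step on random token ids (a stand-in for real text).
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)
input_ids = torch.randint(0, new_vocab_size, (2, 8))
loss = model(input_ids=input_ids, labels=input_ids).loss
loss.backward()
optimizer.step()

emb_changed = not torch.equal(emb_before, model.transformer.wte.weight)
block_unchanged = torch.equal(block_before, model.transformer.h[0].mlp.c_fc.weight)
print(trainable, emb_changed, block_unchanged)
```

Because the Transformer layers never receive updates in this setup, a model trained this way differs from its English source only in the lexical layer, which is what makes the alignment properties mentioned above possible.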

Models

Dutch

Italian

How to use

from transformers import pipeline

# Generate Dutch text with the text-generation pipeline
pipe = pipeline("text-generation", model="GroNLP/gpt2-small-dutch")
print(pipe("Was ik maar een"))

from transformers import AutoTokenizer, AutoModel, TFAutoModel

# Load the tokenizer and the model directly; pick the line for your framework
tokenizer = AutoTokenizer.from_pretrained("GroNLP/gpt2-small-dutch")
model = AutoModel.from_pretrained("GroNLP/gpt2-small-dutch")  # PyTorch
model = TFAutoModel.from_pretrained("GroNLP/gpt2-small-dutch")  # TensorFlow

BibTeX entry

@inproceedings{de-vries-nissim-2021-good,
    title = "As Good as New. How to Successfully Recycle {E}nglish {GPT}-2 to Make Models for Other Languages",
    author = "de Vries, Wietse  and
      Nissim, Malvina",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.74",
    doi = "10.18653/v1/2021.findings-acl.74",
    pages = "836--846",
}
