JacksonKaunismaa/transformers

Multi-gpu pretraining of Transformers to study positional encoding strategies.
Transformers replication

A small repository for Transformer pre-training experiments. It is inspired by nanoGPT, but adds extensions such as alternative positional-encoding schemes, optimizer sharding, gradient checkpointing, and infrastructure for running experiments; minimal sketches of the first three follow.
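
The repository's actual modules aren't reproduced on this page. As a minimal sketch of how alternative positional encodings can sit behind a common interface, here is a fixed sinusoidal table contrasted with a GPT-style learned embedding; all class names are illustrative assumptions, not this repo's API.

```python
import math
import torch
import torch.nn as nn


class SinusoidalPositionalEncoding(nn.Module):
    """Fixed sin/cos table from "Attention Is All You Need" (d_model assumed even)."""

    def __init__(self, d_model: int, max_len: int = 4096):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)  # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)  # stored with the model but not trained

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        return x + self.pe[: x.size(1)]


class LearnedPositionalEncoding(nn.Module):
    """GPT-style trainable position-embedding table."""

    def __init__(self, d_model: int, max_len: int = 4096):
        super().__init__()
        self.emb = nn.Embedding(max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        positions = torch.arange(x.size(1), device=x.device)
        return x + self.emb(positions)
```

Keeping both behind the same `forward(x) -> x` signature is what makes positional-encoding ablations cheap: the rest of the model never needs to know which scheme is in use.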

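For the memory-saving extensions, a hedged sketch using the standard PyTorch APIs for each technique, torch.utils.checkpoint for activation (gradient) checkpointing and ZeroRedundancyOptimizer for optimizer-state sharding; the blocks passed in are hypothetical stand-ins for the repository's Transformer blocks, and this is not necessarily how the repo wires them up.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint
from torch.distributed.optim import ZeroRedundancyOptimizer


class CheckpointedStack(nn.Module):
    """Runs each block under checkpointing: activations are recomputed
    during backward instead of stored, trading compute for memory."""

    def __init__(self, blocks: nn.ModuleList):
        super().__init__()
        self.blocks = blocks

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            # use_reentrant=False is the recommended mode in PyTorch 2.x
            x = checkpoint(block, x, use_reentrant=False)
        return x


def make_sharded_optimizer(model: nn.Module, lr: float = 3e-4):
    """Shards AdamW state (exp_avg, exp_avg_sq) across ranks, cutting
    optimizer memory roughly by the world size. Requires
    torch.distributed to be initialized before construction."""
    return ZeroRedundancyOptimizer(
        model.parameters(),
        optimizer_class=torch.optim.AdamW,
        lr=lr,
    )
```
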
It also includes code to work with the commaVQ dataset for video modelling.
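
The repo's on-disk layout for commaVQ isn't documented on this page. As a rough sketch, assuming each segment is stored as a .npy array of integer VQ token ids (an assumption, not a documented fact), a next-token dataset for autoregressive video modelling might look like:

```python
from pathlib import Path

import numpy as np
import torch
from torch.utils.data import Dataset


class CommaVQTokens(Dataset):
    """Yields (input, target) windows shifted by one token.
    Assumes each *.npy file holds one segment of integer VQ codes."""

    def __init__(self, token_dir: str, block_size: int = 1024):
        self.files = sorted(Path(token_dir).glob("*.npy"))
        self.block_size = block_size

    def __len__(self) -> int:
        return len(self.files)

    def __getitem__(self, idx: int):
        tokens = np.load(self.files[idx]).reshape(-1)  # flatten frames into one token stream
        start = np.random.randint(0, max(1, len(tokens) - self.block_size - 1))
        chunk = torch.from_numpy(
            tokens[start : start + self.block_size + 1].astype(np.int64)
        )
        return chunk[:-1], chunk[1:]  # shift by one for next-token prediction
```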

Works with PyTorch 2.1.
