Recreate the decoder-only transformer from the ground up
Generate coherent WikiHow articles
WikiHow corpus
Subpar text generation results because of compute constraints (my potato laptop)
Use a GPU for training
Use BPE, WordPiece, etc. for tokenization. The character-level tokenization method is simplistic: every character is its own token, so it fails to capture corpus statistics such as frequently co-occurring character sequences
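
To make the tokenization point concrete, here is a minimal sketch of how BPE builds on a character-level tokenizer by repeatedly merging the most frequent adjacent pair. It is not this project's code: the function names (char_tokenize, most_frequent_pair, merge_pair, learn_bpe) are hypothetical, and it is simplified in that it merges across spaces, whereas production BPE implementations usually split on whitespace first.

```python
from collections import Counter


def char_tokenize(text):
    """Character-level tokenization: one token per character."""
    return list(text)


def most_frequent_pair(tokens):
    """Return the most common adjacent token pair, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return None
    return pairs.most_common(1)[0][0]


def merge_pair(tokens, pair):
    """Replace each left-to-right occurrence of `pair` with one merged token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe(text, num_merges):
    """Start from characters and apply up to `num_merges` greedy merges.

    Returns the final token sequence and the ordered list of learned merges.
    Hypothetical helper for illustration only.
    """
    tokens = char_tokenize(text)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
        merges.append(pair)
    return tokens, merges


if __name__ == "__main__":
    sample = "how to how to how to"
    tokens, merges = learn_bpe(sample, num_merges=5)
    print(len(char_tokenize(sample)), "character tokens")
    print(len(tokens), "tokens after merging:", tokens)
```

Each merge turns a frequent character sequence into a single token, so the same text is covered by fewer tokens and the vocabulary reflects what actually occurs in the corpus.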