This repository requires dask and transformers. All scripts needed to run the code for this paper are in predictable_memorization/scripts. Figures were generated by the notebooks in predictable_memorization/notebooks, primarily analysis.ipynb.
The scripts are run in the following order:
indices.py -> postprocess_repeats.py -> complexity.py -> combined_data.py
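
The script order above could be driven by a small wrapper like the one below. This is only a sketch: it assumes each script can be invoked without command-line arguments (check each script for required options before relying on it), and the helper names build_commands and run_pipeline are hypothetical, not part of this repository.

```python
import subprocess
import sys
from pathlib import Path

# Script order as listed in this README.
PIPELINE = [
    "indices.py",
    "postprocess_repeats.py",
    "complexity.py",
    "combined_data.py",
]


def build_commands(scripts_dir="predictable_memorization/scripts"):
    """Return one command per pipeline stage, in README order.

    Assumption: the scripts take no arguments; add any required
    options to the command lists if they do.
    """
    base = Path(scripts_dir)
    return [[sys.executable, str(base / name)] for name in PIPELINE]


def run_pipeline(scripts_dir="predictable_memorization/scripts"):
    """Run each stage in turn and stop at the first failure."""
    for cmd in build_commands(scripts_dir):
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later stages never run on incomplete output.
        subprocess.run(cmd, check=True)
```

Calling run_pipeline() from the repository root would execute the four stages sequentially with the current Python interpreter.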