https://github.com/google/maxtext/blob/26a5292bd9f55bd3ff9b2010cd7a86491e9c7b15/end_to_end/tpu/llama2/70b/2_test_llama2_70b.sh#L63
https://github.com/google/maxtext/blob/main/MaxText/scratch_code/golden_llama2-70b_export.py

That's what MaxText does to verify that its implementation matches the official Llama 2 70B: it exports "golden" logits from the reference checkpoint and then checks its own forward pass against them in the end-to-end test. It would be good to do the same for AXLearn, to ensure its models are faithful implementations too.
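Roughly, a minimal sketch of that golden-logit workflow could look like the following. It is modeled on the MaxText approach but is not its actual code; `candidate_logits_fn` is a hypothetical hook for running the same prompts through an AXLearn model, and the prompt list, file name, and tolerance are illustrative.

```python
# Sketch of golden-logit verification against a reference checkpoint.
# Assumes the HuggingFace `transformers` port of Llama 2 70B as the reference.
from typing import Callable

import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

PROMPTS = ["I love to", "Today is a", "What is the"]  # fixed test prompts
REFERENCE = "meta-llama/Llama-2-70b-hf"  # official reference checkpoint


def export_golden_logits(path: str = "golden_logits.npz") -> None:
    """Run the fixed prompts through the reference model and save its logits."""
    tokenizer = AutoTokenizer.from_pretrained(REFERENCE)
    model = AutoModelForCausalLM.from_pretrained(REFERENCE, torch_dtype=torch.float32)
    model.eval()
    records = {}
    for prompt in PROMPTS:
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits  # shape [1, seq_len, vocab_size]
        records[prompt] = logits.squeeze(0).numpy()
    np.savez(path, **records)


def check_against_golden(
    candidate_logits_fn: Callable[[str], np.ndarray],
    path: str = "golden_logits.npz",
    atol: float = 1e-2,
) -> None:
    """Compare a candidate implementation's logits against the golden file.

    `candidate_logits_fn(prompt) -> np.ndarray` is a hypothetical hook that
    tokenizes the prompt identically and runs the AXLearn model's forward pass.
    """
    golden = np.load(path)
    for prompt in PROMPTS:
        np.testing.assert_allclose(
            candidate_logits_fn(prompt),
            golden[prompt],
            atol=atol,
            err_msg=f"logit mismatch on prompt {prompt!r}",
        )
```

The key design point is that export and comparison are decoupled: the golden file is generated once from the reference weights and committed or cached, so the per-implementation check only needs the candidate model plus a NumPy comparison.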