ELFNet stores large datasets as compressed archives tracked with Git LFS. Regular Git history contains code, small examples, documentation, checksums, and LFS pointers.

| Dataset | Contents | Path |
|---|---|---|
| `pressure-triplets-326k-v1` | 326,009 pressure SAD/ELF/symmetry triplets (978,027 `.npy` files) | `release/pressure-triplets-326k-v1/` |
| `dft-reference-elfs-75k-v1` | 75,000 DFT reference ELFCAR files | `release/dft-reference-elfs-75k-v1/` |


Install Git LFS before cloning, or run `git lfs pull` after cloning.

    git lfs install
    git clone git@github.com:Austin243/ELFNet.git
    cd ELFNet
    git lfs pull --include="release/**"

To fetch only one dataset:

    git lfs pull --include="release/pressure-triplets-326k-v1/**"

`dft-reference-elfs-75k-v1` uses one split archive. Verify the part checksums, reassemble the archive, then extract it:


    cd release/dft-reference-elfs-75k-v1
    sha256sum -c SHA256SUMS
    cat dft-reference-elfs-75k-v1.tar.zst.part-* > dft-reference-elfs-75k-v1.tar.zst
    sha256sum -c SHA256SUMS.full
    tar --use-compress-program=unzstd -xf dft-reference-elfs-75k-v1.tar.zst

`pressure-triplets-326k-v1` is split into 32 shard archives. Shard 006 is itself split into three parts to keep every LFS object below 2 GiB.


    cd release/pressure-triplets-326k-v1
    sha256sum -c SHA256SUMS
    mkdir -p pressure_triplets
    for shard in $(seq -f "%03g" 0 31); do
        archive="pressure-triplets-326k-v1-shard${shard}.tar.zst"
        # Reassemble only shards that ship as .part-* files (such as 006);
        # redirecting into an unsplit shard would truncate it.
        set -- "$archive".part-*
        if [ -e "$1" ]; then
            cat "$@" > "$archive"
        fi
    done
    sha256sum -c SHA256SUMS.full
    for archive in pressure-triplets-326k-v1-shard*.tar.zst; do
        tar --use-compress-program=unzstd -xf "$archive" -C pressure_triplets
    done
    gzip -dk pressure-triplets-326k-v1-manifest.tsv.gz

The triplet archives extract to NumPy arrays that share a stem and are named:

    <stem>_sad.npy
    <stem>_elf.npy
    <stem>_sym.npy

Current ELFNet training consumes `_sad.npy` and `_elf.npy`; `_sym.npy` is kept
for provenance and compatibility with older SAD2ELF experiments.
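
The naming scheme above can be used to collect the SAD/ELF pairs that training consumes. The following is a minimal sketch, not the loader ELFNet itself uses: the function names `index_training_pairs` and `load_pair` are hypothetical, and it assumes that stems are unique across the extracted tree and that NumPy is installed when arrays are actually loaded.

```python
from pathlib import Path


def index_training_pairs(root):
    """Map each stem to its (<stem>_sad.npy, <stem>_elf.npy) paths under root.

    Stems without an _elf.npy partner are skipped; _sym.npy files are ignored.
    Assumes stems are unique across subdirectories (later ones overwrite).
    """
    root = Path(root)
    pairs = {}
    for sad_path in sorted(root.rglob("*_sad.npy")):
        # Strip the "_sad.npy" suffix to recover the shared stem.
        stem = sad_path.name[: -len("_sad.npy")]
        elf_path = sad_path.with_name(f"{stem}_elf.npy")
        if elf_path.is_file():
            pairs[stem] = (sad_path, elf_path)
    return pairs


def load_pair(sad_path, elf_path):
    """Load one SAD/ELF pair as NumPy arrays."""
    import numpy as np  # imported here so indexing alone needs only the stdlib

    return np.load(sad_path), np.load(elf_path)
```

With the archives extracted as shown earlier, `pairs = index_training_pairs("pressure_triplets")` builds the index, and `load_pair(*pairs[stem])` loads the arrays for one stem.
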

The 75k DFT reference archive extracts the selected reference ELFCAR files to
paths that mirror the source Yuanhui structure tree. Use its manifest to map each
archive path back to the selected-set rank, prototype, substituted elements, and
ELF maxima values.
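
A manifest lookup of this kind can be sketched as follows. This is an illustration under stated assumptions, not the project's own tooling: the function name `load_manifest` is hypothetical, the manifest is assumed to be tab-separated with a header row (as the `.tsv.gz` triplet manifest suggests), and the name of the column holding archive paths is not documented here, so the caller supplies it.

```python
import csv
import gzip
from pathlib import Path


def load_manifest(manifest_path, key_column):
    """Index a tab-separated manifest by the values in key_column.

    Assumes a header row; reads .gz files transparently.
    """
    manifest_path = Path(manifest_path)
    opener = gzip.open if manifest_path.suffix == ".gz" else open
    with opener(manifest_path, "rt", newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if key_column not in (reader.fieldnames or []):
            raise KeyError(f"{key_column!r} is not a column in {manifest_path}")
        # Each row becomes a dict of column name -> string value.
        return {row[key_column]: row for row in reader}
```

Check the manifest's header row for the actual column names before choosing `key_column`; all values are returned as strings, so numeric fields need explicit conversion.
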