@afang-story
Contributor

The DCLM-pool S3 paths have been updated. This commit reflects those changes in the README and in data/competition_pools/preextracted.

Old path: s3://commoncrawl/contrib/datacomp/DCLM-pool/crawl=*
New path: s3://commoncrawl/contrib/datacomp/DCLM-pool/jsonl/crawl=*
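
For anyone with scripts or configs that still point at the old layout, here is a minimal sketch of the path rewrite described above. The helper name `migrate_dclm_pool_path` and the `crawl=EXAMPLE` segment are hypothetical, used only for illustration. The sketch assumes that only `crawl=*` entries moved under `jsonl/`, which is all this PR states.

```python
# Prefix shared by the old and new layouts (taken from the PR description).
POOL_PREFIX = "s3://commoncrawl/contrib/datacomp/DCLM-pool/"


def migrate_dclm_pool_path(path: str) -> str:
    """Rewrite an old-style DCLM-pool path to the new jsonl/ layout.

    Hypothetical helper: paths that are not old-style crawl=* paths
    (including already-migrated ones) are returned unchanged.
    """
    if not path.startswith(POOL_PREFIX):
        return path
    remainder = path[len(POOL_PREFIX):]
    # Only crawl=* entries are known to have moved; leave anything else alone.
    if not remainder.startswith("crawl="):
        return path
    return POOL_PREFIX + "jsonl/" + remainder


if __name__ == "__main__":
    # "crawl=EXAMPLE" is a placeholder, not a real crawl name.
    print(migrate_dclm_pool_path(POOL_PREFIX + "crawl=EXAMPLE/shard.jsonl"))
```

The function is idempotent, so running it over a mix of old and already-updated paths should be safe.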
