Creating a Course
TODO Automate this as much as possible.
Let's say we're going to create a new course on Kaggle Learn called Data Science.
- On the command line, navigate to the `learntools/notebooks/` directory.
- Create a new branch from `master` with a name like `ds-course`. Be sure to check that there isn't already a branch with that name.
- Decide on a "track name" like `data_science`. This will be the name of the directory where your course files will live. Check that there isn't already a directory with that name.
- There should be a Bash script called `new_track.sh`. Run `./new_track.sh data_science`.
- Stage the new files: `git add data_science`.
- Commit the changes: `git commit -m "Create track ds-course."`
- Create a pull request on GitHub named `[Data Science] New course`.
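
The two "check that there isn't already one" steps above are easy to skip when done by hand. Here is a minimal sketch of how they could be scripted, assuming it is run from inside a clone of the repository. The function names `branch_is_free` and `track_is_free` are hypothetical helpers, not part of learntools, and only local branches are checked.

```shell
#!/bin/bash
# Hypothetical pre-flight helpers; not part of learntools.

# Succeeds if no local branch with this name exists in the current repository.
# Branches that exist only on the remote are not checked here.
branch_is_free() {
    ! git show-ref --verify --quiet "refs/heads/$1"
}

# Succeeds if nothing named after the track exists in the notebooks directory.
track_is_free() {
    local notebooks_dir="$1" track="$2"
    [ ! -e "$notebooks_dir/$track" ]
}
```

From `learntools/notebooks/`, something like `branch_is_free ds-course && track_is_free . data_science && echo "names are available"` would then confirm both names before you create the branch and run `new_track.sh`.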
- Navigate to the learntools root directory `learntools/` (the directory containing `setup.py`).
- Install an editable version of learntools: `pip install --editable .` (note the period).
- Navigate to `learntools/learntools`.
- Create a directory for your course: `mkdir data_science`.
- Create an initialization file: `touch data_science/__init__.py`.
- Commit the changes.
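
The directory and `__init__.py` steps above can be collected into one function. This is only a sketch: `create_course_package` is a hypothetical name, and it assumes the first argument is the repository root (the directory containing `setup.py`). It uses `mkdir -p`, so unlike a plain `mkdir` it does not fail if the directory already exists.

```shell
#!/bin/bash
# Hypothetical helper; not part of learntools.
# Creates learntools/<track>/__init__.py under the given repository root
# and prints the path of the package directory.
create_course_package() {
    local repo_root="$1" track="$2"
    local package_dir="$repo_root/learntools/$track"
    mkdir -p "$package_dir"
    # An empty __init__.py lets Python import the directory as a package.
    touch "$package_dir/__init__.py"
    echo "$package_dir"
}
```

Run from the repository root, `create_course_package . data_science` would leave you ready to stage and commit the new package.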
Create a folder to contain local copies of the course data: `mkdir learntools/notebooks/input`. This folder is only for your own use while developing and won't be committed to the repository (it's in `notebooks/.gitignore`).
Create a folder for a course dataset: `mkdir input/ds-course-data`. Put all of the data you plan to use in here. If you develop your notebooks in the raw folder (`notebooks/data_science/raw/`), then you can access your datasets just as you would on Kaggle, with a path like `'../input/ds-course-data/data.csv'`.
Add the track name `'data_science'` to `TRACKS` and `TESTABLE_NOTEBOOK_TRACKS` in `learntools/notebooks/test.sh`.
Create a new file `setup_data.sh` in `learntools/notebooks/data_science/`:
#!/bin/bash
# Download the datasets used in the course notebooks to the correct relative paths (../input/...)
mkdir -p input
DATASETS="ryanholbrook/ds-course-data ryanholbrook/some-other-data"
for slug in $DATASETS
do
    # The dataset name is the part of the slug after the '/'
    name=$(echo "$slug" | cut -d '/' -f 2)
    dest="input/$name"
    mkdir -p "$dest"
    kaggle datasets download -p "$dest" --unzip "$slug"
done

You'll need to keep the list of datasets in `DATASETS` up to date with those you use in your course (that is, those defined in `track_meta.py`).
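
Before downloading anything, it can help to preview where each dataset will be placed. The sketch below mirrors the slug-to-folder mapping used in the script. `dataset_dest` is a hypothetical helper name, and it uses Bash parameter expansion in place of `cut`, which gives the same result for slugs of the form `owner/name`.

```shell
#!/bin/bash
# Hypothetical helper: print the destination folder for a Kaggle dataset slug.
dataset_dest() {
    local slug="$1"
    # ${slug##*/} strips everything up to and including the last '/'.
    echo "input/${slug##*/}"
}

# Same example list as in setup_data.sh; nothing is downloaded here.
DATASETS="ryanholbrook/ds-course-data ryanholbrook/some-other-data"
for slug in $DATASETS
do
    echo "$slug -> $(dataset_dest "$slug")"
done
```

Each printed line pairs a slug with the folder it would be unzipped into, which makes a mismatch with the paths used in your notebooks easier to spot.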