
Commit 3b53fe1

Add python path in scripts (#2625)

* add python path in scripts
* Make python path work
* Shorter path
* Add docs
* Change dist scripts
* Change slurm example

Parent commit: 99c879d

8 files changed (+18, -24 lines)

README.md (+1, -1)

````diff
@@ -40,7 +40,7 @@ This project is released under the [Apache 2.0 license](LICENSE).
 
 ## Changelog
 
-v2.0.0 was released in 5/5/2020.
+v2.0.0 was released in 6/5/2020.
 Please refer to [changelog.md](docs/changelog.md) for details and release history.
 
 ## Benchmark and model zoo
````

docs/changelog.md (+1, -1)

````diff
@@ -1,6 +1,6 @@
 ## Changelog
 
-### v2.0.0 (4/5/2020)
+### v2.0.0 (6/5/2020)
 
 In this release, we made lots of major refactoring and modifications.
 
 1. **Faster speed**. We optimize the training and inference speed for common models, achieving up to 30% speedup for training and 25% for inference. Please refer to [model zoo](model_zoo.md#comparison-with-detectron2) for details.
````

docs/getting_started.md (+4, -4)

````diff
@@ -288,13 +288,13 @@ Difference between `resume-from` and `load-from`:
 If you run MMDetection on a cluster managed with [slurm](https://slurm.schedmd.com/), you can use the script `slurm_train.sh`. (This script also supports single machine training.)
 
 ```shell
-./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} [${GPUS}]
+[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
 ```
 
 Here is an example of using 16 GPUs to train Mask R-CNN on the dev partition.
 
 ```shell
-./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x 16
+GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x
 ```
 
 You can check [slurm_train.sh](https://github.com/open-mmlab/mmdetection/blob/master/tools/slurm_train.sh) for full arguments and environment variables.
@@ -330,8 +330,8 @@ dist_params = dict(backend='nccl', port=29501)
 Then you can launch two jobs with `config1.py` and `config2.py`.
 
 ```shell
-CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} 4
-CUDA_VISIBLE_DEVICES=4,5,6,7 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} 4
+CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
+CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
 ```
 
 ## Useful tools
````
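Since `GPUS` is now read from the environment instead of a positional argument, the other resource knobs defined in `slurm_train.sh` (`GPUS_PER_NODE`, `CPUS_PER_TASK`, `SRUN_ARGS`) can be set the same way. A minimal sketch of a single-node run under the new interface; the partition, job name, and paths are placeholders:

```shell
# Single-node run: 4 GPUs total, 4 per node, 2 CPUs per task.
# "dev", the job name, and both paths below are placeholders.
GPUS=4 GPUS_PER_NODE=4 CPUS_PER_TASK=2 ./tools/slurm_train.sh dev mask_r50_1x \
    configs/mask_rcnn_r50_fpn_1x_coco.py ./work_dirs/mask_rcnn_r50_fpn_1x
```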

docs/install.md (+3, -9)

````diff
@@ -128,16 +128,10 @@ pip install -v -e .
 
 ### Using multiple MMDetection versions
 
-If there are more than one mmdetection on your machine, and you want to use them alternatively, the recommended way is to create multiple conda environments and use different environments for different versions.
+The train and test scripts already modify the `PYTHONPATH` to ensure the scripts use the MMDetection in the current directory.
 
-Another way is to insert the following code to the main scripts (`train.py`, `test.py` or any other scripts you run)
-```python
-import os.path as osp
-import sys
-sys.path.insert(0, osp.join(osp.dirname(osp.abspath(__file__)), '../'))
-```
+To use the default MMDetection installed in the environment rather than the one you are working with, you can remove the following line from those scripts:
 
-Or run the following command in the terminal of corresponding folder to temporally use the current one.
 ```shell
-export PYTHONPATH=`pwd`:$PYTHONPATH
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH
 ```
````
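To confirm which copy of MMDetection a script will pick up, one can reproduce the same `PYTHONPATH` prefix by hand and print the resolved package location. A quick check, assuming the package import name `mmdet` and that it is run from the repository root:

```shell
# Prepend the working copy the same way the scripts do, then print which
# mmdet actually gets imported.
PYTHONPATH="$(pwd)":$PYTHONPATH \
python -c "import mmdet; print(mmdet.__file__)"
```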

tools/dist_test.sh (+2, -3)

````diff
@@ -1,11 +1,10 @@
 #!/usr/bin/env bash
 
-PYTHON=${PYTHON:-"python"}
-
 CONFIG=$1
 CHECKPOINT=$2
 GPUS=$3
 PORT=${PORT:-29500}
 
-$PYTHON -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
+python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
     $(dirname "$0")/test.py $CONFIG $CHECKPOINT --launcher pytorch ${@:4}
````
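With `${@:4}`, every argument after the GPU count is forwarded to `test.py`. A hypothetical invocation; the config and checkpoint paths are placeholders, and `--eval bbox` is assumed to be a `test.py` option:

```shell
# 8-GPU evaluation; arguments after "8" (here --eval bbox) reach test.py
# via ${@:4}. Both paths below are placeholders.
./tools/dist_test.sh configs/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x.pth 8 --eval bbox
```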

tools/dist_train.sh (+2, -3)

````diff
@@ -1,10 +1,9 @@
 #!/usr/bin/env bash
 
-PYTHON=${PYTHON:-"python"}
-
 CONFIG=$1
 GPUS=$2
 PORT=${PORT:-29500}
 
-$PYTHON -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
+python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
     $(dirname "$0")/train.py $CONFIG --launcher pytorch ${@:3}
````
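`PORT` still defaults to 29500, so two single-machine jobs on the same host need distinct master ports, mirroring the port-conflict note in getting_started.md. A sketch, with `config1.py` and `config2.py` as placeholder configs:

```shell
# Pin each job to its own GPUs and its own master port so the two
# torch.distributed launches do not collide.
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh config1.py 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh config2.py 4
```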

tools/slurm_test.sh (+2, -1)

````diff
@@ -1,7 +1,7 @@
 #!/usr/bin/env bash
 
 set -x
-export PYTHONPATH=`pwd`:$PYTHONPATH
+
 PARTITION=$1
 JOB_NAME=$2
 CONFIG=$3
@@ -12,6 +12,7 @@ CPUS_PER_TASK=${CPUS_PER_TASK:-5}
 PY_ARGS=${@:5}
 SRUN_ARGS=${SRUN_ARGS:-""}
 
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
 srun -p ${PARTITION} \
     --job-name=${JOB_NAME} \
     --gres=gpu:${GPUS_PER_NODE} \
````
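`PY_ARGS=${@:5}` forwards everything after the fourth positional argument to the test script, and `SRUN_ARGS` injects extra flags into `srun` itself. A sketch, assuming (by analogy with `dist_test.sh`) that the fourth argument is the checkpoint path; the partition, job name, and paths are placeholders:

```shell
# SRUN_ARGS goes to srun; --eval bbox falls into ${@:5} and goes to test.py.
SRUN_ARGS="--exclusive" ./tools/slurm_test.sh dev test_mask \
    configs/mask_rcnn_r50_fpn_1x_coco.py checkpoints/mask_rcnn_r50_fpn_1x.pth \
    --eval bbox
```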

tools/slurm_train.sh (+3, -2)

````diff
@@ -6,12 +6,13 @@ PARTITION=$1
 JOB_NAME=$2
 CONFIG=$3
 WORK_DIR=$4
-GPUS=${5:-8}
+GPUS=${GPUS:-8}
 GPUS_PER_NODE=${GPUS_PER_NODE:-8}
 CPUS_PER_TASK=${CPUS_PER_TASK:-5}
 SRUN_ARGS=${SRUN_ARGS:-""}
-PY_ARGS=${PY_ARGS:-"--validate"}
+PY_ARGS=${@:5}
 
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
 srun -p ${PARTITION} \
     --job-name=${JOB_NAME} \
     --gres=gpu:${GPUS_PER_NODE} \
````
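Because `PY_ARGS` no longer defaults to `--validate`, validation must now be requested explicitly; everything after `WORK_DIR` lands in `${@:5}`. A sketch reusing the 16-GPU example from the docs:

```shell
# GPUS comes from the environment now, and --validate (formerly the PY_ARGS
# default) is passed through ${@:5} to train.py.
GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x \
    configs/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x --validate
```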
