
Commit 47a2d51

update readme
1 parent f1239cf commit 47a2d51

File tree: 1 file changed (+17, −5 lines)


efficientdet/tf2/README.md

Lines changed: 17 additions & 5 deletions
@@ -264,7 +264,19 @@ If you want to continue to train the model, simply re-run the above command beca
 
 Just add ```--strategy=gpus```
 
-## 10. Training EfficientDets on TPUs.
+## 10. Train on multi-node GPUs.
+The following commands start a training job across two nodes.
+
+Start the chief training node:
+```
+python -m tf2.train --strategy=multi-gpus --worker=server_address1:12345,server_address2:23456 --worker_index=0 --mode=train --train_file_pattern=tfrecord/pascal*.tfrecord --model_name=efficientdet-d0 --model_dir=/tmp/efficientdet-d0 --batch_size=64 --num_examples_per_epoch=5717 --num_epochs=50 --hparams=voc_config.yaml
+```
+Start the other training node:
+```
+python -m tf2.train --strategy=multi-gpus --worker=server_address1:12345,server_address2:23456 --worker_index=1 --mode=train --train_file_pattern=tfrecord/pascal*.tfrecord --model_name=efficientdet-d0 --model_dir=/tmp/efficientdet-d0_1 --batch_size=64 --num_examples_per_epoch=5717 --num_epochs=50 --hparams=voc_config.yaml
+```
+
+## 11. Training EfficientDets on TPUs.
 
 To train this model on Cloud TPU, you will need:
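Note that the two node commands differ only in `--worker_index` and `--model_dir`. As a sketch of that pattern, abridged to the flags that vary (the `build_cmd` helper below is hypothetical, not part of the repo):

```python
# Sketch: generate the per-node tf2.train command line from one shared
# worker list. build_cmd is a hypothetical helper for illustration only;
# training flags such as --train_file_pattern are omitted for brevity.
WORKERS = "server_address1:12345,server_address2:23456"

def build_cmd(worker_index):
    # The chief (index 0) writes checkpoints to /tmp/efficientdet-d0;
    # other nodes get a suffixed dir so they do not clobber the chief's.
    model_dir = "/tmp/efficientdet-d0"
    if worker_index != 0:
        model_dir += "_%d" % worker_index
    return [
        "python", "-m", "tf2.train",
        "--strategy=multi-gpus",
        "--worker=" + WORKERS,
        "--worker_index=%d" % worker_index,
        "--mode=train",
        "--model_name=efficientdet-d0",
        "--model_dir=" + model_dir,
    ]

# One command per entry in the worker list.
for i, _ in enumerate(WORKERS.split(",")):
    print(" ".join(build_cmd(i)))
```

Each node must see the same `--worker` list; only its own index (and a distinct model dir) changes.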
@@ -286,7 +298,7 @@ For more instructions about training on TPUs, please refer to the following tuto
 
 * EfficientNet tutorial: https://cloud.google.com/tpu/docs/tutorials/efficientnet
 
-## 11. Reducing Memory Usage when Training EfficientDets on GPU.
+## 12. Reducing Memory Usage when Training EfficientDets on GPU.
 
 EfficientDets use a lot of GPU memory for a few reasons:
 
@@ -306,7 +318,7 @@ If set to True, keras model uses ```tf.recompute_grad``` to achieve gradient che
 Testing shows that:
 * It allows to train a d7x network with batch size of 2 on a 11Gb (1080Ti) GPU
 
-## 12. Visualize TF-Records.
+## 13. Visualize TF-Records.
 
 You can visualize tf-records with following commands:
 
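For intuition about the gradient checkpointing mentioned in the hunk above, here is a toy pure-Python sketch (not the repo's `tf.recompute_grad` path): checkpointing keeps activations only at segment boundaries during the forward pass and recomputes the rest in the backward pass, trading extra compute for memory.

```python
# Toy illustration of gradient checkpointing's memory trade-off.
# Not TensorFlow: just counts how many activations must stay resident.

def plain_forward(x, layers):
    # Plain backprop keeps every intermediate activation alive.
    acts = [x]
    for f in layers:
        acts.append(f(acts[-1]))
    return acts

def checkpointed_forward(x, layers, segment):
    # Keep only one activation per segment boundary; everything else is
    # recomputed from the nearest saved checkpoint during backprop.
    saved = [x]
    for i, f in enumerate(layers):
        x = f(x)
        if (i + 1) % segment == 0:
            saved.append(x)
    return saved

layers = [lambda v: v + 1] * 8
print(len(plain_forward(0, layers)))            # 9 activations resident
print(len(checkpointed_forward(0, layers, 4)))  # 3 checkpoints resident
```

With 8 layers and segments of 4, resident memory drops from 9 activations to 3 checkpoints, at the cost of recomputing each segment once in the backward pass.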
@@ -331,7 +343,7 @@ python dataset/inspect_tfrecords.py --file_pattern dataset/sample.record\
 * save_samples_dir: save dir.
 * eval: flag for eval data.
 
-## 13. Export to ONNX
+## 14. Export to ONNX
 (1) Install tf2onnx
 ```
 pip install tf2onnx
@@ -352,7 +364,7 @@ nms_configs:
 python -m tf2onnx.convert --saved-model=<saved model directory> --output=<onnx filename> --opset=11
 ```
 
-## 14. Debug
+## 15. Debug
 Just add ```--debug``` after command, then you could use pdb debug the model with eager execution and deterministic operations.
 
 NOTE: this is not an official Google product.
