This repository was archived by the owner on Aug 5, 2022. It is now read-only.
  
  
How to create ImageNet LMDB
Feng Zou edited this page Nov 17, 2017 · 10 revisions
LMDB (Lightning Memory-Mapped Database) is a key-value store supported by Intel distribution of Caffe*. One of its main advantages is high read throughput. Training and validation datasets can be converted into LMDB format.
##General scenario and parameters description
Intel distribution of Caffe provides a script that helps users create an LMDB.
General steps:
- After signing up, download the ILSVRC2012 training and validation images from http://image-net.org. Store the training images and the validation images in separate directories.
- Execute the script to download the auxiliary data:
$ ./data/ilsvrc12/get_ilsvrc_aux.sh
- If necessary, pre-process the training/validation data (e.g., resize the image height/width).
- Create the LMDB with the script below:
$ ./examples/imagenet/create_imagenet.sh
Before running it, verify the following parameters of the script:
- TRAIN_DATA_ROOT and VAL_DATA_ROOT – variables that point to the paths of the training and validation data
- resize_height – the height to which each image will be resized
- resize_width – the width to which each image will be resized
- shuffle – if set, the entries are written to the LMDB in random order
- encoded – if true, the images are stored in the LMDB in their encoded (compressed) form
- $DATA/train.txt or $DATA/val.txt – text file that lists the training or validation images together with their class labels
- $EXAMPLE/ilsvrc12_train_lmdb or $EXAMPLE/ilsvrc12_val_lmdb – the path where the LMDB will be saved
- Use the created LMDB in the Intel distribution of Caffe.
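The parameters listed above end up as arguments to Caffe's convert_imageset tool. The following is a minimal dry-run sketch of that mapping, not the actual contents of create_imagenet.sh: it only prints the command it would run. The helper name build_convert_cmd is hypothetical, the default locations for TOOLS, DATA and EXAMPLE and the 256x256 resize values are assumptions for illustration, and the flag names are those of the upstream convert_imageset tool, so check them against your own build.

```shell
#!/bin/sh
# Dry-run sketch: prints a convert_imageset command instead of executing it.
# build_convert_cmd is a hypothetical helper, not part of Caffe.

TOOLS=${TOOLS:-build/tools}            # assumed location of the Caffe binaries
DATA=${DATA:-data/ilsvrc12}            # assumed location of train.txt / val.txt
EXAMPLE=${EXAMPLE:-examples/imagenet}  # assumed output location for the LMDBs

# Usage: build_convert_cmd <split> <image_root> <height> <width> <encoded>
# The body runs in a subshell, so its variables do not leak to the caller.
build_convert_cmd() (
    split=$1
    root=$2
    height=$3
    width=$4
    encoded=$5

    cmd="$TOOLS/convert_imageset"
    cmd="$cmd --resize_height=$height --resize_width=$width"
    cmd="$cmd --shuffle"
    if [ "$encoded" = "true" ]; then
        cmd="$cmd --encoded"
    fi
    # Positional arguments: image root, listing file, output LMDB directory.
    cmd="$cmd $root $DATA/$split.txt $EXAMPLE/ilsvrc12_${split}_lmdb"
    printf '%s\n' "$cmd"
)

# Illustrative values only; adjust to your own data layout.
build_convert_cmd train /data/imagenet/train/ 256 256 true
build_convert_cmd val /data/imagenet/val/ 256 256 true
```

Printing the command first makes it easy to compare against what the script in your tree actually runs before spending time on a full conversion.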
##Example execution:
For the purposes of this guide, the example below imports training and validation data from ImageNet.
- Download ImageNet training and validation data.
- Navigate to the imagenet directory, e.g.,
cd path/to/caffe/examples/imagenet
- Edit the create_imagenet.sh script, which should contain the following:
TRAIN_DATA_ROOT=/data/imagenet/train/
VAL_DATA_ROOT=/data/imagenet/val/
RESIZE=true
...
ENCODE=true
...
- Run the script from the Caffe root directory (it uses paths relative to the root), e.g.: ./examples/imagenet/create_imagenet.sh
Results of the script run above:
Creating training lmdb...
...
Creating val lmdb...
I1124 10:58:44.212462 193703 convert_imageset.cpp:123] Shuffling data
I1124 10:58:44.219236 193703 convert_imageset.cpp:126] A total of 50000 images.
I1124 10:58:44.219633 193703 db_lmdb.cpp:72] Opened lmdb examples/imagenet/ilsvrc12_val_lmdb
I1124 10:58:51.641278 193703 convert_imageset.cpp:184] Processed 1000 files.
I1124 10:58:58.952800 193703 convert_imageset.cpp:184] Processed 2000 files.
I1124 10:59:05.942912 193703 convert_imageset.cpp:184] Processed 3000 files.
...
Done.
- The script should create the ilsvrc12_train_lmdb and ilsvrc12_val_lmdb directories in the path set by the EXAMPLE variable.
- Update the .prototxt file of the model used in the Intel distribution of Caffe so that its data layer points to the new LMDB, e.g.,
data_param {
   source: "examples/imagenet/ilsvrc12_train_lmdb"
   batch_size: 256
   backend: LMDB
 }
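Before pointing a model at the new databases, it can help to confirm that the output directories look like LMDB environments. The sketch below is one way to do that, assuming the default EXAMPLE location; check_lmdb_dir is a hypothetical helper, not part of Caffe. It relies only on the fact that an LMDB environment is a directory containing a data.mdb file, and it does not validate the records themselves.

```shell
#!/bin/sh
# check_lmdb_dir is a hypothetical helper, not part of Caffe.
# It checks that a path is a directory holding a non-empty data.mdb file.
check_lmdb_dir() {
    if [ ! -d "$1" ]; then
        echo "missing: $1 is not a directory"
        return 1
    fi
    if [ ! -s "$1/data.mdb" ]; then
        echo "empty: $1 has no non-empty data.mdb"
        return 1
    fi
    echo "ok: $1"
}

EXAMPLE=${EXAMPLE:-examples/imagenet}  # assumed output location

for split in train val; do
    # "|| true" keeps the loop going so both directories are reported.
    check_lmdb_dir "$EXAMPLE/ilsvrc12_${split}_lmdb" || true
done
```

Run it from the same directory the conversion script was run from, since the default path is relative.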
*Other names and brands may be claimed as the property of others