Thursday, February 18, 2016

Training Faster R-CNN on Custom Dataset

(This post is being updated as we train a custom object detection model based on Faster R-CNN.)

The following steps illustrate how to train Faster R-CNN on another dataset, using SGS as the example dataset.

Format Your Dataset

First, the dataset must be organized in the required format.
SGS
|-- data
    |-- Annotations
         |-- *.txt (Annotation files)
    |-- Images
         |-- *.png (Image files)
    |-- ImageSets
         |-- train.txt
         |-- test.txt
Annotations folder: every .txt file must follow the same format. For example, cropped_000001.txt contains

1 200 200 314 300
where 1 is the class index in self._classes of sgs.py, followed by four numbers giving the bounding box of the target object as left, top, right, bottom (x1 y1 x2 y2). These lines are parsed by _load_sgs_annotation in sgs.py.
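For reference, a minimal sketch of how such an annotation file might be parsed is shown below. The field order (class index, then x1 y1 x2 y2) follows the format described above; the function name and NumPy dtypes are illustrative and not the actual code of _load_sgs_annotation.

import numpy as np

def parse_annotation(txt_path):
    """Parse one annotation file into (class indices, boxes)."""
    gt_classes, boxes = [], []
    with open(txt_path) as f:
        for line in f:
            fields = line.split()
            if len(fields) != 5:
                continue  # skip blank or malformed lines
            gt_classes.append(int(fields[0]))              # index into self._classes
            boxes.append([float(v) for v in fields[1:5]])  # x1, y1, x2, y2
    return (np.array(gt_classes, dtype=np.int32),
            np.array(boxes, dtype=np.float32))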

Images folder:
cropped_000011.png
Because fully convolutional networks like Faster R-CNN are trained with a batch size of 1, we can use much larger images (say 800x600).


ImageSets folder: train.txt lists the names (without extensions) of the image files used for training, and test.txt lists the names (without extensions) used for testing. The files themselves live in the Images folder.
For example, train.txt contains lines such as:
cropped_000011
cropped_000603
cropped_000606
cropped_000607
cropped_000608
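If you do not already have these split files, a rough sketch like the following can generate them from the Images folder (the 80/20 split ratio and paths are only illustrative):

import os
import random

image_dir = 'SGS/data/Images'                     # layout described above
names = sorted(os.path.splitext(f)[0]
               for f in os.listdir(image_dir) if f.endswith('.png'))
random.seed(0)
random.shuffle(names)

split = int(0.8 * len(names))                     # illustrative 80/20 train/test split
for filename, subset in [('train.txt', names[:split]), ('test.txt', names[split:])]:
    with open(os.path.join('SGS/data/ImageSets', filename), 'w') as f:
        f.write('\n'.join(subset) + '\n')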

Construct IMDB

Add a new Python file describing your dataset to $FRCNN_ROOT/lib/datasets (here, sgs.py). Then take the following steps.
  • Modify self._classes in the constructor function to fit your dataset.
  • Modify image_path_from_index in sgs.py to locate the correct path for input images; be careful with the extensions of your image files.
  • Write the function for parsing annotations, see _load_sgs_annotation in sgs.py.
  • Do not forget to add the necessary import statements in your own Python file and in the other Python files in the same directory.
Then modify factory.py in the same directory. For example, to add SGS, we add:
sgs_devkit_path = '/home/abc/Datasets/SGS'
for split in ['train', 'test']:
    name = '{}_{}'.format('sgs', split)
    __sets[name] = (lambda split=split: datasets.sgs(split, sgs_devkit_path))
An example sgs.py is available for reference: http://pastebin.com/ug02KH7X (see also the comments below).
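For orientation only, here is a stripped-down, hypothetical skeleton of such a dataset class. It follows the imdb interface of py-faster-rcnn, but the class name, paths, and bodies are illustrative; gt_roidb and _load_sgs_annotation still have to be implemented, e.g. along the lines of the annotation-parsing sketch above.

import os
from datasets.imdb import imdb   # older versions use `import datasets` and subclass datasets.imdb

class sgs(imdb):
    def __init__(self, image_set, devkit_path):
        imdb.__init__(self, 'sgs_' + image_set)
        self._image_set = image_set                          # 'train' or 'test'
        self._data_path = os.path.join(devkit_path, 'data')
        # background must be class 0; list your own classes after it
        self._classes = ('__background__', 'class_1', 'class_2')
        self._class_to_ind = dict(zip(self._classes, range(len(self._classes))))
        self._image_ext = '.png'                             # match your image files
        self._image_index = self._load_image_set_index()

    def _load_image_set_index(self):
        # ImageSets/<split>.txt: one image name per line, without extension
        path = os.path.join(self._data_path, 'ImageSets', self._image_set + '.txt')
        with open(path) as f:
            return [line.strip() for line in f if line.strip()]

    def image_path_from_index(self, index):
        # Images/<name>.png
        return os.path.join(self._data_path, 'Images', index + self._image_ext)

    # gt_roidb() and _load_sgs_annotation() (returning boxes, gt_classes,
    # gt_overlaps, flipped) must also be implemented; see the pastebin sgs.py
    # for a complete version.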

Modify Prototxt and Rename Layers

For example, if you want to use the VGG16 model, modify train.prototxt in $FRCNN_ROOT/models/VGG16/faster_rcnn_end2end; the changes mainly concern the number of classes you want to train.

To train on your own dataset, you should rename the last layers so that their weights are reinitialized rather than loaded from the pretrained model: rename cls_score and bbox_pred in your train.prototxt. For instance, I renamed them to cls_score_sgs and bbox_pred_sgs.


Let's assume that the number of classes is C (do not forget to count the `background` class). Then you should
  • Modify num_classes in the 'input-data' and 'roi-data' layers to C;
  • Modify num_output in the cls_score_sgs layer to C;
  • Modify num_output in the bbox_pred_sgs layer to 4 * C.
For example, with two object classes plus background, C = 3, so cls_score_sgs gets num_output 3 and bbox_pred_sgs gets num_output 12.

You may change parameters in solver.prototxt to fit your needs.

Delete $FRCNN_ROOT/data/cache/train_gt_roidb.pkl to force the cached roidb to be regenerated.


Train on Custom Dataset!

From the directory $FRCNN_ROOT, run the training script under $FRCNN_ROOT/experiments/scripts. Here this is sgs_train_faster_rcnn_end2end.sh, adapted from the stock faster_rcnn_end2end scripts (a reference copy is linked in the comments below). It is used the same way as the original Faster R-CNN scripts:
./experiments/scripts/sgs_train_faster_rcnn_end2end.sh 1 VGG16
Enjoy a drink and take a break while the training runs.


Error
floating point exception
Solve
https://github.com/rbgirshick/py-faster-rcnn/issues/65 
A workaround is removing images that have width or height > 500.
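A rough sketch of that workaround, assuming the SGS layout above and using Pillow to read image sizes (the 500-px threshold comes from the linked issue):

import os
from PIL import Image

image_dir = 'SGS/data/Images'
list_path = 'SGS/data/ImageSets/train.txt'

with open(list_path) as f:
    names = [line.strip() for line in f if line.strip()]

# keep only images whose width and height are both <= 500 px
kept = [n for n in names
        if max(Image.open(os.path.join(image_dir, n + '.png')).size) <= 500]

with open(list_path, 'w') as f:
    f.write('\n'.join(kept) + '\n')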

Error
https://github.com/rbgirshick/py-faster-rcnn/issues/85


Solve
https://github.com/rbgirshick/py-faster-rcnn/issues/27

Error

assert (boxes[:, 2] >= boxes[:, 0]).all() AssertionError

Solve
http://stackoverflow.com/questions/31005463/how-to-train-new-fast-rcnn-imageset

Error
ValueError: attempt to get argmax of an empty sequence
Solve
https://github.com/rbgirshick/fast-rcnn/pull/21 
https://github.com/rbgirshick/py-faster-rcnn/issues/27 
https://github.com/rbgirshick/fast-rcnn/pull/102  
https://github.com/rbgirshick/py-faster-rcnn/issues/126


ResNet Implementation for Faster R-CNN

https://github.com/rbgirshick/py-faster-rcnn/issues/62 
https://github.com/rbgirshick/py-faster-rcnn/issues/122


Train on ImageNet Dataset
https://github.com/andrewliao11/py-faster-rcnn/blob/master/README.md

References
https://github.com/rbgirshick/py-faster-rcnn
https://github.com/andrewliao11/py-faster-rcnn/blob/master/README.md
https://github.com/coldmanck/fast-rcnn/blob/master/README.md
https://github.com/rbgirshick/py-faster-rcnn/issues/62

41 comments:

  1. Nice blog.. I am also trying to train faster-r-cnn on custom dataset. Do you mind sharing your github repo? Thanks.

  2. Hello, thanks for this post but it's quite hard to understand how to train a custom dataset if you mentioned a sgs.py which you didn't show.

  3. You may follow links in the reference to see detail steps. Posts by coldmanck and andrewliao11 are good start. I tested this model 2-3 months ago so I don't remember clearly right now.

  4. Hello, I am having some trouble getting this work. Could you upload your sgs.py file please?

    Replies
    1. This is my sgs.py, just for reference: http://pastebin.com/ug02KH7X


  5. Try this: https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md

  6. Could you please give step by step procedure to test a pretrained faster rcnn model on custom dataset.

    Replies
    1. - To test an image, you can modify py-faster-rcnn/tools/demo.py
      - To test a set of images, see the section after NET_FINAL=`grep -B 1 "done solving" ${LOG} | grep "Wrote snapshot" | awk '{print $4}'` in py-faster-rcnn/experiments/scripts/faster_rcnn_end2end.sh

  7. Hi sgsai:
    Great article! I am also training Faster R-CNN on pedestrian data and try to build a pedestrian detector.
    I encounter a problem that I have training image without any pedestrian in it.
    Since these images does not have any annotations for it. What kind of modifications do I need?

  8. Awesome tutorial SGSAI! Thanks.
    Just one thing, the bounding box format in Annotations folder should be 'left top right bottom' according to _load_'dataset'_annotation function, not 'top left bottom right'. Thanks again

    Replies
    1. Hi, what is the difference between 'top left' and 'left top'?

    2. 'left top' indicates the x coordinate first, then the y coordinate. This is the coordinate of the left top corner of the bounding box containing the target object.

  9. Hey is there any way you could post your sgs_train_faster_rcnn_end2end.sh ? I can get everything else properly modified but I am unsure how to modify the script.

    Thanks!

  10. Can you upload any annotation file?
    Does index start from 1 and keep on increasing per image?

    Replies
    1. This is from an annotation file: "2 158 424 382 687"
      If we have 2 classes, there will be 3 indexes: 0 for 'background', 1 for 'class 1' and 2 for 'class 2'

  11. Many thanks for the detailed instructions! It really worked and I was able to train on the INRIA dataset.
    Please advise me on how to test an image using the trained model. I am really stuck with it.
    Many thanks!

    Replies
    1. - To test an image, you can modify py-faster-rcnn/tools/demo.py
      - To test a set of images, see the section after NET_FINAL=`grep -B 1 "done solving" ${LOG} | grep "Wrote snapshot" | awk '{print $4}'` in py-faster-rcnn/experiments/scripts/faster_rcnn_end2end.sh

  12. Thanks for this post. But I have some questions. Even though I followed your post until 'Modify Prototxt and Rename Layers', I don't understand why the 'faster_rcnn_end2end_imagenet.sh' file suddenly comes up. I'm using the py-faster-rcnn made by 'rbgirshick' here (https://github.com/rbgirshick/py-faster-rcnn), but you seem to be using another version. Can you let me know which version you are using and how the 'sgs_train_faster_rcnn_end2end.sh' file is constructed? Thanks for your help in advance.

    Replies
    1. I guess things have changed. For reference this is the "sgs_train_faster_rcnn_end2end.sh" that I used for training: http://pastebin.com/K74Rvv9N

      About .yml file, you can look at faster_rcnn_end2end.yml to see its format.

      Hope it helps.

    2. Thanks for your feedback.

      I checked your link and I found that it is same with mine even though mine is not operating. :(

      I got the error about the roidb.py like below.

      ''''''
      Traceback (most recent call last):
      File "./tools/train_net.py", line 104, in
      imdb, roidb = combined_roidb(args.imdb_name)
      File "./tools/train_net.py", line 69, in combined_roidb
      roidbs = [get_roidb(s) for s in imdb_names.split('+')]
      File "./tools/train_net.py", line 66, in get_roidb
      roidb = get_training_roidb(imdb)
      File "/home/irobot/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 122, in get_training_roidb
      rdl_roidb.prepare_roidb(imdb)
      File "/home/irobot/py-faster-rcnn/tools/../lib/roi_data_layer/roidb.py", line 27, in prepare_roidb
      roidb[i]['image'] = imdb.image_path_at(i)
      IndexError: list index out of range
      ''''''

      Do you have any advices about this? I already tried to delete the cache folder, but it doesn't work. Thanks again for your help.

    3. Oh, I solved this problem. The flip function was the problem.
      Thank you for your good post!

    4. Now I got another problem. I got the floating point exception error, but it is not resolved by lowering the learning rate, making the RNG_SEED larger, or implementing the filter_roidb.

      So now I think that the problem is maybe from the dataset. My images of dataset have all 1280 x 1024 size. Is it too large to deal with using VGG?

  13. This comment has been removed by the author.

  14. Hi.. I am trying to train FRCNN on a custom dataset, but I'm getting the following error when I start training.
    Traceback (most recent call last):
    File "./tools/train_faster_rcnn_alt_opt.py", line 19, in
    from datasets.factory import get_imdb
    File "/root/deep-learning/FRCNN/py-faster-rcnn/tools/../lib/datasets/factory.py", line 15, in
    from datasets.fishclassify import fishclassify
    File "/root/deep-learning/FRCNN/py-faster-rcnn/tools/../lib/datasets/fishclassify.py", line 25, in
    class fishclassify(datasets.imdb):
    TypeError: Error when calling the metaclass bases
    module.__init__() takes at most 2 arguments (3 given)

    Any help would be helpful.

  15. How can I train FRCNN in CPU mode?

    Replies
    1. How about "caffe.set_mode_cpu();" ?

    2. i did that modification in train_net.py (caffe.set_mode_cpu()).

      I successfully executed demo.py. But i am not able to train my own dataset.
      ./tools/train_faster_rcnn_alt_opt.py --net_name fish_classify --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel
      Called with args:
      Namespace(cfg_file=None, gpu_id=0, imdb_name='voc_2007_trainval', net_name='fish_classify', pretrained_model='data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel', set_cfgs=None)
      Stage 1 RPN, init from ImageNet model
      Init model: data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel
      Using config:
      {'DATA_DIR': '/root/deep-learning/FRCNN/py-faster-rcnn/data',
      'DEDUP_BOXES': 0.0625,
      'EPS': 1e-14,
      'EXP_DIR': 'default',
      'GPU_ID': 0,
      'MATLAB': 'matlab',
      'MODELS_DIR': '/root/deep-learning/FRCNN/py-faster-rcnn/models/pascal_voc',
      'PIXEL_MEANS': array([[[ 102.9801, 115.9465, 122.7717]]]),
      'RNG_SEED': 3,
      'ROOT_DIR': '/root/deep-learning/FRCNN/py-faster-rcnn',
      'TEST': {'BBOX_REG': True,
      'HAS_RPN': False,
      'MAX_SIZE': 1000,
      'NMS': 0.3,
      'PROPOSAL_METHOD': 'selective_search',
      'RPN_MIN_SIZE': 16,
      'RPN_NMS_THRESH': 0.7,
      'RPN_POST_NMS_TOP_N': 300,
      'RPN_PRE_NMS_TOP_N': 6000,
      'SCALES': [600],
      'SVM': False},
      'TRAIN': {'ASPECT_GROUPING': True,
      'BATCH_SIZE': 128,
      'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
      'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
      'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
      'BBOX_NORMALIZE_TARGETS': True,
      'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': False,
      'BBOX_REG': False,
      'BBOX_THRESH': 0.5,
      'BG_THRESH_HI': 0.5,
      'BG_THRESH_LO': 0.1,
      'FG_FRACTION': 0.25,
      'FG_THRESH': 0.5,
      'HAS_RPN': True,
      'IMS_PER_BATCH': 1,
      'MAX_SIZE': 1000,
      'PROPOSAL_METHOD': 'gt',
      'RPN_BATCHSIZE': 256,
      'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
      'RPN_CLOBBER_POSITIVES': False,
      'RPN_FG_FRACTION': 0.5,
      'RPN_MIN_SIZE': 16,
      'RPN_NEGATIVE_OVERLAP': 0.3,
      'RPN_NMS_THRESH': 0.7,
      'RPN_POSITIVE_OVERLAP': 0.7,
      'RPN_POSITIVE_WEIGHT': -1.0,
      'RPN_POST_NMS_TOP_N': 2000,
      'RPN_PRE_NMS_TOP_N': 12000,
      'SCALES': [600],
      'SNAPSHOT_INFIX': 'stage1',
      'SNAPSHOT_ITERS': 10000,
      'USE_FLIPPED': True,
      'USE_PREFETCH': False},
      'USE_GPU_NMS': False}
      WARNING: Logging before InitGoogleLogging() is written to STDERR
      F0608 16:24:47.054726 7419 common.cpp:66] Cannot use GPU in CPU-only Caffe: check mode.
      Check failure stack trace:

    3. Maybe there are some places where Caffe is set using GPU in the train codes.

    4. This is that one. I think you should remove the GPU_ID if you want to use CPU only.

      Called with args:
      Namespace(cfg_file=None, gpu_id=0, imdb_name='voc_2007_trainval', net_name='fish_classify', pretrained_model='data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel', set_cfgs=None)
      Stage 1 RPN, init from ImageNet model
      Init model: data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel
      Using config:
      {'DATA_DIR': '/root/deep-learning/FRCNN/py-faster-rcnn/data',
      'DEDUP_BOXES': 0.0625,
      'EPS': 1e-14,
      'EXP_DIR': 'default',
      'GPU_ID': 0,

    5. Thank you for your reply.
      Can i simply remove that? In train_faster_rcnn_alt_opt.py it is listed in the param that needs to be passed. Inside the code i have seen other places where the GPU ID is set. So if i simple changed the param whether it will affect any other function calls? I have seen posts on compiling and running in cpu mode, but didnt see any successful post on training FRCNN in cpu mode. Can you provide any info?

    6. Actually, I haven't tried to train Faster RCNN using CPU

    7. Ok...thats sad news for me...thanks for your reply

    8. Hi,
      As i was stuck with the above problem, i started to train faster RCNN as per https://github.com/deboc/py-faster-rcnn/tree/master/help

      When i try to train using the command
      ./tools/train_faster_rcnn_alt_opt.py --gpu 0 --net_name fishclassify --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel --imdb fishclassify --cfg /home/fast-rcnn/py-faster-rcnn/config.yml

      I am getting the following error
      File "/fast-rcnn/py-faster-rcnn/tools/../lib/datasets/factory.py", line 46, in get_imdb
      raise KeyError('Unknown dataset: {}'.format(name))

      Any help on what might be causing this?
      I had created fishclassify.py and fishclassify_eval.py under lib/datasets as well as modified factory.py

      When i open the .py files in pycharm it shows unresolved import for "import datasets". Any idea where i had gone wrong?

  16. Apparently it is not possible to train with CPU, without changing faster r-cnn implementation.
    One of the layers is only implemented for use with GPU.

    @sgsai what are the overlaps in:
    overlaps[ix, cls] = 1.0

    located @ _load_sgs_annotation from your sgsai.py reference example.

    I get the following error:
    'IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

    Replies
    1. sorry I wasn't aware of this comment. Try changing "overlaps[ix, cls] = 1.0" to overlaps[ix, int(cls)] = 1.0

  17. Hi - Thanks for this post. I was trying to set up a work flow to train my own images. I have followed your steps and stuck at a place. Could you please suggest a solution for the below error ?

    Loaded dataset `train` for training
    Set proposal method: gt
    Appending horizontally-flipped training examples...
    Traceback (most recent call last):
    File "./tools/train_net.py", line 104, in
    imdb, roidb = combined_roidb(args.imdb_name)
    File "./tools/train_net.py", line 69, in combined_roidb
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
    File "./tools/train_net.py", line 66, in get_roidb
    roidb = get_training_roidb(imdb)
    File "/home/ubuntu/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 120, in get_training_roidb
    imdb.append_flipped_images()
    File "/home/ubuntu/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 106, in append_flipped_images
    boxes = self.roidb[i]['boxes'].copy()
    File "/home/ubuntu/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 67, in roidb
    self._roidb = self.roidb_handler()
    File "/home/ubuntu/py-faster-rcnn/tools/../lib/datasets/sgs.py", line 83, in gt_roidb
    for index in self.image_index]
    File "/home/ubuntu/py-faster-rcnn/tools/../lib/datasets/sgs.py", line 198, in _load_sgs_annotation
    overlaps[ix, cls] = 1.0
    IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

    Replies
    1. I don't have this error. However, a quick fix would be changing "overlaps[ix, cls] = 1.0" to overlaps[ix, int(cls)] = 1.0

    2. This comment has been removed by the author.

    3. Thanks a lot !! I figured that out later. I got into another issue though. I have set up the same directory structure as suggested by you with my own sample data set. For some reasons voc dataset is picked for training

      Loaded dataset `voc_2007_trainval` for training

      and then I get the error:

      Loaded dataset `voc_2007_trainval` for training
      Set proposal method: selective_search
      Appending horizontally-flipped training examples...
      voc_2007_trainval gt roidb loaded from /home/ubuntu/py-faster-rcnn/data/cache/voc_2007_trainval_gt_roidb.pkl
      Traceback (most recent call last):
      File "./tools/train_net.py", line 104, in
      imdb, roidb = combined_roidb(args.imdb_name)
      File "./tools/train_net.py", line 69, in combined_roidb
      roidbs = [get_roidb(s) for s in imdb_names.split('+')]
      File "./tools/train_net.py", line 66, in get_roidb
      roidb = get_training_roidb(imdb)
      File "/home/ubuntu/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 120, in get_training_roidb
      imdb.append_flipped_images()
      File "/home/ubuntu/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 106, in append_flipped_images
      boxes = self.roidb[i]['boxes'].copy()
      File "/home/ubuntu/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 67, in roidb
      self._roidb = self.roidb_handler()
      File "/home/ubuntu/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 132, in selective_search_roidb
      ss_roidb = self._load_selective_search_roidb(gt_roidb)
      File "/home/ubuntu/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 166, in _load_selective_search_roidb
      'Selective search data not found at: {}'.format(filename)
      AssertionError: Selective search data not found at: /home/ubuntu/py-faster-rcnn/data/selective_search_data/voc_2007_trainval.mat

      How do I make sure it picks my dataset ?..

      I would really appreciate your input

    4. You should modify the factory.py in the same directory. For example, to add SGS dataset, we should do

      sgs_devkit_path = '/home/abc/Datasets/SGS'
      for split in ['train', 'test']:
      name = '{}_{}'.format('sgs', split)
      __sets[name] = (lambda split=split: datasets.sgs(split, sgs_devkit_path))
