Screw Detection

Data Preparation

Statistics of Pascal VOC2012 Dataset

  • number of classes = 20
  • number of images = 11,540
  • number of objects = 27,450
  • average number of images / class = 577

Raw Dataset

  • number of classes: 2 -- "goodscrew", "badscrew"
  • number of images: 94 + 335 = 429
  • number of objects: TO BE CALCULATED

Data Augmentation

random translation => add noise => rotate;

  • number of images: 429 x 2^3 = 3,432
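
The factor 2^3 suggests that each of the three steps is either applied or skipped, which gives 8 variants per image (including the untouched original). The sketch below illustrates that reading with NumPy only. The helper names, the shift range, the noise level and the restriction to 90-degree rotations are all assumptions for illustration, not the settings actually used here. It handles pixels only; bounding boxes would have to be transformed along with any translated or rotated image.

```python
import itertools

import numpy as np


def translate(img, rng, max_shift=10):
    """Shift the image by a random offset; exposed pixels are filled with zeros."""
    h, w = img.shape[:2]
    dx, dy = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    out = np.zeros_like(img)
    src = img[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    top, left = max(0, dy), max(0, dx)
    out[top:top + src.shape[0], left:left + src.shape[1]] = src
    return out


def add_noise(img, rng, sigma=8.0):
    """Add Gaussian noise and clip back to the valid uint8 range."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def rotate(img, rng):
    """Rotate by a random multiple of 90 degrees (a simplification)."""
    return np.rot90(img, k=int(rng.integers(1, 4)))


def augment_variants(img, seed=0):
    """Return one image per on/off combination of the three steps, in order."""
    rng = np.random.default_rng(seed)
    steps = [translate, add_noise, rotate]
    variants = []
    for mask in itertools.product([False, True], repeat=len(steps)):
        out = img
        for use, step in zip(mask, steps):
            if use:
                out = step(out, rng)
        variants.append(out)
    return variants


image = np.full((32, 48, 3), 128, dtype=np.uint8)  # stand-in for a real photo
variants = augment_variants(image)
print(len(variants), 429 * len(variants))
```

The first variant is the unmodified input, so 429 source images yield 429 x 8 = 3,432 images in total.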

Dataset Preparation for Training

Reference on training my own dataset.

  • Write a basketball.py within $ROOT_DIR/lib/datasets, like pascal_voc.py;
  • Write a basketball_eval.py within $ROOT_DIR/lib/datasets, just like pascal_eval.py.

Training Faster-RCNN (PyTorch)

The project is from faster-rcnn.pytorch.

[ ISSUE 1 ] => bbox coordinates overflow

When annotation files are loaded and box coordinates are assigned to the ground-truth boxes (gt_boxes), any coordinate equal to 0 wraps around to 65,535 after the minus-1 operation. The reason is that the coordinates are stored as numpy.uint16, which cannot represent -1.

Note: 2^16 = 65,536, so the largest uint16 value is 65,535.

[ Approach to Solve ISSUE 1 ]

In ROOT_DIR/lib/datasets/basketball.py (taking the basketball dataset as an example), find the basketball._load_annotation method and clamp the coordinate values x1, y1, x2, y2 so that they never go below zero. Then delete the cache files under ROOT_DIR/data/cache/, so that the stale ground-truth boxes are not reused.
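
A minimal sketch of the wrap-around and of one possible clamp, using NumPy only. The helper name to_zero_based is hypothetical and not part of faster-rcnn.pytorch; in the real loader the same idea would be applied where x1, y1, x2, y2 are computed.

```python
import numpy as np


def to_zero_based(coords):
    """Convert 1-based pixel coordinates to 0-based without going below 0.

    The subtraction is done in a signed type, so a 0 becomes -1 (not 65535)
    and can be clamped before casting back to uint16.
    """
    signed = np.asarray(coords, dtype=np.int32) - 1
    return np.clip(signed, 0, None).astype(np.uint16)


raw = np.array([0, 0, 50, 60], dtype=np.uint16)  # xmin, ymin, xmax, ymax

wrapped = raw - 1          # unsigned arithmetic: the two zeros wrap around
print(wrapped)             # first two entries are 65535

print(to_zero_based(raw))  # first two entries stay at 0
```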

Training Faster-RCNN (Caffe)

Tutorial from this blog.

Tutorial

Step 1

Use your own Annotations, ImageSets and JPEGImages to replace those in ROOT_DIR/data/VOCdevkit2007/VOC2007.

Step 2

In each of the following files, modify num_classes and num_output:

  • ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt
  • ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt
  • ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt
  • ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_train.pt
  • ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/faster_rcnn_test.pt
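
As a worked example for the two screw classes, the values can be derived as below. This assumes the stock py-faster-rcnn convention: background counts as one extra class, the cls_score layer outputs one score per class, and the bbox_pred layer outputs four box offsets per class. For the original 20-class PASCAL VOC setup the same rule gives 21, 21 and 84.

```python
# Class list including the background entry (assumed convention).
classes = ("__background__", "goodscrew", "badscrew")

num_classes = len(classes)              # value for num_classes
cls_score_num_output = num_classes      # num_output of the cls_score layer
bbox_pred_num_output = 4 * num_classes  # num_output of the bbox_pred layer

print(num_classes, cls_score_num_output, bbox_pred_num_output)  # 3 3 12
```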

Step 3

Go to ROOT_DIR/lib/datasets/pascal_voc.py and update self._classes and self._data_path to match your dataset.

Step 4

Go to ROOT_DIR/lib/datasets/imdb.py and pay attention to the append_flipped_images method and the AssertionError it can raise.

Step 5

  • Move trained models in the output directory to somewhere else.
  • Delete cache files in ROOT_DIR/data/cache/ and ROOT_DIR/data/VOCdevkit2007/annotations_cache.

Step 6

Set the learning rate and other parameters in the solver files within ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt.

Start training (in the stock script the first argument is the GPU id, here 0):

$ ./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF pascal_voc

Things Worth Mentioning

BBOX Coordinates Index

The coordinate convention of your dataset may conflict with that of the original PASCAL VOC dataset. For example, ImageNet indexes rows and columns from 0, while PASCAL VOC starts from 1.

e.g. In the basketball dataset, quite a few xmin and ymin values equal zero.

Solution (referenced from an issue):

Go to the "Annotations" directory, then run:

$ grep -rl '<ymin>0</ymin>' ./ | xargs sed -i 's#<ymin>0</ymin>#<ymin>1</ymin>#g'
$ grep -rl '<xmin>0</xmin>' ./ | xargs sed -i 's#<xmin>0</xmin>#<xmin>1</xmin>#g'
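
Before rewriting the files in place, it can help to see how many annotations are affected. The sketch below is a read-only check using the Python standard library; the function name find_zero_coords is hypothetical, and it assumes the annotations are VOC-style XML files sitting directly in the given directory. Unlike the sed pattern, it also tolerates whitespace around the 0.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_zero_coords(annotation_dir):
    """Return {xml file name: number of xmin/ymin fields equal to 0}."""
    hits = {}
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        count = sum(
            1
            for tag in ("xmin", "ymin")
            for node in root.iter(tag)
            if node.text is not None and node.text.strip() == "0"
        )
        if count:
            hits[xml_path.name] = count
    return hits


if __name__ == "__main__":
    # Run from the directory that contains the annotation files.
    for name, count in find_zero_coords(".").items():
        print(f"{name}: {count} zero-valued xmin/ymin field(s)")
```

Running it again after the sed commands should print nothing if every zero was replaced.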

Records of the First Screw Detection Network

Location of model proto files: ROOT_DIR/models/pascal_voc/

Location of trained models: ROOT_DIR/output-det-1/faster_rcnn_alt_opt/voc_2007_trainval/ZF_faster_rcnn_final.caffemodel

Run demo

The demo program (including object detection and classification) is in ROOT_DIR/tools/screw_demo.py.

Run it using:

$ ./tools/screw_demo.py --net zf
