Screw Detection
Data Preparation
Statistics of Pascal VOC2012 Dataset
- number of classes = 20
- number of images = 11,540
- number of objects = 27,450
- average number of images / class = 577
Raw Dataset
- number of classes: 2 -- "goodscrew", "badscrew"
- number of images: 94 + 335 = 429
- number of objects: TO BE CALCULATED
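The object count left open above could be obtained by counting the `<object>` elements across the annotation files. A minimal sketch, assuming the annotations are PASCAL VOC style XML files sitting in one directory; the function name `count_objects` and its directory argument are hypothetical:

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_objects(annotation_dir):
    """Count <object> entries per class name across all XML files in a directory."""
    counts = Counter()
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        for obj in root.iter("object"):
            # Assumes each <object> has a <name> child, as in PASCAL VOC.
            counts[obj.findtext("name", default="unknown").strip()] += 1
    return counts
```

The total is then `sum(count_objects("Annotations").values())`, and the per-class entries show how the objects split between "goodscrew" and "badscrew".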
Data Augmentation
random translation => add noise => rotate;
- number of images: 429 x 2^3 = 3,432
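One reading of the 2^3 factor is that each of the three steps (translation, noise, rotation) is independently either applied or skipped, giving eight variants per source image, the unmodified original included. The sketch below only enumerates those combinations to check the count; the on/off interpretation and the step names are assumptions, not taken from the actual augmentation script.

```python
from itertools import product

STEPS = ("translate", "add_noise", "rotate")  # hypothetical step names


def augmentation_plans(steps=STEPS):
    """Return every on/off combination of the steps, preserving pipeline order."""
    plans = []
    for flags in product((False, True), repeat=len(steps)):
        plans.append(tuple(s for s, on in zip(steps, flags) if on))
    return plans


plans = augmentation_plans()
num_raw_images = 94 + 335
print(len(plans), num_raw_images * len(plans))  # 8 3432
```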
Dataset Preparation for Training
Reference on training my own dataset.
- Write a basketball.py within $ROOT_DIR/lib/datasets, like pascal_voc.py;
- Write a basketball_eval.py within $ROOT_DIR/lib/datasets, just like pascal_eval.py;
Training Faster-RCNN (PyTorch)
The project is from faster-rcnn.pytorch.
[ ISSUE 1 ] => bbox coordinates overflow
When annotation files are loaded to assign box coordinates to the ground-truth boxes (gt_boxes), any coordinate equal to 0 wraps around to 65,535 after the minus 1 operation. The reason is that the coordinates are of type numpy.uint16, which cannot represent -1.
Note: 2^16 = 65,536, so the largest uint16 value is 65,535.
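The wraparound can be reproduced without the training code. This sketch uses `ctypes.c_uint16` from the Python standard library as a stand-in for a `numpy.uint16` element, an assumption made only to keep the example dependency-free; both are 16-bit unsigned integers.

```python
import ctypes


def minus_one_uint16(value):
    """Subtract 1 and store the result in an unsigned 16-bit integer."""
    # ctypes integer types do no overflow checking, so -1 wraps modulo 2**16.
    return ctypes.c_uint16(value - 1).value


print(minus_one_uint16(1))  # 0
print(minus_one_uint16(0))  # 65535, i.e. 2**16 - 1
```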
[ Approach to Solve ISSUE 1 ]
In ROOT_DIR/lib/datasets/basketball.py (taking the basketball dataset as an example), find the basketball._load_annotation method and clamp the coordinate values x1, y1, x2, y2 so that none of them falls below zero. Then delete the cache files under ROOT_DIR/data/cache/, otherwise the old cached boxes are reused.
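A minimal sketch of the clamping step, assuming PASCAL VOC style XML and the usual conversion from 1-based to 0-based pixel indices. The helper name `load_boxes` is hypothetical and the real `_load_annotation` method may be organized differently; the point is only that the subtraction happens on signed Python integers and the result is clamped before it is stored in an unsigned array.

```python
import xml.etree.ElementTree as ET


def load_boxes(xml_text):
    """Parse VOC-style XML and return 0-based (x1, y1, x2, y2) boxes, clamped at 0."""
    boxes = []
    for obj in ET.fromstring(xml_text).iter("object"):
        bbox = obj.find("bndbox")
        coords = []
        for tag in ("xmin", "ymin", "xmax", "ymax"):
            # int(float(...)) tolerates values written as "12.0".
            value = int(float(bbox.findtext(tag))) - 1
            coords.append(max(value, 0))  # never let a coordinate go below zero
        boxes.append(tuple(coords))
    return boxes
```

With this, an annotation whose xmin is 0 yields x1 = 0 instead of a wrapped value.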
Training Faster-RCNN (Caffe)
Tutorial from this blog.
Tutorial
Step 1
Use your own Annotations, ImageSets and JPEGImages to replace those in ROOT_DIR/data/VOCdevkit2007/VOC2007.
Step 2
Go to each of the following files and modify num_classes and num_output:
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_train.pt
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/faster_rcnn_test.pt
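For the two screw classes, the values to enter can be worked out as below. This assumes the usual py-faster-rcnn convention that the background counts as an extra class, that the classification layer (`cls_score` in the stock ZF prototxt files) outputs one score per class, and that the box regression layer (`bbox_pred`) outputs four values per class; check the layer names against your own prototxt files.

```python
FOREGROUND_CLASSES = ("goodscrew", "badscrew")

num_classes = len(FOREGROUND_CLASSES) + 1   # +1 for the background class
cls_score_num_output = num_classes          # one score per class
bbox_pred_num_output = 4 * num_classes      # four box offsets per class

print(num_classes, cls_score_num_output, bbox_pred_num_output)  # 3 3 12
```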
Step 3
Go to ROOT_DIR/lib/datasets/pascal_voc.py and update self._classes and self._data_path to match your dataset.
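A sketch of what the class list might look like for the screw dataset, assuming the stock `pascal_voc.py` layout where the background class comes first and a name-to-index dictionary is built from the tuple. The names mirror that file but are written here as plain module-level variables:

```python
# Background stays at index 0; the other names must match the
# <name> values used in the annotation XML files.
classes = ("__background__", "goodscrew", "badscrew")
class_to_ind = dict(zip(classes, range(len(classes))))

print(class_to_ind)
```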
Step 4
Go to ROOT_DIR/lib/datasets/imdb.py, then pay attention to the append_flipped_images method and the AssertionError it can raise.
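The assertion in `append_flipped_images` checks that every horizontally flipped box still has `x2 >= x1`. The sketch below mirrors that flip on a single box with plain Python integers (the function name `flip_box` is hypothetical) and shows one way the check can fail, namely an `x1` that wrapped around as described in ISSUE 1:

```python
def flip_box(box, width):
    """Mirror a 0-based (x1, y1, x2, y2) box horizontally in an image of the given width."""
    x1, y1, x2, y2 = box
    flipped = (width - x2 - 1, y1, width - x1 - 1, y2)
    assert flipped[2] >= flipped[0], "flipped box has x2 < x1"
    return flipped


print(flip_box((0, 10, 99, 50), width=500))   # (400, 10, 499, 50)

try:
    flip_box((65535, 10, 99, 50), width=500)  # x1 wrapped around from -1
except AssertionError as err:
    print("AssertionError:", err)
```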
Step 5
- Move trained models in the output directory to somewhere else.
- Delete cache files in ROOT_DIR/data/cache/ and ROOT_DIR/data/VOCdevkit2007/annotations_cache.
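The cache cleanup can also be scripted. A minimal sketch, assuming the two cache locations named above hold only files that are regenerated on the next run; the function name `clear_caches` and its `dry_run` flag are hypothetical:

```python
from pathlib import Path


def clear_caches(root_dir, dry_run=True):
    """Delete files in the roidb and annotation cache directories under root_dir."""
    cache_dirs = [
        Path(root_dir) / "data" / "cache",
        Path(root_dir) / "data" / "VOCdevkit2007" / "annotations_cache",
    ]
    removed = []
    for cache_dir in cache_dirs:
        if not cache_dir.is_dir():
            continue  # nothing cached yet
        for path in sorted(cache_dir.iterdir()):
            if path.is_file():
                if not dry_run:
                    path.unlink()
                removed.append(path)
    return removed
```

Calling it with `dry_run=True` first lists what would be deleted without touching anything.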
Step 6
Set the learning rate and other parameters in the solver files within ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt.
Start training:
$ ./experiment/scripts/faster_rcnn_alt_opt.sh ZF pascal_voc
Things Worth Mentioning
BBOX Coordinates Index
Your dataset may conflict with the conventions of the original PASCAL VOC dataset. For example, ImageNet starts row and column indices at 0, while PASCAL VOC starts them at 1.
e.g. In the basketball dataset, quite a few xmin and ymin values equal zero.
Solution Reference from issue.
Go to the "Annotations" directory, then run:
$ grep -rl '<ymin>0</ymin>' ./ | xargs sed -i 's#<ymin>0</ymin>#<ymin>1</ymin>#g'
$ grep -rl '<xmin>0</xmin>' ./ | xargs sed -i 's#<xmin>0</xmin>#<xmin>1</xmin>#g'
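The same substitution can be expressed in Python, which may help on systems where `sed -i` behaves differently (BSD `sed`, for example, expects a suffix argument after `-i`). A sketch that works on the text of a single annotation file; the function name `shift_zero_mins` is hypothetical:

```python
import re

# Matches <xmin>0</xmin> or <ymin>0</ymin>, but not values such as 10.
ZERO_MIN = re.compile(r"<(xmin|ymin)>0</\1>")


def shift_zero_mins(xml_text):
    """Replace <xmin>0</xmin> and <ymin>0</ymin> with a value of 1."""
    return ZERO_MIN.sub(r"<\1>1</\1>", xml_text)
```

Reading each XML file, passing its text through this function, and writing it back has the same effect as the two commands above.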
Records of the First Screw Detection Network
Location of model proto files
ROOT_DIR/models/pascal_voc/
Location of trained models
ROOT_DIR/output-det-1/faster_rcnn_alt_opt/voc_2007_trainval/ZF_faster_rcnn_final.caffemodel
Run demo
The demo program (including object detection and classification) is in ROOT_DIR/tools/screw_demo.py.
Run it using:
$ ./tools/screw_demo.py --net zf
