Screw Detection
Data Preparation
Statistics of Pascal VOC2012 Dataset
- number of classes = 20
- number of images = 11,540
- number of objects = 27,450
- average number of images / class = 577
Raw Dataset
- number of classes: 2 -- "goodscrew", "badscrew"
- number of images: 94 + 335 = 429
- number of objects: TO BE CALCULATED
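The object count that is still to be calculated could be obtained by counting the object entries in the annotation files. The following is a minimal sketch, assuming the annotations follow the PASCAL VOC XML layout (one object element per box, with a name child); the helper name count_objects and the two sample annotations are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_objects(xml_texts):
    """Count <object> entries per class name across VOC-style annotations."""
    counts = Counter()
    for text in xml_texts:
        root = ET.fromstring(text)
        for obj in root.iter("object"):
            counts[obj.findtext("name").strip()] += 1
    return counts


# Two tiny made-up annotations standing in for real annotation files.
SAMPLE_ANNOTATIONS = [
    "<annotation><object><name>goodscrew</name></object>"
    "<object><name>badscrew</name></object></annotation>",
    "<annotation><object><name>goodscrew</name></object></annotation>",
]

counts = count_objects(SAMPLE_ANNOTATIONS)
total_objects = sum(counts.values())
print(dict(counts), total_objects)
```

To run this on the real dataset, read the text of each XML file in the annotation directory and pass the resulting strings to count_objects.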
Data Augmentation
random translation => add noise => rotate;
- number of images: 429 x 2^3 = 3,432
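One way to read the 2^3 factor is that each of the three steps is either applied or skipped, which gives eight variants per raw image (including the untouched original). The sketch below only enumerates those combinations and checks the arithmetic; the on/off interpretation and the step names are assumptions, not something the notes above state explicitly.

```python
from itertools import product

STEPS = ("translate", "add_noise", "rotate")  # order taken from the text above
RAW_IMAGES = 94 + 335

# Every on/off combination of the three steps. The empty combination
# corresponds to keeping the raw image unchanged.
variants = [
    tuple(step for step, enabled in zip(STEPS, flags) if enabled)
    for flags in product((False, True), repeat=len(STEPS))
]

total_images = RAW_IMAGES * len(variants)
print(len(variants), total_images)
```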
Dataset Preparation for Training
Reference on training my own dataset.
- Write a basketball.py within $ROOT_DIR/lib/datasets, like pascal_voc.py;
- Write basketball_eval.py within $ROOT_DIR/lib/datasets, just like pascal_eval.py.
Training Faster-RCNN (PyTorch)
The project is from faster-rcnn.pytorch.
[ ISSUE 1 ] => bbox coordinates overflow
When annotation files are loaded and box coordinates are assigned to the ground-truth boxes (gt_boxes), any coordinate whose value is 0 wraps around to 65,535 after the minus-1 operation. The reason is that the coordinates are of type numpy.uint16.
Note: 2^16 = 65,536, so the largest value a uint16 can hold is 65,535.
[ Approach to Solve ISSUE 1 ]
In ROOT_DIR/lib/datasets/basketball.py (taking the basketball dataset as an example), find the basketball._load_annotation method and clamp the coordinate values x1, y1, x2, y2 so that none of them falls below zero. Then delete the cache files under ROOT_DIR/data/cache/.
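The wraparound and the clamp can be illustrated without the training code. This is a minimal sketch, not the actual _load_annotation method: load_box and as_uint16 are hypothetical helpers, and ctypes.c_uint16 is used here only to mimic how a 16-bit unsigned slot stores a negative number.

```python
import ctypes
import xml.etree.ElementTree as ET


def as_uint16(value):
    """Mimic storing an integer in a uint16 slot (wraps modulo 2**16)."""
    return ctypes.c_uint16(value).value


def load_box(bndbox, clamp=True):
    """Read xmin, ymin, xmax, ymax, shift to 0-based, optionally clamp at 0."""
    coords = []
    for tag in ("xmin", "ymin", "xmax", "ymax"):
        value = int(float(bndbox.findtext(tag))) - 1
        if clamp:
            value = max(value, 0)  # the fix: never go below zero
        coords.append(as_uint16(value))
    return coords


# Made-up box whose xmin is already 0 in the annotation file.
bndbox = ET.fromstring(
    "<bndbox><xmin>0</xmin><ymin>5</ymin><xmax>40</xmax><ymax>60</ymax></bndbox>"
)
unclamped = load_box(bndbox, clamp=False)
clamped = load_box(bndbox, clamp=True)
print(unclamped, clamped)
```

Clamping before the value reaches the unsigned array is the important part; clamping afterwards is too late because the wrapped value is already a large positive number.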
Training Faster-RCNN (Caffe)
Tutorial from this blog.
Tutorial
Step 1
Use your own Annotations, ImageSets and JPEGImages to replace those in ROOT_DIR/data/VOCdevkit2007/VOC2007.
Step 2
Go to each of the following files and modify num_classes and num_output:
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_train.pt
- ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt/faster_rcnn_test.pt
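The values to write into those files follow from the number of object classes. The sketch below assumes the usual py-faster-rcnn convention: num_classes counts the background as an extra class, the classification layer (cls_score in the stock files, as far as I recall) outputs one score per class, and the box regression layer (bbox_pred) outputs four values per class. Check the layer names against your own .pt files before editing.

```python
OBJECT_CLASSES = ("goodscrew", "badscrew")


def prototxt_values(object_classes):
    """Derive the class-dependent numbers from the list of object classes."""
    num_classes = len(object_classes) + 1  # +1 for the background class
    return {
        "num_classes": num_classes,
        "cls_score_num_output": num_classes,      # one score per class
        "bbox_pred_num_output": 4 * num_classes,  # four box deltas per class
    }


values = prototxt_values(OBJECT_CLASSES)
print(values)

# Sanity check against PASCAL VOC, which has 20 object classes.
voc_values = prototxt_values(tuple(f"class_{i}" for i in range(20)))
print(voc_values)
```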
Step 3
Go to ROOT_DIR/lib/datasets/pascal_voc.py and adjust self._classes and self._data_path to match your dataset.
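For the screw dataset, the class tuple would presumably list the background first and then the two object classes. This is a sketch of the idea rather than the real file: CLASSES, class_to_ind and label_for are illustrative names, and the lowercasing in label_for mirrors what the stock annotation loader does as far as I recall, so treat it as an assumption.

```python
# Background first, then the two screw classes from the raw dataset.
CLASSES = ("__background__", "goodscrew", "badscrew")

# Map each class name to its integer label.
class_to_ind = dict(zip(CLASSES, range(len(CLASSES))))


def label_for(name):
    """Look up the label for a class name read from an annotation file."""
    # Assumption: names are normalised to lowercase before the lookup,
    # so the entries in CLASSES should be lowercase as well.
    return class_to_ind[name.lower().strip()]


print(class_to_ind)
print(label_for("goodscrew"), label_for(" BadScrew "))
```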
Step 4
Go to ROOT_DIR/lib/datasets/imdb.py, then pay attention to the append_flipped_images method and the associated AssertionError.
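To see why that assertion is sensitive to bad coordinates, it helps to replay the horizontal flip by hand. The sketch below is a pure-Python imitation, assuming the flip computes new_x1 = width - x2 - 1 and new_x2 = width - x1 - 1 and then checks new_x2 >= new_x1, with the values stored as 16-bit unsigned integers. The function names are hypothetical, and the failing case shown (a box extending past the image width) is only one way the check can trip.

```python
def as_uint16(value):
    """Mimic unsigned 16-bit storage (wraps modulo 2**16)."""
    return value % 2**16


def flip_box_x(x1, x2, width):
    """Mirror a box horizontally inside an image of the given width."""
    new_x1 = as_uint16(width - x2 - 1)
    new_x2 = as_uint16(width - x1 - 1)
    return new_x1, new_x2


def flip_is_valid(x1, x2, width):
    """The condition the assertion checks after flipping."""
    new_x1, new_x2 = flip_box_x(x1, x2, width)
    return new_x2 >= new_x1


WIDTH = 500
inside = flip_box_x(0, 39, WIDTH)           # box fully inside the image
inside_ok = flip_is_valid(0, 39, WIDTH)
outside_ok = flip_is_valid(450, 600, WIDTH)  # x2 lies beyond the image width
print(inside, inside_ok, outside_ok)
```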
Step 5
- Move trained models in the output directory to somewhere else.
- Delete cache files in ROOT_DIR/data/cache/ and ROOT_DIR/data/VOCdevkit2007/annotations_cache.
Step 6
Set the learning rate and other parameters in the solver files within ROOT_DIR/models/pascal_voc/ZF/faster_rcnn_alt_opt.
Start training:
$ ./experiment/scripts/faster_rcnn_alt_opt.sh ZF pascal_voc
Things Worth Mentioning
BBOX Coordinates Index
A custom dataset may use a different coordinate convention from the original PASCAL VOC dataset. For example, ImageNet starts with index 0 in row and column, while PASCAL VOC starts with index 1.
E.g., in the basketball dataset, quite a few xmin and ymin values equal zero.
Solution Reference from issue.
Go to the "Annotations" directory, then run:
$ grep -rl '<ymin>0</ymin>' ./ | xargs sed -i 's#<ymin>0</ymin>#<ymin>1</ymin>#g'
$ grep -rl '<xmin>0</xmin>' ./ | xargs sed -i 's#<xmin>0</xmin>#<xmin>1</xmin>#g'
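The same substitution can be expressed in Python, which may be handy where grep and sed are not available. This is a minimal sketch operating on a string; bump_zero_mins is a hypothetical helper, and applying it to real files (read, substitute, write back) is left out here.

```python
import re

# Matches exactly <xmin>0</xmin> or <ymin>0</ymin>, like the two sed patterns.
ZERO_MIN = re.compile(r"<(xmin|ymin)>0</\1>")


def bump_zero_mins(xml_text):
    """Replace zero-valued xmin / ymin entries with 1."""
    return ZERO_MIN.sub(r"<\1>1</\1>", xml_text)


# Made-up annotation fragments for illustration.
sample = "<bndbox><xmin>0</xmin><ymin>0</ymin><xmax>40</xmax><ymax>10</ymax></bndbox>"
untouched = "<bndbox><xmin>10</xmin><ymin>20</ymin><xmax>40</xmax><ymax>60</ymax></bndbox>"

fixed = bump_zero_mins(sample)
print(fixed)
print(bump_zero_mins(untouched) == untouched)
```

Like the sed commands, this only rewrites values that are exactly 0, so entries such as 10 or 20 are left alone.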
Records of the First Screw Detection Network
Location of model proto files
ROOT_DIR/models/pascal_voc/
Location of trained models
ROOT_DIR/output-det-1/faster_rcnn_alt_opt/voc_2007_trainval/ZF_faster_rcnn_final.caffemodel
Run demo
The demo program (including object detection and classification) is in ROOT_DIR/tools/screw_demo.py.
Run it using:
$ ./tools/screw_demo.py --net zf