Deep Learning

Characteristics:

end-to-end training, i.e. a single objective function is optimized directly;
errors do not accumulate across stages, in contrast to traditional methods, where feature engineering and the classifier are optimized in separate phases.

Image Classification / Detection

Classification

Classify images according to their overall semantics.

Traditional Image Classification: Represent images using sparse coding, then train an SVM. The disadvantage is that the sparse coding and the SVM are optimized with two different objective functions, so the two stages cannot be trained jointly.
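The cost of optimizing two stages under separate objectives can be illustrated with a toy sketch. It does not use sparse coding or an SVM: as stated assumptions, a label-free "keep the highest-variance axis" rule stands in for the feature stage and logistic regression stands in for the classifier. The data set, the function names, and the hyperparameters are all hypothetical.

```python
import math

# Toy data (hypothetical): x1 has large variance but carries no label
# information; x2 has small variance and fully determines the label.
DATA = [((x1, x2), 1 if x2 > 0 else 0)
        for x1 in (-10.0, -5.0, 5.0, 10.0)
        for x2 in (-1.0, 1.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def highest_variance_axis():
    """Stage 1 stand-in: keep the input axis with the largest variance.

    This objective never looks at the labels.
    """
    n = len(DATA)
    variances = []
    for axis in (0, 1):
        values = [x[axis] for x, _ in DATA]
        mean = sum(values) / n
        variances.append(sum((v - mean) ** 2 for v in values) / n)
    return [1.0, 0.0] if variances[0] >= variances[1] else [0.0, 1.0]


def train(a, w, b, learn_projection, steps=500, lr=0.02):
    """Gradient descent on the mean logistic loss of sigmoid(w * (a . x) + b).

    With learn_projection=False the feature stage `a` stays frozen and only
    the classifier (w, b) is trained, as in a two-phase pipeline.
    """
    a = list(a)
    n = len(DATA)
    for _ in range(steps):
        grad_a, grad_w, grad_b = [0.0, 0.0], 0.0, 0.0
        for x, y in DATA:
            feature = a[0] * x[0] + a[1] * x[1]
            err = sigmoid(w * feature + b) - y  # d(loss) / d(logit)
            grad_w += err * feature / n
            grad_b += err / n
            grad_a[0] += err * w * x[0] / n
            grad_a[1] += err * w * x[1] / n
        w -= lr * grad_w
        b -= lr * grad_b
        if learn_projection:
            a = [a[0] - lr * grad_a[0], a[1] - lr * grad_a[1]]
    return a, w, b


def accuracy(a, w, b):
    hits = 0
    for x, y in DATA:
        predicted = 1 if w * (a[0] * x[0] + a[1] * x[1]) + b > 0 else 0
        hits += int(predicted == y)
    return hits / len(DATA)


# Two-phase: features chosen by variance, then a classifier on top.
two_phase = train(highest_variance_axis(), 1.0, 0.0, learn_projection=False)
two_phase_acc = accuracy(*two_phase)

# End-to-end: projection and classifier share one objective.
end_to_end = train([0.1, 0.1], 1.0, 0.0, learn_projection=True)
end_to_end_acc = accuracy(*end_to_end)
```

In this sketch the frozen feature stage keeps the x1 axis, which says nothing about the label, so no classifier trained on top of it can recover. The jointly trained model is free to move its projection toward x2.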

Detection

Localize objects and classify them. More attention is paid to local regions of the image.

Sub-region selection: The problem reduces to classification, i.e. deciding whether a region is of interest or not.

Scanning window: features are shared with nearby windows, so a larger area can be scanned quickly.
Region proposal: more flexible in handling objects with different aspect ratios and, as a result, achieves a larger Intersection over Union (IoU).
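To make the comparison concrete, here is a minimal sketch that computes Intersection over Union for axis-aligned boxes and compares a fixed-size scanning window with a box whose aspect ratio matches the object. The image size, the ground-truth box, and the proposal are hypothetical values chosen for illustration. Real detectors scan at several scales, which this sketch omits.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def scanning_windows(width: int, height: int, size: int, stride: int) -> Iterator[Box]:
    """Yield fixed-size square windows covering a width x height image."""
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            yield (x, y, x + size, y + size)


# Hypothetical ground-truth box for a wide object (100 x 40 pixels).
ground_truth: Box = (20, 30, 120, 70)

# Best overlap any 40 x 40 window can reach on a 200 x 100 image.
best_window_iou = max(
    iou(window, ground_truth)
    for window in scanning_windows(200, 100, size=40, stride=10)
)

# Hypothetical region proposal with an aspect ratio close to the object's.
proposal: Box = (25, 30, 120, 72)
proposal_iou = iou(proposal, ground_truth)
```

Even a perfectly placed square window covers only part of the wide object, so its IoU is capped by the ratio of the two areas. The proposal, which is free to take the object's shape, overlaps far more.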

Future Trends of Deep Learning

Deeper Network

More Complex Network Architecture

For example, GoogLeNet uses Inception units that combine filters of different sizes to handle information at different scales.
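One way to picture such a unit is as parallel convolution branches with different kernel sizes whose outputs are concatenated along the channel axis. The sketch below only tracks shapes. It assumes stride 1 and padding of (kernel - 1) // 2 in every branch, and the branch widths are illustrative. The 1 x 1 reduction layers and the pooling branch of the real architecture are left out.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def inception_output_shape(height, width, branches):
    """Shape (channels, height, width) after concatenating parallel branches.

    `branches` maps kernel size -> number of output channels. Each branch
    uses stride 1 and padding (kernel - 1) // 2, so odd kernel sizes keep
    the spatial size and the outputs can be stacked along channels.
    """
    shapes = set()
    for kernel in branches:
        pad = (kernel - 1) // 2
        shapes.add((conv_out_size(height, kernel, padding=pad),
                    conv_out_size(width, kernel, padding=pad)))
    if len(shapes) != 1:
        raise ValueError("branches must agree on spatial size")
    out_h, out_w = shapes.pop()
    return (sum(branches.values()), out_h, out_w)


# Illustrative widths for the 1x1, 3x3 and 5x5 branches on a 28 x 28 input.
shape = inception_output_shape(28, 28, {1: 64, 3: 128, 5: 32})
```

Because every branch preserves the 28 x 28 spatial size, the unit's output simply has as many channels as the branches have in total.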

NIN (Network-in-Network): Uses low-rank decomposition to compress convolutional layers with a large number of parameters, which can also improve generalization (reduce overfitting).
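The parameter savings from a low-rank factorization can be estimated with simple arithmetic. The sketch below assumes one possible scheme, which is not necessarily the one used in NIN: a k x k convolution is replaced by a k x k convolution with `rank` output channels followed by a 1 x 1 convolution. The layer sizes are hypothetical and biases are ignored.

```python
def dense_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases ignored)."""
    return c_in * c_out * k * k


def low_rank_params(c_in, c_out, k, rank):
    """Weights after factoring the layer into two smaller convolutions.

    Assumed scheme: a k x k convolution from c_in to `rank` channels,
    followed by a 1 x 1 convolution from `rank` to c_out channels.
    """
    return c_in * rank * k * k + rank * c_out


# Hypothetical layer: 256 -> 256 channels, 3 x 3 kernel, rank 64.
full = dense_params(256, 256, 3)
compressed = low_rank_params(256, 256, 3, rank=64)
compression_ratio = full / compressed
```

For these sizes the factored layer has 163,840 weights instead of 589,824, a 3.6x reduction. Fewer free parameters is one reason such compression can reduce overfitting.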
