Face Identification / Verification / Embeddings ...

Probabilistic Face Embeddings

by Yichun Shi and Anil K. Jain

Idea

Represent face features as distributional estimates instead of point estimates, i.e., Probabilistic Face Embeddings (PFE).

  1. Uncertainty Learning

  2. Probabilistic Face Representation

  3. Quality-Aware Pooling (see the sketch after this list)

  4. Various Evaluation Protocols of Benchmarks, for example,

Protocols of IJB-S:

  • surveillance-to-single;
  • surveillance-to-booking;
  • surveillance-to-surveillance;
  • surveillance-to-still;
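
A minimal numpy sketch of how the probabilistic representation and quality-aware pooling could fit together, as I read the paper: each face image is represented as a Gaussian N(mu, sigma^2) per feature dimension, and several observations of the same subject are fused with a precision-weighted average, so sharp frames dominate blurry ones. The function and variable names below are mine, not the paper's reference implementation.

```python
import numpy as np

def fuse_gaussian_embeddings(mus, sigmas_sq):
    """Quality-aware pooling sketch: fuse per-image Gaussian embeddings
    N(mu_i, sigma_i^2) into a single Gaussian by precision weighting.

    mus:       (n_images, d) mean vectors
    sigmas_sq: (n_images, d) per-dimension variances (uncertainty)
    """
    precision = 1.0 / sigmas_sq                 # low variance -> high weight
    fused_var = 1.0 / precision.sum(axis=0)     # combined uncertainty shrinks
    fused_mu = fused_var * (precision * mus).sum(axis=0)
    return fused_mu, fused_var

# toy check: a sharp frame should dominate a blurry frame of the same subject
rng = np.random.default_rng(0)
mu_sharp, var_sharp = rng.normal(size=8), np.full(8, 0.1)
mu_blur, var_blur = mu_sharp + rng.normal(scale=0.5, size=8), np.full(8, 2.0)
mu, var = fuse_gaussian_embeddings(np.stack([mu_sharp, mu_blur]),
                                   np.stack([var_sharp, var_blur]))
print(mu.round(2), var.mean())   # fused mean stays close to the sharp frame
```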

Papers to Read

  • Face Recognition with Image Sets using Manifold Density Divergence

    low-dimensional manifolds, statistical formulation

  • Face Recognition from Long-Term Observations

    Use a set of images to represent an identity

  • Log-Euclidean Metric Learning on Symmetric Positive Definite Manifold with Application to Image Set Classification

  • Video Face Recognition: Component-wise Feature Aggregation Network (C-FAN)

  • Neural Aggregation Network for Video Face Recognition

Thoughts and TODOs

  1. TODO: Calculate the cosine similarity between a high-quality face image and its degraded versions (Gaussian blur, motion blur, and other commonly used data augmentations), and the similarities between degraded images of different identities (see the sketch after this list). Two failure modes to watch for:

    • false accept of imposter low-quality pairs;
    • false reject of genuine cross-quality pairs;
  2. Thought: to address the second issue (false rejects of genuine cross-quality pairs), the network needs to see the same samples together with their degraded versions (via data augmentation) during training.
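
A rough sketch of this experiment, assuming a generic PyTorch face-embedding network (a placeholder called `embed_net` here) and torchvision's GaussianBlur as one of the degradations; kernel size and sigma are arbitrary.

```python
import torch
import torch.nn.functional as F
from torchvision.transforms import GaussianBlur

@torch.no_grad()
def quality_degradation_study(embed_net, img_a, img_b):
    """img_a, img_b: (3, H, W) aligned face crops of two different identities.
    Measures a genuine cross-quality pair and an imposter low-quality pair."""
    blur = GaussianBlur(kernel_size=21, sigma=5.0)       # arbitrary degradation
    batch = torch.stack([img_a, blur(img_a), img_b, blur(img_b)])
    a_hq, a_lq, b_hq, b_lq = F.normalize(embed_net(batch), dim=-1)
    return {
        # low values here mean false rejects of genuine cross-quality pairs
        "genuine_cross_quality": F.cosine_similarity(a_hq, a_lq, dim=0).item(),
        # high values here mean false accepts of imposter low-quality pairs
        "imposter_low_quality": F.cosine_similarity(a_lq, b_lq, dim=0).item(),
    }
```

Running this over a validation set and comparing the two numbers would expose exactly the two failure modes listed above.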

AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations

by Xiao Zhang, et al.

Idea

Margin-s (scale) and margin-m (margin) are the two most influential hyper-parameters in margin-based softmax methods; setting them wisely is vital for effective training. The idea is to make the softmax probability span the full range of (0, 1) while still generating substantial gradients whenever the cosine similarity is not yet close enough.
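
As a concrete illustration of that idea, here is a numpy sketch of the dynamically adaptive scale roughly as I understand it from the paper: the scale is re-estimated every iteration from mini-batch statistics so that the probability of the median-angle sample sits on the steep part of the softmax curve. Treat the exact formula below as my recollection to be verified against the paper; the variable names are mine.

```python
import numpy as np

def adacos_dynamic_scale(cos_logits, labels, s_prev):
    """Re-estimate the scale from the current mini-batch.

    cos_logits: (N, C) cosine similarities between features and class weights
    labels:     (N,)   ground-truth class indices
    s_prev:     scale used in the previous iteration
    """
    n, c = cos_logits.shape
    onehot = np.eye(c, dtype=bool)[labels]
    # average "background" term over the non-ground-truth classes
    b_avg = np.exp(s_prev * cos_logits[~onehot]).reshape(n, c - 1).sum(axis=1).mean()
    # median angle to the ground-truth class in this mini-batch
    theta_med = np.median(np.arccos(np.clip(cos_logits[onehot], -1.0, 1.0)))
    return np.log(b_avg) / np.cos(min(np.pi / 4, theta_med))
```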

Effects of Margin-s and Margin-m

  • Margin-s

The scale needs to be sufficiently large so that the softmax probability can reach values close to 1. An overly large margin-s, however, degrades the network's performance, since the loss then fails to generate noticeable gradients even when the cosine similarity is not yet close enough.

  • Margin-m

The value of margin-m determines where, i.e., at how small an angle, the softmax probability stops being zero (see the sketch after this list).
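
Both effects can be made concrete with a small numpy sketch that evaluates the target-class probability of an ArcFace-style margin softmax, assuming (purely for illustration) 10,000 classes and that all non-target cosines are about 0; the (s, m) settings are arbitrary.

```python
import numpy as np

def margin_softmax_prob(theta, s, m, num_classes=10_000):
    """Target-class probability of an ArcFace-style margin softmax,
    assuming every non-target cosine is ~0 (background term = C - 1)."""
    target = np.exp(s * np.cos(theta + m))
    background = (num_classes - 1) * np.exp(s * 0.0)
    return target / (target + background)

thetas = np.linspace(0, np.pi / 2, 7)
for s, m in [(10, 0.0), (64, 0.0), (64, 0.5)]:   # arbitrary settings
    probs = margin_softmax_prob(thetas, s, m)
    print(f"s={s:>2}, m={m}: " + " ".join(f"{p:.3f}" for p in probs))
```

Under this toy setting, a small s caps the probability well below 1 even at theta = 0, a very large s saturates the probability long before theta is small (so gradients vanish early), and a positive m shifts the whole curve so the probability only rises once theta + m is small.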

New Reflections and Takeaways

  1. Vanilla softmax with cross-entropy loss does not optimize cosine similarity explicitly; margin-based softmax, however, optimizes the cosine distance directly (see the snippet after this list).

  2. Ideally, the probability P_ij should gradually increase from 0 to 1 as Theta_ij decreases from pi/2 to 0. Would a more linear relationship help? And what is so special about softmax compared with alternatives such as tanh or sigmoid?

  3. Margin-based softmax (e.g., ArcFace) has the explicit optimization goal of reducing intra-class variance, while it only implicitly increases inter-class variance.
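
On the first point, the contrast is just in the logit form; a tiny PyTorch snippet with random placeholder tensors:

```python
import torch
import torch.nn.functional as F

feat = torch.randn(4, 512)     # batch of embeddings (placeholder)
W = torch.randn(512, 1000)     # weights for 1000 identities (placeholder)
b = torch.zeros(1000)

# vanilla softmax logits: depend on feature and weight norms, not just the angle
logits_vanilla = feat @ W + b

# normalized (margin-free) cosine logits: only the angle theta_ij matters
s = 64.0
logits_cosine = s * (F.normalize(feat, dim=1) @ F.normalize(W, dim=0))
```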

Learning Discriminative Features via Weights-biased Softmax loss

The loss is designed to increase inter-class variance.

DocFace+: ID Document to Selfie Matching

Dynamic Weight Imprinting

Problem: with the SGD optimizer, in the case of two-shot classification each classifier weight vector receives attraction signals only twice per epoch. These sparse attraction signals make little difference to the classifier weights, which therefore underfit.

Is Dynamic Weight Imprinting essentially replacing a class's weight vector with the feature(s) of that class in the current mini-batch?
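
My tentative reading of that question, as a PyTorch sketch: for every class that appears in the current mini-batch, the classifier weight is overwritten with the normalized mean feature of that class from the batch, instead of waiting for sparse SGD updates. The exact update rule (e.g., plain averaging vs. a moving average) should be checked against the DocFace+ paper; this is only a sketch.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def dynamic_weight_imprinting(weights, feats, labels):
    """weights: (C, d) classifier weight matrix, updated in place
    feats:   (B, d) embeddings of the current mini-batch
    labels:  (B,)   class indices present in the mini-batch"""
    feats = F.normalize(feats, dim=1)
    for c in labels.unique():
        # replace the class weight with the normalized batch feature mean
        weights[c] = F.normalize(feats[labels == c].mean(dim=0), dim=0)
```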

Data Sampling

Sample ID-selfie pairs during training.
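
A minimal sketch of what such a sampler could look like, assuming two hypothetical dictionaries `id_images` and `selfie_images` that map each identity to its image paths; each sampled identity contributes exactly one ID photo and one selfie to the batch.

```python
import random

def sample_id_selfie_batch(id_images, selfie_images, batch_identities=32):
    """id_images / selfie_images: dict identity -> list of image paths.
    Returns a list of (id_path, selfie_path, identity) triplets."""
    identities = random.sample(list(id_images.keys()), batch_identities)
    return [(random.choice(id_images[pid]),
             random.choice(selfie_images[pid]),
             pid)
            for pid in identities]
```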

Parameter Sharing for Domain-Specific Representation

How does parameter sharing work?

Other Topics of Interest

  • Heterogeneous Face Recognition
  • Low-Shot Learning
    • meta-learning, producing new classifiers
  • Study of Weight Imprinting
