@ArrowLLL 2018-07-27T12:23:31.000000Z Word count 4327 Views 1638

Study-Note: Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

Study-Note face-recognition


problem: Face detection and alignment

This paper proposes a new framework to integrate these two tasks using unified cascaded CNNs by multi-task learning. The proposed CNNs consist of three stages:

  1. A shallow CNN quickly produces candidate windows.
  2. A more complex CNN refines the windows and rejects a large number of non-face windows.
  3. A more powerful CNN refines the result and outputs the facial landmark positions.

Contribution
1. Propose a new cascaded-CNN-based framework for joint face detection and alignment, with a carefully designed lightweight CNN architecture for real-time performance.
2. Propose an effective method for online hard sample mining to improve performance.
3. Conduct extensive experiments on challenging benchmarks.
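
The online hard sample mining in contribution 2 can be sketched roughly as follows: within each mini-batch, rank the samples by their forward-pass loss and back-propagate only through the hardest ones. The function name `select_hard_samples` is hypothetical, and the default `keep_ratio=0.7` follows the 70% figure reported in the paper, which these notes do not state; treat it as a tunable assumption.

```python
def select_hard_samples(losses, keep_ratio=0.7):
    """Return the indices of the highest-loss samples in a mini-batch.

    losses: per-sample loss values from the forward pass.
    keep_ratio: fraction of the batch kept as hard samples (assumed 0.7).
    """
    if not losses:
        return []
    # Always keep at least one sample so a gradient step is possible.
    num_keep = max(1, round(len(losses) * keep_ratio))
    # Rank sample indices by loss, largest first.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    # Return the kept indices in their original batch order.
    return sorted(order[:num_keep])


batch_losses = [0.1, 0.9, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 0.05]
hard = select_hard_samples(batch_losses)
```

Only the samples in `hard` would contribute to the gradient; the easy samples are skipped for that step.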

Approach

Given an image, we initially resize it to different scales to build an image pyramid, which is the input of the following three-stage cascaded framework.
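
As a rough illustration of the image pyramid, the sketch below computes the list of resize factors. The 12-pixel network input, the minimum face size of 20 pixels, and the 0.709 step between levels are assumptions borrowed from common MTCNN implementations, not values given in these notes; `pyramid_scales` is a hypothetical helper name.

```python
def pyramid_scales(height, width, min_face_size=20, net_input=12, factor=0.709):
    """Return the resize factors for the image pyramid.

    All default values are assumptions, not taken from the notes.
    """
    scales = []
    # The first level maps a face of min_face_size pixels onto one net_input window.
    scale = net_input / min_face_size
    min_side = min(height, width) * scale
    # Keep shrinking until the shorter side no longer fits a single window.
    while min_side >= net_input:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales


scales = pyramid_scales(100, 100)
```

Each factor in `scales` gives one resized copy of the image, and the first-stage network is run on every copy so that faces of different sizes fall into its fixed-size window.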

Stage 1: P-Net (Proposal Network)

[Figure: P-Net architecture]

  1. Use the approach from "Multi-view Face Detection Using Deep Convolutional Neural Networks" to obtain the candidate windows and their bounding box regression vectors.

Stage 2: R-Net (Refine Network)

[Figure: R-Net architecture]

Stage 3: O-Net (Output Network)

[Figure: O-Net architecture]

This stage aims to describe the face in more detail and outputs the positions of five facial landmarks.

Training

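Three tasks are leveraged to train the CNN detectors: face/non-face classification, bounding box regression, and facial landmark localization. These correspond to the outputs of the networks.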
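
A minimal sketch of how the three task losses (face classification, bounding box regression, landmark localization) might be combined into one training objective. The task weights and the mapping from sample type to active tasks follow the paper as far as I recall and are not stated in these notes, so treat both tables as assumptions; the names `TASK_WEIGHTS`, `ACTIVE_TASKS`, and `total_loss` are hypothetical.

```python
# Assumed per-network task weights (not given in the notes).
TASK_WEIGHTS = {
    "P-Net": {"det": 1.0, "box": 0.5, "landmark": 0.5},
    "R-Net": {"det": 1.0, "box": 0.5, "landmark": 0.5},
    "O-Net": {"det": 1.0, "box": 0.5, "landmark": 1.0},
}

# Assumed 0/1 indicator: which tasks each sample type contributes to.
ACTIVE_TASKS = {
    "negative": {"det"},
    "positive": {"det", "box"},
    "part": {"box"},
    "landmark": {"landmark"},
}


def total_loss(samples, net):
    """Weighted sum of per-task losses over a batch.

    samples: list of (sample_type, {"det": x, "box": y, "landmark": z}).
    net: one of the keys of TASK_WEIGHTS.
    """
    weights = TASK_WEIGHTS[net]
    loss = 0.0
    for sample_type, task_losses in samples:
        # Only tasks that are active for this sample type are counted.
        for task in ACTIVE_TASKS[sample_type]:
            loss += weights[task] * task_losses[task]
    return loss


batch = [
    ("negative", {"det": 0.2, "box": 9.0, "landmark": 9.0}),
    ("part", {"det": 9.0, "box": 0.4, "landmark": 9.0}),
]
pnet_loss = total_loss(batch, "P-Net")
```

The large placeholder values (9.0) belong to tasks that are masked out for those sample types, so they do not affect `pnet_loss`.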

Experiments

Annotation

Four different kinds of data annotation:

  1. Negatives: regions whose Intersection-over-Union (IoU) ratio is less than 0.3 to any ground-truth face;
  2. Positives: IoU above 0.65 to a ground-truth face;
  3. Part faces: IoU between 0.4 and 0.65 to a ground-truth face;
  4. Landmark faces: faces labeled with the positions of 5 landmarks.
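
The IoU thresholds above can be sketched as a small labeling routine. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the helper names `iou` and `label_crop` are hypothetical. Crops with an IoU between 0.3 and 0.4 fall into none of the listed categories, so this sketch returns `None` for them.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # The overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_crop(crop, ground_truths):
    """Assign a training label to a crop from its best IoU with any face."""
    best = max((iou(crop, gt) for gt in ground_truths), default=0.0)
    if best < 0.3:
        return "negative"
    if best > 0.65:
        return "positive"
    if best >= 0.4:
        return "part"
    return None  # 0.3 <= IoU < 0.4: not covered by the listed categories


faces = [(0, 0, 10, 10)]
labels = [
    label_crop((0, 0, 10, 10), faces),
    label_crop((20, 20, 30, 30), faces),
    label_crop((0, 0, 10, 5), faces),
    label_crop((0, 0, 10, 3.5), faces),
]
```

Landmark faces are not derived from IoU; they come from a dataset that provides landmark annotations, so they are outside this routine.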

Training data

WIDER FACE and CelebA

Further reading

Basis of the stage 1 model: Multi-view Face Detection Using Deep Convolutional Neural Networks
Multiple CNNs for face detection: A Convolutional Neural Network Cascade for Face Detection
Online hard sample mining: Training Region-Based Object Detectors With Online Hard Example Mining
Landmark comparison baseline: Facial Landmark Detection by Deep Multi-task Learning

Dataset and Benchmark
1. FDDB: FDDB: A Benchmark for Face Detection in Unconstrained Settings
2. WIDER FACE: WIDER FACE: A Face Detection Benchmark
3. AFLW: Annotated Facial Landmarks in the Wild: A Large-scale, Real-world Database for Facial Landmark Localization
4. CelebA: Deep Learning Face Attributes in the Wild
