@mShuaiZhao
2018-01-04T11:21:00.000000Z
CNN
2017.12
What are localization and detection?
Defining the target label
background
landmark detection
e.g. for face landmark detection, the output can be designed as follows:
one output for the probability that a face is present
the face landmarks ()
pose detection
key points in the person's body
sliding windows
car detection example
First train a classifier on small cropped images that either contain a car or do not.
Then apply sliding windows detection.
sliding windows detection
Slide windows of different sizes across the image and classify whether each small region is the target object.
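The sliding windows procedure can be sketched in plain Python as below. This is a minimal illustration, not the course's implementation: the `classifier` argument, the window sizes, the stride and the threshold are all hypothetical placeholders, and the image is assumed to be a 2D list of pixel values.

```python
def sliding_windows(img_h, img_w, window_sizes, stride):
    """Yield (top, left, size) for every square window that fits in the image."""
    for size in window_sizes:
        for top in range(0, img_h - size + 1, stride):
            for left in range(0, img_w - size + 1, stride):
                yield top, left, size


def detect(image, classifier, window_sizes=(32, 64), stride=16, threshold=0.5):
    """Score every crop with a (hypothetical) classifier; keep windows above threshold."""
    img_h, img_w = len(image), len(image[0])
    hits = []
    for top, left, size in sliding_windows(img_h, img_w, window_sizes, stride):
        # Cut the window out of the image and classify it on its own.
        crop = [row[left:left + size] for row in image[top:top + size]]
        score = classifier(crop)
        if score > threshold:
            hits.append((top, left, size, score))
    return hits
```

Every window is classified independently, so the cost grows with the number of window sizes and with a smaller stride.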
[Sermanet et al, 2014, OverFeat: Integrated recognition, localization and detection using convolutional networks]
Turning FC layer into convolutional layers
The kernel has the same width and height as the input volume.
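A small single-channel sketch of why this works: a fully connected unit is a dot product over the whole input, which is the same as a convolution whose kernel covers the whole input. Applied to a larger image, the same kernel yields one output per window position, which is the convolutional form of sliding windows. The function names and the 3x3 sizes are illustrative assumptions; real networks have many channels and filters.

```python
def fc_unit(patch, weights):
    """One fully connected unit: dot product of a patch with same-shaped weights."""
    return sum(p * w
               for patch_row, weight_row in zip(patch, weights)
               for p, w in zip(patch_row, weight_row))


def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation with a square kernel and stride 1."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


weights = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
patch = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Kernel as large as the input: the convolution output is 1x1 and equals the FC unit.
same_size_output = conv2d_valid(patch, weights)

# Larger 5x5 input: a 3x3 output map, one value per 3x3 window position.
big_image = [[r * 5 + c for c in range(5)] for r in range(5)]
window_scores = conv2d_valid(big_image, weights)
```

Here `same_size_output[0][0]` equals `fc_unit(patch, weights)`, and each entry of `window_scores` equals the FC unit applied to the corresponding 3x3 window, computed in one pass instead of one forward pass per window.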
Convolution Implementation of sliding windows
Output accurate bounding boxes
YOLO algorithm
[Redmon et al, 2015, You Only Look Once: Unified real-time object detection]
place a grid on the image: divide the image into grid cells
e.g. divide a 100x100 image into a 3x3 grid of 9 cells
Labels for training
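The final network output then has shape 3x3x8.
An object is assigned to the grid cell that contains its center.
A finer grid can be used to lower the probability that the centers of several objects fall into the same grid cell.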
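The label construction for the 3x3x8 example can be sketched as follows. The notes do not spell out the 8 entries per cell; this sketch assumes the layout `[pc, bx, by, bh, bw, c1, c2, c3]` (object present, center relative to the cell, height and width in units of the cell size, three one-hot classes). The function name and the pixel-based input format are hypothetical.

```python
def encode_labels(objects, img_size=100, grid=3, num_classes=3):
    """Build a grid x grid x (5 + num_classes) target.

    objects: list of (cx, cy, w, h, class_id) in pixels, (cx, cy) = box center.
    """
    cell = img_size / grid
    depth = 5 + num_classes
    target = [[[0.0] * depth for _ in range(grid)] for _ in range(grid)]
    for cx, cy, w, h, class_id in objects:
        # The grid cell containing the center is responsible for the object.
        col = min(int(cx // cell), grid - 1)
        row = min(int(cy // cell), grid - 1)
        vec = target[row][col]
        vec[0] = 1.0                # pc: an object is present in this cell
        vec[1] = cx / cell - col    # bx: center x within the cell, in [0, 1)
        vec[2] = cy / cell - row    # by: center y within the cell, in [0, 1)
        vec[3] = h / cell           # bh: height in cell units, may exceed 1
        vec[4] = w / cell           # bw: width in cell units, may exceed 1
        vec[5 + class_id] = 1.0     # one-hot class
    return target


# One 40x20 object of class 1 centered in a 100x100 image lands in the middle cell.
labels = encode_labels([(50, 50, 40, 20, 1)])
```

If two objects have centers in the same cell, the later one overwrites the earlier one in this sketch, which is the limitation a finer grid makes less likely.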
Specify the bounding boxes
Evaluating object localization
IOU = (size of intersection) / (size of union)
The union is the total area covered by the two boxes; the intersection is the area they share.
Usually, the answer is counted as right if IOU ≥ 0.5.
question
What if the ground-truth box completely encloses the predicted box?
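A minimal sketch of the IOU computation, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` corner coordinates (the box format is an assumption; the notes do not fix one). It also lets the enclosing case be checked directly.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x1, y1, x2, y2), x1 < x2 and y1 < y2."""
    # Corners of the overlap rectangle; it is empty if the boxes do not overlap.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 4, 4)
predicted = (1, 1, 3, 3)          # fully inside the ground-truth box
enclosed_iou = iou(ground_truth, predicted)
```

When one box encloses the other, the intersection is the inner box and the union is the outer box, so the IOU is simply the ratio of the two areas (4 / 16 here). A prediction that is much smaller than the ground truth therefore still scores low.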
you may find multiple detections of the same object
Non-max suppression example
clean up the redundant detections
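Pick the bounding box with the highest confidence, discard every bounding box whose IOU with it exceeds a threshold, and repeat on the boxes that remain; what is left at the end is the final detection result.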
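The procedure can be sketched as a greedy loop. The detection format `(score, (x1, y1, x2, y2))` and the default threshold of 0.5 are assumptions made for illustration; in a multi-class detector this would typically be run once per class.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (score, box); returns kept detections, best first."""
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)   # highest remaining confidence
        kept.append(best)
        # Drop everything that overlaps the chosen box too much.
        remaining = [d for d in remaining
                     if iou(best[1], d[1]) <= iou_threshold]
    return kept


detections = [
    (0.9, (0, 0, 10, 10)),
    (0.8, (1, 1, 11, 11)),     # heavy overlap with the 0.9 box
    (0.7, (20, 20, 30, 30)),   # a separate object
]
final = non_max_suppression(detections)
```

In this example the 0.8 box overlaps the 0.9 box with an IOU of about 0.68 and is removed, while the 0.7 box does not overlap it and is kept.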
One grid cell can only detect one object
overlapping objects
predefined different shapes called anchor boxes
Anchor box algorithm
Training
Prediction
NMS
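For the training step of the anchor box algorithm, one common way to decide which anchor an object belongs to is to pick the anchor whose shape has the highest IOU with the object's box. The notes do not give this rule explicitly, so treat the sketch below as an assumption; the anchor shapes and function names are hypothetical, and sizes are `(width, height)` pairs in arbitrary but consistent units.

```python
def shape_iou(wh_a, wh_b):
    """IOU of two boxes that share the same center, given only (width, height)."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def best_anchor(box_wh, anchors):
    """Index of the anchor whose shape matches the box best."""
    return max(range(len(anchors)), key=lambda i: shape_iou(box_wh, anchors[i]))


# Two hypothetical anchors: a tall one (e.g. pedestrian) and a wide one (e.g. car).
anchors = [(1.0, 2.0), (2.0, 1.0)]
tall_object = best_anchor((0.9, 2.1), anchors)
wide_object = best_anchor((3.0, 1.2), anchors)
```

With two anchors, two overlapping objects whose centers fall in the same grid cell can be written into different anchor slots of that cell's label, as long as their shapes differ enough to select different anchors.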
Regions with CNN
segmentation algorithm
Select a number of plausible candidate regions.
a little bit slow