@mShuaiZhao 2018-01-04T11:21:00.000000Z

week03.Ng's CNN Course

CNN 2017.12

Detection Algorithms

1. Object Localization

What are localization and detection?
- Image Classification
- classification with localization
- Detection
Defining the target label $y$
1. pedestrian
2. car
3. motorcycle
4. background
  - Defining the label amounts to defining the network's final output.

2. Landmark Detection

landmark detection
- e.g. for facial landmark detection, the output can be designed as follows:
  
  one output for the probability that a face is present
  the facial landmarks ( $l_{1x},l_{1y},l_{2x},l_{2y},l_{3x},l_{3y},l_{4x},l_{4y},\ldots,l_{64x},l_{64y}$ )
- pose detection
  
  key points in the person's body

3. Object Detection

sliding windows

car detection example

First train a classifier on small cropped images that either contain a car or do not.
Then apply sliding windows detection.
sliding windows detection

Slide windows of different sizes across the image and classify each patch as containing the target object or not.
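The sliding windows procedure can be sketched in a few lines. This is a minimal illustration, not code from the course: the names `sliding_windows` and `detect`, the square windows, the `stride` parameter, and the `classifier` callable (any function mapping a crop to a score) are all assumptions made for the example.

```python
def sliding_windows(img_h, img_w, win_sizes, stride):
    """Yield (top, left, size) for every square window that fits in the image."""
    for size in win_sizes:
        for top in range(0, img_h - size + 1, stride):
            for left in range(0, img_w - size + 1, stride):
                yield top, left, size


def detect(image, classifier, win_sizes, stride, threshold=0.5):
    """Score every window with the crop classifier; keep windows above threshold.

    image is a list of rows (a 2-D grayscale image); classifier is a
    hypothetical callable that returns a score for a crop.
    """
    img_h, img_w = len(image), len(image[0])
    hits = []
    for top, left, size in sliding_windows(img_h, img_w, win_sizes, stride):
        crop = [row[left:left + size] for row in image[top:top + size]]
        score = classifier(crop)
        if score > threshold:
            hits.append((top, left, size, score))
    return hits
```

Each window is classified independently, so the cost grows with the number of window sizes and positions. That cost is what the convolutional implementation in the next section reduces.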

4. Convolutional Implementation of Sliding Windows

[Sermanet et al, 2014, OverFeat: Integrated recognition, localization and detection using convolutional networks]

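Turning FC layers into convolutional layers

The kernel has the same width and height as the input volume.
Convolutional implementation of sliding windows
- First train a classifier for one fixed input size, with the fully connected layers implemented as convolutions. The network can then accept input images of any size; only the number of outputs changes.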
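A small sketch of why this works, under simplifying assumptions: one channel, one fully connected unit, stride 1, and made-up weights. The functions `fc` and `conv_valid` are hypothetical helpers written for this example. A fully connected unit over a 2x2 input is the same computation as a 2x2 convolution kernel, so applying that kernel to a larger image yields one classifier output per window position.

```python
def fc(crop, weights, bias):
    """Fully connected unit: weighted sum over every pixel of a fixed-size crop."""
    return bias + sum(
        crop[i][j] * weights[i][j]
        for i in range(len(weights))
        for j in range(len(weights[0]))
    )


def conv_valid(image, kernel, bias):
    """'Valid' 2-D convolution with stride 1 (cross-correlation, no kernel flip)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    out = [[bias] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            for i in range(k_h):
                for j in range(k_w):
                    out[y][x] += image[y + i][x + j] * kernel[i][j]
    return out


weights = [[1, 2], [3, 4]]  # made-up weights of a classifier for 2x2 inputs
bias = 0.5
small = [[1, 0], [0, 1]]                     # the input size used for training
large = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]    # a larger test image

print(conv_valid(small, weights, bias))  # a single output: [[5.5]]
print(conv_valid(large, weights, bias))  # a 2x2 grid, one output per window
```

On the larger image, each entry of the output grid equals `fc` applied to the corresponding 2x2 crop, but the overlapping computation is shared in one pass instead of being repeated per window.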

5. Bounding Box Predictions

Output accurate bounding boxes
YOLO algorithm

[Redmon et al, 2015, You Only Look Once: Unified real-time object detection]
- place a grid on the image, dividing it into grid cells
  
  e.g. divide a 100x100 image into a 3x3 grid of 9 cells
- Labels for training
  
  $y = \left[p_c \ b_x \ b_y \ b_h \ b_w \ c_1 \ c_2 \ c_3 \right]$
  
  The network's final output then has shape 3x3x8.
- The object is assigned to the grid cell that contains its center $\left[b_x \ b_y\right]$
- A finer grid makes it less likely that the centers of several objects fall into the same cell
Specify the bounding boxes
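As an illustration of the label layout, here is a minimal sketch that builds the 3x3x8 target for a single object. Several details are assumptions not stated in these notes: the helper name `encode_label`, boxes given as pixel `(cx, cy, h, w)`, zero-based class indices (0 pedestrian, 1 car, 2 motorcycle), and the convention that $b_x, b_y$ are offsets within the cell in [0, 1) while $b_h, b_w$ are measured in units of the cell size and may exceed 1.

```python
def encode_label(box, class_id, img_size=100, grid=3, num_classes=3):
    """Build a grid x grid x (5 + num_classes) label for one object.

    box is (cx, cy, h, w) in pixels; class_id is a zero-based class index.
    Returns the label and the (row, col) of the responsible grid cell.
    """
    cell = img_size / grid
    cx, cy, h, w = box
    # The cell containing the box center is responsible for the object.
    col = min(int(cx // cell), grid - 1)
    row = min(int(cy // cell), grid - 1)
    # Every other cell keeps p_c = 0 (its remaining entries are "don't care").
    label = [[[0.0] * (5 + num_classes) for _ in range(grid)] for _ in range(grid)]
    one_hot = [1.0 if k == class_id else 0.0 for k in range(num_classes)]
    # Layout: [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]
    label[row][col] = [1.0, cx / cell - col, cy / cell - row, h / cell, w / cell] + one_hot
    return label, (row, col)


label, (row, col) = encode_label((50, 50, 40, 20), class_id=1)
print(row, col)         # the center cell of the 3x3 grid: 1 1
print(label[row][col])  # p_c = 1, then the box and the one-hot class
```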

6. Intersection Over Union (IOU)

Evaluating object localization
- IOU = (size of intersection) / (size of union)
  
  The union is the total area covered by either box; the intersection is the area shared by both.
  Usually, the answer is judged correct if $IOU>0.5$.
- question
  
  What if the ground-truth box completely encloses the predicted box?
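A minimal IOU sketch also answers the question above. The function name `iou` and the corner format `(x1, y1, x2, y2)` are assumptions for this example (the labels earlier use center and size). When one box encloses the other, the intersection is the inner box and the union is the outer box, so IOU reduces to the ratio of the two areas.

```python
def iou(box_a, box_b):
    """IOU of two axis-aligned boxes given as (x1, y1, x2, y2) corners."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


ground_truth = (0, 0, 10, 10)  # area 100
predicted = (2, 2, 6, 6)       # area 16, fully inside the ground truth
print(iou(ground_truth, predicted))  # 16 / 100 = 0.16
```

With the 0.5 threshold, an enclosed prediction therefore counts as correct only if it covers more than half of the ground-truth area.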

7. Non-max Suppression

you may find multiple detections of the same objects
Non-max suppression example

clean up the redundant detections

Select the bounding box with the highest confidence, then discard every bounding box whose IOU with it exceeds a threshold; what remains is the final set of detections.
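The procedure can be sketched as a greedy loop. This is an illustrative version under assumptions: the names `iou` and `non_max_suppression`, boxes as `(x1, y1, x2, y2)` corners, detections as `(score, box)` pairs, and a default threshold of 0.5. The select-and-discard step is repeated on the surviving boxes until none are left, which is how the step is usually applied when an image holds several objects.

```python
def iou(box_a, box_b):
    """IOU of two axis-aligned boxes given as (x1, y1, x2, y2) corners."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def non_max_suppression(detections, iou_threshold=0.5):
    """detections: list of (score, box). Returns the detections that survive."""
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # highest-confidence box still in play
        kept.append(best)
        # Drop every box that overlaps the chosen one too strongly.
        remaining = [d for d in remaining if iou(best[1], d[1]) <= iou_threshold]
    return kept


detections = [
    (0.8, (1, 1, 11, 11)),    # overlaps the 0.9 box heavily
    (0.9, (0, 0, 10, 10)),
    (0.7, (20, 20, 30, 30)),  # a separate object
]
print(non_max_suppression(detections))
```

In the example, the 0.8 box has an IOU of about 0.68 with the 0.9 box and is suppressed, while the distant 0.7 box is kept.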

8. Anchor Boxes

One grid cell can only detect one object
overlapping objects

predefined different shapes called anchor boxes
Anchor box algorithm

[Figure: anchor_box.png]
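The notes do not spell out the assignment rule, so the following is only a sketch of the commonly described one: an object goes to the grid cell containing its center and, within that cell, to the anchor box whose shape has the highest IOU with the object's shape. The helper names `shape_iou` and `best_anchor` and the two anchor shapes are made up for illustration. With two anchors and the 8-value label used earlier, each cell would carry 2 x 8 = 16 outputs.

```python
def shape_iou(h1, w1, h2, w2):
    """IOU of two boxes sharing the same center, so only their shapes matter."""
    inter = min(h1, h2) * min(w1, w2)
    return inter / (h1 * w1 + h2 * w2 - inter)


def best_anchor(obj_h, obj_w, anchors):
    """Return the index of the anchor whose shape best matches the object."""
    scores = [shape_iou(obj_h, obj_w, a_h, a_w) for a_h, a_w in anchors]
    return scores.index(max(scores))


# Hypothetical anchors as (height, width): one tall, one wide.
anchors = [(2.0, 1.0), (1.0, 2.0)]
print(best_anchor(1.8, 0.6, anchors))  # a tall, pedestrian-like box -> 0
print(best_anchor(0.8, 1.9, anchors))  # a wide, car-like box -> 1
```

Two overlapping objects whose centers share a cell can then be stored in different anchor slots of that cell, provided their shapes differ enough.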

9. YOLO algorithm

Training
Prediction
NMS

10. Region proposal: R-CNN

Regions with CNN
- segmentation algorithm
  select a set of reasonably plausible candidate regions.
- a little bit slow
