@mShuaiZhao 2018-01-17T07:17:45.000000Z

SSD

PaperReading TextDetection 2017.11

Introduction
- Eliminates bounding box proposals and the subsequent pixel or feature resampling stage.
  Uses small convolutional filters to predict object categories and offsets in bounding box locations.
The Single Shot Detector (SSD)
- Model
  - Detection is performed on feature maps at several different scales.
  - Convolutional predictors for detection
    For an $m \times n$ feature map with $p$ channels, a small $3 \times 3 \times p$ kernel produces the predictions.
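  - Default boxes and aspect ratios
    At each feature map location, default boxes of several aspect ratios are evaluated.
    For each default box, the model predicts the shape offsets $\Delta(cx,cy,w,h)$ and a confidence for each object category.
    The loss is a weighted sum of the localization loss and the confidence loss.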
    
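    For each default box at a given location, the model computes scores for $c$ classes and 4 offsets relative to the default box. With $k$ default boxes per location, this requires $(c+4) \times k$ filters.
    Applied to an $m \times n$ feature map, these filters produce $(c+4)kmn$ outputs.
    Default boxes are similar to the anchor boxes in Faster R-CNN.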
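As a quick check of the $(c+4)k$ filter count and the $(c+4)kmn$ output count, here is a minimal sketch. The function name `ssd_head_outputs` and the example values (a $38 \times 38$ map, $k=4$, $c=21$) are illustrative assumptions, not taken from these notes.

```python
def ssd_head_outputs(m, n, k, c):
    """Count prediction filters and outputs for one m x n feature map.

    k: number of default boxes per location (assumed example value below)
    c: number of class scores per default box
    """
    # Each default box needs c class scores plus 4 box offsets.
    filters = (c + 4) * k
    # The same filters are applied at every one of the m * n locations.
    outputs = filters * m * n
    return filters, outputs


# Illustrative values only.
filters, outputs = ssd_head_outputs(m=38, n=38, k=4, c=21)
print(filters, outputs)  # 100 144400
```

The count grows linearly in both $k$ and the feature map area, which is why the larger, lower-level maps contribute most of the default boxes.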
Training
Unlike traditional region proposal methods, the ground truth has to be re-expressed to match the form of the detector outputs.
- Matching strategy
  Jaccard overlap
- Training objective
  Weighted sum of the localization loss and the confidence loss
- Choosing scales and aspect ratios for default boxes
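  Which feature map scales to detect on (lower layers capture more fine details of the input objects)
  Feature maps of different scales use default boxes of different sizes
  Within a single feature map, default boxes of different aspect ratios are used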
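- Hard negative mining
  Most default boxes are negatives, which creates a class imbalance problem
  Sort the default boxes by confidence loss and keep the top ones, so that the ratio of negatives to positives is at most 3:1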
- Data augmentation
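  Makes the model more robust to various input object sizes
  Each training sample is one of: the entire input image, a patch with a minimum Jaccard overlap with the objects, or a randomly sampled patch
  The sample is resized to a fixed size and flipped horizontally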
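The Jaccard overlap used for matching and the 3:1 hard negative mining rule can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: boxes are assumed to be in corner format `(xmin, ymin, xmax, ymax)`, and the helper names `jaccard` and `hard_negative_indices` are hypothetical.

```python
def jaccard(box_a, box_b):
    """Jaccard overlap (IoU) of two boxes in (xmin, ymin, xmax, ymax) format (assumed)."""
    # Width and height of the intersection, clamped at zero when boxes are disjoint.
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hard_negative_indices(conf_losses, is_positive, neg_pos_ratio=3):
    """Return indices of the negatives with the highest confidence loss.

    At most neg_pos_ratio negatives are kept per positive default box.
    """
    num_pos = sum(is_positive)
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Hardest negatives first: largest confidence loss at the front.
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    return negatives[: neg_pos_ratio * num_pos]


# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7.
overlap = jaccard((0, 0, 2, 2), (1, 1, 3, 3))

# One positive box, so at most three negatives are kept.
kept = hard_negative_indices(
    conf_losses=[0.1, 0.9, 0.5, 0.7, 0.2, 0.3],
    is_positive=[True, False, False, False, False, False],
)
print(kept)  # [1, 3, 2]
```

Only the selected negatives, together with all positives, would then contribute to the confidence loss.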
Related Work
- Two established classes of methods
  one based on sliding windows
  the other based on region proposals
