@haoqiang 2018-08-24T04:29:04.000000Z

Work Summary

MINIVISION -- HaoQiang


1. Outline


Watermarks Removal

1. Background

For our face recognition task on ID card images, watermarks are the most challenging problem: they occlude the face and degrade image quality.

Pipeline:

2. pix2pix

Image-to-Image Translation with Conditional Adversarial Networks

Model

Generator (UNet)

Discriminator

Loss Function

Content Loss (L1)

Adversarial Loss

Component Loss
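
As a rough illustration of how the content and adversarial terms can be combined into one generator objective, here is a minimal NumPy sketch. It assumes the heading refers to the pix2pix-style sum of the adversarial loss and a weighted L1 loss. The function names are hypothetical, and `l1_weight=100` is the default from the pix2pix paper, not a value taken from these notes.

```python
import numpy as np


def sigmoid_cross_entropy(logits, labels):
    """Numerically stable binary cross-entropy on raw logits, averaged over all elements."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    loss = np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))
    return float(loss.mean())


def generator_loss(fake_logits, generated, target, l1_weight=100.0):
    """Return (total, adversarial, content) for the generator.

    fake_logits: discriminator logits for generated images.
    l1_weight:   hypothetical weight of the L1 term (assumed, see lead-in).
    """
    fake_logits = np.asarray(fake_logits, dtype=np.float64)
    # The generator wants the discriminator to label its outputs as real (1).
    adversarial = sigmoid_cross_entropy(fake_logits, np.ones_like(fake_logits))
    # Content loss: mean absolute pixel difference to the clean target.
    diff = np.asarray(generated, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    content = float(np.mean(np.abs(diff)))
    return adversarial + l1_weight * content, adversarial, content
```

The L1 term keeps the output close to the watermark-free target, while the adversarial term pushes it toward sharper, more realistic detail.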

3. Model Compression

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

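The calculation of standard conv:

$$D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F$$

The calculation of depthwise separable conv:

$$D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F$$

Calculation reduction:

$$\frac{D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F}{D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F} = \frac{1}{N} + \frac{1}{D_K^2}$$

Here $D_K$ is the kernel size, $M$ and $N$ are the numbers of input and output channels, and $D_F$ is the feature map size (notation from the MobileNets paper).

If the depthwise conv's kernel size is set to 3 x 3 ($D_K = 3$), it uses 8 to 9 times less computation than standard conv.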
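
The 8 to 9 times figure can be checked numerically. The sketch below follows the cost formulas from the MobileNets paper, with $D_K$ the kernel size, $M$ and $N$ the input and output channel counts, and $D_F$ the feature map size. The layer sizes in the loop are made-up examples, not layers from our model.

```python
def standard_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-adds of a standard conv: D_K * D_K * M * N * D_F * D_F."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def separable_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-adds of a depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = kernel * kernel * in_ch * fmap * fmap  # D_K * D_K * M * D_F * D_F
    pointwise = in_ch * out_ch * fmap * fmap           # M * N * D_F * D_F
    return depthwise + pointwise


def reduction_ratio(kernel, in_ch, out_ch, fmap):
    """Separable cost divided by standard cost; equals 1/N + 1/D_K^2."""
    return separable_conv_cost(kernel, in_ch, out_ch, fmap) / standard_conv_cost(
        kernel, in_ch, out_ch, fmap
    )


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only to illustrate the trend.
    for out_ch in (128, 256, 512):
        ratio = reduction_ratio(3, 128, out_ch, 14)
        print(f"N={out_ch}: ratio={ratio:.4f}, speedup={1 / ratio:.2f}x")
```

The ratio depends only on $N$ and $D_K$, so the saving approaches 9 times as the number of output channels grows.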

Model                Model Size   Speed (CPU)
Before compression   122.6 MB     112 ms
After compression    0.63 MB      37 ms

Result

Deployment

Process
Train model and save weights --> Define testing graph and load weights --> Freeze graph and export '.pb' file --> Run inference with the OpenCV dnn module in C++

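  // Assumes #include <opencv2/dnn.hpp> and `using namespace cv;`
  dnn::Net net = dnn::readNetFromTensorflow(model);  // load the frozen '.pb' graph
  Mat inputBlob = dnn::blobFromImage(img);           // convert the image to a 4-D blob
  net.setInput(inputBlob);                           // feed the blob to the network
  Mat output = net.forward("generator/tanh");        // run up to the generator's tanh layer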
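
Because the network ends in a tanh layer, its output lies in [-1, 1] and has to be mapped back to pixel values before it can be saved or displayed. The NumPy sketch below shows one way to do that. It assumes the forward pass returns a single blob of shape (1, C, H, W) and that training images were scaled to [-1, 1], as pix2pix implementations usually do. The function name is hypothetical.

```python
import numpy as np


def tanh_blob_to_image(blob):
    """Convert a (1, C, H, W) blob with values in [-1, 1] to an (H, W, C) uint8 image."""
    blob = np.asarray(blob, dtype=np.float32)
    if blob.ndim != 4 or blob.shape[0] != 1:
        raise ValueError("expected a blob of shape (1, C, H, W)")
    hwc = np.transpose(blob[0], (1, 2, 0))  # CHW -> HWC
    scaled = (hwc + 1.0) * 127.5            # [-1, 1] -> [0, 255]
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)
```

The same arithmetic carries over to the C++ side, for example by reshaping the blob and applying the scale and offset there.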

Notice:


Super Resolution

1. Background

Because images lose resolution and quality after compression encoding, we need to recover clearer images to augment the data and improve model performance.

2. Models

pix2pix

Generator

Discriminator (Patch GAN)

SRGAN

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Is the deconvolution layer the same as a convolutional layer?

tf.depth_to_space(x, block_size=2)
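
To make the depth-to-space (sub-pixel) upsampling step concrete, here is a minimal NumPy sketch of the rearrangement for NHWC tensors. It is meant to mirror what the TensorFlow op does, but it is an illustration written for these notes, not TensorFlow's own code.

```python
import numpy as np


def depth_to_space(x, block_size):
    """Rearrange (N, H, W, C * r * r) into (N, H * r, W * r, C), NHWC layout."""
    x = np.asarray(x)
    n, h, w, c = x.shape
    r = block_size
    if c % (r * r) != 0:
        raise ValueError("channel count must be divisible by block_size ** 2")
    out_c = c // (r * r)
    x = x.reshape(n, h, w, r, r, out_c)  # split channels into an r x r block
    x = x.transpose(0, 1, 3, 2, 4, 5)    # (N, H, r, W, r, C)
    return x.reshape(n, h * r, w * r, out_c)
```

With `block_size=2`, every group of four channels becomes a 2 x 2 spatial block, so height and width double while the channel count drops by a factor of four.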

Loss Function

Content Loss (perceptual)
Perceptual Losses for Real-Time Style Transfer and Super-Resolution


$\phi_j$ is the feature map of VGG19 layer $j$.
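
The perceptual content loss compares feature maps instead of raw pixels. The sketch below assumes the squared-error form from the cited paper, $\frac{1}{C_j H_j W_j} \lVert \phi_j(\hat{y}) - \phi_j(y) \rVert_2^2$, where $\phi_j$ is the feature map of layer $j$, $\hat{y}$ the generated image and $y$ the target. The feature maps are passed in as arrays, since extracting them from VGG19 is outside the scope of this example. The function name is hypothetical.

```python
import numpy as np


def feature_loss(feat_generated, feat_target):
    """Mean squared difference between two feature maps of identical shape.

    For inputs of shape (N, H, W, C) this is the squared L2 distance divided
    by C * H * W, averaged over the batch.
    """
    a = np.asarray(feat_generated, dtype=np.float64)
    b = np.asarray(feat_target, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("feature maps must have the same shape")
    return float(np.mean((a - b) ** 2))
```

In training, both arguments would come from the same fixed VGG19 layer applied to the super-resolved image and to the high-resolution ground truth.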

Component Loss

Result
