You are watching: Perceptual losses for real-time style transfer and super-resolution
This record proposes the usage of perceptual lose functions for maintain feed-forward networks for image revolution tasks, instead of utilizing per-pixel lose functions.
Per-pixel loss functions?Comparing 2 images based on their separation, personal, instance pixel values.So, if 2 images, that space perceptually the same, however different from each other based upon even one pixel, then based upon per-pixel loss features they will be very different from each other.
Perceptual loss functions?Comparing two images based upon high-level representations from pretrained Convolutional Neural Networks (trained top top Image group tasks, speak the ImageNet Dataset).
They evaluate their strategy on two image transformation tasks:(i) format Transfer(ii) Single-Image at sight Resolution
For style transfer, castle train feed-forward networks that try to solve the optimization difficulty proposed through Gatys et al. 2015.
For at sight resolution, lock experiment v using perceptual losses, and show the it gets far better results than utilizing per-pixel loss functions.
The suggest model architecture is composed of two components:(i) Image change Network (f_w) (ii) loss Network (Φ)
Image transformation Network
The Image revolution Network is a deep residual Convolutional Neural Network i beg your pardon is trained to solve the optimization problem proposed by Gatys.
Given an input picture (x) this network transforms it right into the output picture (ŷ).
The weights the this network (W) is learnt using losses calculated using the output photo (ŷ) and comparing castle with:- the depictions of the format image (y_s) and content image (y_c), in instance of style transfer- just the content picture y_c, in situation of supervisor resolution.
The Image revolution Network is trained using Stochastic Gradient descent to acquire weights (W) that minimize the weighted amount of every the ns functions.
See more: Why Is My Mic Not Working In Overwatch Voice Chat Not Working Ps4
The ns Network(Φ) is a pretrained VGG16 top top the ImageNet Dataset.
The lose network is provided to obtain content and also style representations from the content and style images:(i) The content representation are taken from the great `relu3_3`. <Fig. 2>(ii) The layout representations room taken indigenous the class `relu1_2`, `relu2_2`, `relu3_3`and `relu4_3`. <Fig. 2>
These depictions are supplied to specify two species of losses:
Feature reconstruction Loss through the output picture (ŷ) and also the content depiction from the great `relu3_3` and using the following loss role in the image