
svellaichamy3/Style_Transfer


Introduction

Wouldn't it be amazing to have Picasso or Van Gogh paint your beautiful neighbourhood in their own style? Deep learning helps us do that! We take an image and apply the style of a reference style image to it, giving it a new look. This experiment is inspired by "Image Style Transfer Using Convolutional Neural Networks" (Gatys et al., CVPR 2016).

Idea

The general idea is to take two images (a content image and a style image) and produce a new image that reflects the content of one but the artistic "style" of the other. We do this by first formulating a loss function that matches the content and style of each respective image in the feature space of a deep network, and then performing gradient descent on the pixels of the image itself. In this project, we use SqueezeNet as our feature extractor.

We can generate an image that reflects the content of one image and the style of another by incorporating both in our loss function. We want to penalize deviations from the content of the content image and deviations from the style of the style image. We can then use this hybrid loss function to perform gradient descent not on the parameters of the model, but instead on the pixel values of our original image.
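Concretely, this is a minimal sketch of that optimization loop, assuming a PyTorch setup (which this README does not specify); loss_fn is a hypothetical stand-in for the combined content + style + total variation loss described in the sections below:

    import torch

    def run_style_transfer(img, loss_fn, num_steps=200, lr=0.1):
        # The image itself is the "parameter" being optimized; the network weights stay fixed.
        img = img.clone().requires_grad_(True)
        optimizer = torch.optim.Adam([img], lr=lr)
        for _ in range(num_steps):
            optimizer.zero_grad()
            loss = loss_fn(img)   # combined content + style + total variation loss
            loss.backward()       # gradients with respect to the pixel values
            optimizer.step()      # update the pixels, not the CNN
        return img.detach()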

Content Loss

Content loss measures how much the feature map of the generated image differs from the feature map of the source image. We only care about the content representation of one layer of the network (say, layer l), which has feature maps A^l ∈ R^{1 × C_l × H_l × W_l}, where C_l is the number of channels in layer l and H_l and W_l are its height and width. We work with reshaped versions of these feature maps that combine all spatial positions into one dimension. Let F^l ∈ R^{C_l × M_l} be the reshaped feature map of the current (generated) image and P^l ∈ R^{C_l × M_l} be the reshaped feature map of the content source image, where M_l = H_l × W_l is the number of elements in each feature map. Each row of F^l or P^l represents the vectorized activations of a particular filter, convolved over all positions of the image. Finally, let w_c be the weight of the content loss term in the overall loss function. Then the content loss is given by:
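    L_c = w_c \sum_{i,j} (F^l_{ij} - P^l_{ij})^2

A minimal PyTorch-style sketch of this term (PyTorch and the function name and signature are assumptions here; the repository's actual code may differ):

    import torch

    def content_loss(content_weight, content_current, content_original):
        # content_current:  F^l, features of the generated image at layer l, shape (1, C_l, H_l, W_l)
        # content_original: P^l, features of the content source image at layer l, same shape
        # Sum of squared differences over all entries, scaled by w_c.
        return content_weight * torch.sum((content_current - content_original) ** 2)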

Style Loss

Now we can tackle the style loss. For a given layer l, the style loss is defined as follows. First, compute the Gram matrix G^l, which represents the correlations between the responses of each filter, where F^l is as above. The Gram matrix is an approximation to the covariance matrix: we want the activation statistics of our generated image to match the activation statistics of our style image, and matching the (approximate) covariance is one way to do that. There are a variety of ways to do this, but the Gram matrix is nice because it is easy to compute and works well in practice. Given a feature map F^l of shape (1, C_l, M_l), the Gram matrix has shape (1, C_l, C_l) and its elements are given by:
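    G^l_{ij} = \sum_k F^l_{ik} F^l_{jk}

A sketch of this computation (assuming PyTorch; the optional normalization by the number of neurons is a common choice and an assumption here, not something stated above):

    import torch

    def gram_matrix(features, normalize=True):
        # features: feature map of shape (1, C_l, H_l, W_l)
        N, C, H, W = features.shape
        F = features.reshape(N, C, H * W)        # reshape to (1, C_l, M_l)
        G = torch.bmm(F, F.transpose(1, 2))      # (1, C_l, C_l); G_ij = sum_k F_ik F_jk
        if normalize:
            G = G / (C * H * W)                  # assumed normalization to keep layer scales comparable
        return G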

Assuming G^l is the Gram matrix of the feature map of the current image, A^l is the Gram matrix of the feature map of the source style image, and w_l is a scalar weight term, the style loss for layer l is simply the weighted Euclidean distance between the two Gram matrices:
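    L_s^{(l)} = w_l \sum_{i,j} (G^l_{ij} - A^l_{ij})^2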

In practice we usually compute the style loss at a set of layers L rather than just a single layer l; then the total style loss is the sum of style losses at each layer:
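    L_s = \sum_{l \in \mathcal{L}} L_s^{(l)} = \sum_{l \in \mathcal{L}} w_l \sum_{i,j} (G^l_{ij} - A^l_{ij})^2

A sketch of the summed style loss (assuming PyTorch and reusing the gram_matrix sketch above; the names are illustrative, not necessarily those used in this repository):

    import torch

    def style_loss(feats, style_layers, style_targets, style_weights):
        # feats:         feature maps of the generated image, one per network layer
        # style_layers:  indices of the layers used for the style loss
        # style_targets: precomputed Gram matrices A^l of the style image, one per style layer
        # style_weights: scalar weights w_l, one per style layer
        loss = 0.0
        for i, layer in enumerate(style_layers):
            G = gram_matrix(feats[layer])        # G^l of the generated image
            loss = loss + style_weights[i] * torch.sum((G - style_targets[i]) ** 2)
        return loss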

Total Variation Loss

It turns out that it is also helpful to encourage smoothness in the generated image. We can do this by adding another term to our loss that penalizes wiggles, or "total variation", in the pixel values. This concept is widely used in many computer vision tasks as a regularization term. We compute the total variation as the sum of squared differences between pixel values for all pairs of pixels that are next to each other (horizontally or vertically). Here we sum the total-variation regularization over each of the 3 input channels (RGB) and weight the summed loss by the total variation weight, w_t:
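Writing x_{i,j,c} for the pixel value at row i, column j, channel c:

    L_{tv} = w_t \sum_{c=1}^{3} \sum_{i,j} [ (x_{i,j+1,c} - x_{i,j,c})^2 + (x_{i+1,j,c} - x_{i,j,c})^2 ]

A short PyTorch-style sketch of this term (names illustrative):

    import torch

    def tv_loss(img, tv_weight):
        # img: generated image of shape (1, 3, H, W)
        horizontal = torch.sum((img[:, :, :, 1:] - img[:, :, :, :-1]) ** 2)  # neighbours along width
        vertical   = torch.sum((img[:, :, 1:, :] - img[:, :, :-1, :]) ** 2)  # neighbours along height
        return tv_weight * (horizontal + vertical)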

Results

It was a wonderfully fun experiment in which we captured the style of one image and transferred it to the content of another. Here are a few more results:
