Project Summary 4: Neural Style Transfer (Art Generation with Neural Style Transfer)



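This article introduces neural style transfer, which merges a content image with a style image to generate a new artistic image. A pretrained VGG19 network extracts the features, content and style cost functions are defined on those features, and the Adam optimizer iteratively minimizes the total cost.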


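1. Project overview

Neural Style Transfer (NST) is one of the most interesting techniques in deep learning. It merges two images, a content image C and a style image S, to produce a generated image G that combines the content of C with the style of S.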

2. Model

Following the transfer learning approach, the model uses a pretrained VGG19 network. The pretrained weights come from MatConvNet: http://www.vlfeat.org/matconvnet/pretrained/ . The model structure is as follows:

(1) Example diagram of the model structure (image not included)

(2) Structure of the VGG19 network used in this project

{'input': <tf.Variable 'Variable:0' shape=(1, 300, 400, 3) dtype=float32_ref>,
 'conv1_1': <tf.Tensor 'Relu:0' shape=(1, 300, 400, 64) dtype=float32>, 
 'conv1_2': <tf.Tensor 'Relu_1:0' shape=(1, 300, 400, 64) dtype=float32>,
 'avgpool1': <tf.Tensor 'AvgPool:0' shape=(1, 150, 200, 64) dtype=float32>,
 'conv2_1': <tf.Tensor 'Relu_2:0' shape=(1, 150, 200, 128) dtype=float32>, 
 'conv2_2': <tf.Tensor 'Relu_3:0' shape=(1, 150, 200, 128) dtype=float32>, 
 'avgpool2': <tf.Tensor 'AvgPool_1:0' shape=(1, 75, 100, 128) dtype=float32>, 
 'conv3_1': <tf.Tensor 'Relu_4:0' shape=(1, 75, 100, 256) dtype=float32>, 
 'conv3_2': <tf.Tensor 'Relu_5:0' shape=(1, 75, 100, 256) dtype=float32>, 
 'conv3_3': <tf.Tensor 'Relu_6:0' shape=(1, 75, 100, 256) dtype=float32>, 
 'conv3_4': <tf.Tensor 'Relu_7:0' shape=(1, 75, 100, 256) dtype=float32>,
 'avgpool3': <tf.Tensor 'AvgPool_2:0' shape=(1, 38, 50, 256) dtype=float32>,
 'conv4_1': <tf.Tensor 'Relu_8:0' shape=(1, 38, 50, 512) dtype=float32>, 
 'conv4_2': <tf.Tensor 'Relu_9:0' shape=(1, 38, 50, 512) dtype=float32>, 
 'conv4_3': <tf.Tensor 'Relu_10:0' shape=(1, 38, 50, 512) dtype=float32>, 
 'conv4_4': <tf.Tensor 'Relu_11:0' shape=(1, 38, 50, 512) dtype=float32>, 
 'avgpool4': <tf.Tensor 'AvgPool_3:0' shape=(1, 19, 25, 512) dtype=float32>, 
 'conv5_1': <tf.Tensor 'Relu_12:0' shape=(1, 19, 25, 512) dtype=float32>, 
 'conv5_2': <tf.Tensor 'Relu_13:0' shape=(1, 19, 25, 512) dtype=float32>,
 'conv5_3': <tf.Tensor 'Relu_14:0' shape=(1, 19, 25, 512) dtype=float32>,
 'conv5_4': <tf.Tensor 'Relu_15:0' shape=(1, 19, 25, 512) dtype=float32>,
 'avgpool5': <tf.Tensor 'AvgPool_4:0' shape=(1, 10, 13, 512) dtype=float32>}
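
The spatial sizes in the listing halve at every pooling layer and round up (75 becomes 38, 19 becomes 10). That pattern is consistent with stride-2 pooling using 'SAME' padding, where the output size is ceil(n / 2); the padding mode is an assumption here, since the listing shows only shapes. A small sketch that reproduces the pooled shapes, with the helper names `pooled_shape` and `vgg19_pool_shapes` made up for illustration:

```python
import math

def pooled_shape(height, width, stride=2):
    """Output size of a pooling layer with 'SAME' padding: ceil(n / stride)."""
    return math.ceil(height / stride), math.ceil(width / stride)

# Channel counts of the five blocks, read off the listing above.
CHANNELS = [64, 128, 256, 512, 512]

def vgg19_pool_shapes(height=300, width=400):
    """Shapes after avgpool1..avgpool5 for a (1, height, width, 3) input."""
    shapes = []
    for n_channels in CHANNELS:
        height, width = pooled_shape(height, width)
        shapes.append((1, height, width, n_channels))
    return shapes

shapes = vgg19_pool_shapes()
for i, shape in enumerate(shapes, start=1):
    print(f"avgpool{i}: {shape}")
```

The convolution layers in the listing keep the spatial size unchanged, so only the pooling layers matter for this arithmetic.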

3. Cost function

(1) Content cost function

First, unroll each 3D activation volume (n_H × n_W × n_C) into a 2D matrix.
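
The unrolling step can be sketched in NumPy as below. The (n_C, n_H * n_W) layout, with one row per channel, is an assumption chosen because it makes the later Gram matrix a single matrix product; the function name `unroll` is made up for illustration.

```python
import numpy as np

def unroll(activations):
    """Reshape (1, n_H, n_W, n_C) activations into an (n_C, n_H * n_W) matrix."""
    _, n_H, n_W, n_C = activations.shape
    # Flatten the spatial positions first, then move channels onto the rows.
    return activations.reshape(n_H * n_W, n_C).T

a = np.arange(1 * 2 * 3 * 4, dtype=np.float32).reshape(1, 2, 3, 4)
A = unroll(a)
print(A.shape)  # (4, 6)
```

Row k of `A` holds every spatial activation of channel k, so `A[k, i * n_W + j]` equals `a[0, i, j, k]`.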

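Next, compute the content cost. When G and C are each fed through the network, similar activations at a chosen hidden layer mean that the two images have similar content.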
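
A common way to write this cost is J_content(C, G) = 1 / (4 · n_H · n_W · n_C) · Σ (a(C) − a(G))², summed over all entries of the chosen layer's activations. The text does not state the normalization constant, so the factor used below is an assumption; the sketch uses NumPy arrays in place of TensorFlow tensors.

```python
import numpy as np

def content_cost(a_C, a_G):
    """Squared-difference content cost between two (1, n_H, n_W, n_C) activations."""
    _, n_H, n_W, n_C = a_G.shape
    # Assumed normalization: 1 / (4 * n_H * n_W * n_C).
    return np.sum((a_C - a_G) ** 2) / (4.0 * n_H * n_W * n_C)

rng = np.random.default_rng(0)
a_C = rng.normal(size=(1, 4, 4, 3))
a_G = rng.normal(size=(1, 4, 4, 3))
print(content_cost(a_C, a_C))      # 0.0
print(content_cost(a_C, a_G) > 0)  # True
```

Identical activations give a cost of zero, and any difference makes the cost positive, which is what the optimizer drives down.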

(2) Style cost function

First, compute the Gram matrix of one layer's activations.
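
For an unrolled activation matrix A of shape (n_C, n_H * n_W), the Gram matrix is A·Aᵀ: entry (i, j) is the dot product of channel i and channel j over all spatial positions, so it measures how strongly the two channels fire together. A minimal NumPy sketch, assuming that row-per-channel layout:

```python
import numpy as np

def gram_matrix(A):
    """Gram matrix of an (n_C, n_H * n_W) activation matrix; result is (n_C, n_C)."""
    return A @ A.T

# Three channels, two spatial positions.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [0.0, 1.0]])
G = gram_matrix(A)
print(G)
```

The diagonal entries measure how active each channel is overall, and the matrix is always symmetric.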

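Next, compute the style cost. When G and S are each fed through the network, if the correlations between channel activations at a given layer (the Gram matrices) are close for the two images, then the two images have similar style.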
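
For a single layer, a common formulation is J_style(S, G) = 1 / (4 · n_C² · (n_H · n_W)²) · Σ (Gram(S) − Gram(G))², summed over all entries of the two Gram matrices. The normalization constant is an assumption, since the text does not give it. The sketch below uses NumPy and also shows that the cost ignores spatial layout: flipping the activations spatially leaves the Gram matrix unchanged.

```python
import numpy as np

def layer_style_cost(a_S, a_G):
    """Style cost for one layer from two (1, n_H, n_W, n_C) activation volumes."""
    _, n_H, n_W, n_C = a_G.shape
    # Unroll to (n_C, n_H * n_W) so each row holds one channel.
    A_S = a_S.reshape(n_H * n_W, n_C).T
    A_G = a_G.reshape(n_H * n_W, n_C).T
    # Gram matrices capture channel-to-channel correlations.
    gram_S = A_S @ A_S.T
    gram_G = A_G @ A_G.T
    # Assumed normalization: 1 / (4 * n_C^2 * (n_H * n_W)^2).
    return np.sum((gram_S - gram_G) ** 2) / (4.0 * n_C ** 2 * (n_H * n_W) ** 2)

rng = np.random.default_rng(1)
a_S = rng.normal(size=(1, 4, 4, 3))
a_other = rng.normal(size=(1, 4, 4, 3))
a_flipped = a_S[:, ::-1, ::-1, :]  # same activations, different positions

print(layer_style_cost(a_S, a_other))
print(layer_style_cost(a_S, a_flipped))
```

The second value is zero up to floating-point rounding, which is why this cost captures texture and style rather than where objects sit in the image.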

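In practice, combining the style cost from several layers gives better results than using a single layer; the overall style cost is a weighted sum of the per-layer style costs.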
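
The weighted sum can be sketched as follows. The choice of layers and the equal weights of 0.2 are hypothetical values for illustration (the layer names do appear in the VGG19 listing, but the text does not say which layers or weights were used), and the per-layer costs are made-up numbers that only show the arithmetic.

```python
# Hypothetical (layer name, weight) pairs; the text does not give these values.
STYLE_LAYERS = [
    ("conv1_1", 0.2),
    ("conv2_1", 0.2),
    ("conv3_1", 0.2),
    ("conv4_1", 0.2),
    ("conv5_1", 0.2),
]

def total_style_cost(layer_costs, style_layers=STYLE_LAYERS):
    """Weighted sum of per-layer style costs, looked up by layer name."""
    return sum(weight * layer_costs[name] for name, weight in style_layers)

# Made-up per-layer costs, only to show the arithmetic.
example_costs = {"conv1_1": 5.0, "conv2_1": 4.0, "conv3_1": 3.0,
                 "conv4_1": 2.0, "conv5_1": 1.0}
J_style = total_style_cost(example_costs)
print(J_style)
```

Shallow layers tend to capture fine textures and deeper layers larger patterns, so shifting weight between them changes which aspects of the style dominate.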

Combining the content cost and the style cost gives the total cost function J.
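
In the usual formulation the combination is a weighted sum. The weights α and β are hyperparameters that the text does not specify, so they are left symbolic here; a larger α keeps G closer to the content of C, and a larger β pushes G further toward the style of S.

```latex
J(G) = \alpha \, J_{\text{content}}(C, G) + \beta \, J_{\text{style}}(S, G)
```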

4. Optimization algorithm and training objective

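import tensorflow as tf  # TensorFlow 1.x API

# Adam optimizer with learning rate 2.0
optimizer = tf.train.AdamOptimizer(2.0)

# One training step: minimize the total cost J by updating the trainable
# input variable (the generated image); J is assumed to be defined earlier
train_step = optimizer.minimize(J)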
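
In NST the optimizer updates the pixels of the generated image rather than the network weights: in the VGG19 listing only `input` is a `tf.Variable`, while every layer is a fixed tensor. The sketch below is a hand-written NumPy version of the Adam update applied directly to an image, not the TensorFlow implementation. The toy quadratic cost standing in for J, the `target` image, and the step count are all assumptions for illustration; the learning rate 2.0 comes from the snippet above and the remaining values are the usual Adam defaults.

```python
import numpy as np

def adam_minimize(grad_fn, x0, lr=2.0, steps=500,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a cost over x with Adam, given a function returning dJ/dx."""
    x = x0.astype(np.float64)  # astype returns a copy, so x0 is not modified
    m = np.zeros_like(x)       # running mean of gradients
    v = np.zeros_like(x)       # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Toy stand-in for J: squared distance to a fixed target "image".
target = np.full((4, 4, 3), 100.0)

def cost(x):
    return np.sum((x - target) ** 2)

def grad(x):
    return 2.0 * (x - target)

x0 = np.zeros((4, 4, 3))
x_final = adam_minimize(grad, x0)
print(cost(x0), cost(x_final))
```

Because Adam normalizes each step by the gradient's running magnitude, early steps move every pixel by roughly the learning rate, which is why a value as large as 2.0 is workable on pixel-scale inputs.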

5. Input and output data

Input data: content_image, style_image, generated_image (the initial image to be optimized)
Output data: generated_image

6. Summary

Neural Style Transfer is an algorithm that, given a content image C and a style image S, can generate an artistic image.
It uses representations (hidden layer activations) based on a pretrained ConvNet.
The content cost function is computed using one hidden layer's activations.
The style cost function for one layer is computed using the Gram matrix of that layer's activations. The overall style cost function is obtained using several hidden layers.
Optimizing the total cost function results in synthesizing new images.