【TensorFlow Analysis】-【4】

This article takes a close look at the ResNet residual-network implementation in TensorFlow: its call chain and core components, including the ImagenetModel class, the Model class's __call__ method, the convolution and pooling layers, and the construction of the block layers.


Let's walk through a TensorFlow project: the residual network (ResNet) structure.

resnet_model in tensorflow

The call chain of resnet_model:

1. official/resnet/imagenet_main.py:

The ImagenetModel class inherits from the Model class in official/resnet/resnet_model.py and initializes it via the parent __init__ constructor.

The parameters it passes:

    super(ImagenetModel, self).__init__(                                           
        resnet_size=resnet_size,                                                   
        bottleneck=bottleneck,                                                     
        num_classes=num_classes,                                                   
        num_filters=64,                                                            
        kernel_size=7,                                                             
        conv_stride=2,                                                             
        first_pool_size=3,                                                         
        first_pool_stride=2,                                                       
        second_pool_size=7,                                                        
        second_pool_stride=1,                                                      
        block_sizes=_get_block_sizes(resnet_size),                                 
        block_strides=[1, 2, 2, 2],                                                
        final_size=final_size,                                                     
        version=version,                                                           
        data_format=data_format,                                                   
        dtype=dtype                                                                
    )

2. The Model() class in official/resnet/resnet_model.py:

It defines the special method __call__(self, inputs, training), so a class instance can be called directly to build the network.
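
To make the call chain concrete, here is a minimal usage sketch (my own, not from the original post; the placeholder shape follows the log output later in this article):

    import tensorflow as tf
    from official.resnet.imagenet_main import ImagenetModel

    # A batch of preprocessed ImageNet images, channels_first layout.
    images = tf.placeholder(tf.float32, shape=[None, 3, 224, 224])

    model = ImagenetModel(resnet_size=50, data_format='channels_first')
    logits = model(images, training=True)  # invokes Model.__call__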

3. The __call__() method of the Model() class

  • 3.1 Everything runs inside variable_scope('resnet_model'):

    with self._model_variable_scope():

  • 3.2 The first 7x7 convolution and pooling layers. As described in the residual network paper, the model begins with a 7x7 convolution followed by a max-pooling layer (see figure). The following code implements the first convolution:
    inputs = conv2d_fixed_padding(
        inputs=inputs, filters=self.num_filters, kernel_size=self.kernel_size,
        strides=self.conv_stride, data_format=self.data_format)
    inputs = tf.identity(inputs, 'initial_conv')

The __init__ parameters correspond to the values in the paper:

self.num_filters = 64,
self.kernel_size = 7,
self.conv_stride = 2,

The tf.identity() call inserts a named pass-through op ('initial_conv') into the graph, so this tensor can later be located and inspected by name.
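
A tiny standalone snippet (my own, not from the post) showing what the named identity op buys you:

    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[None, 64, 112, 112])
    y = tf.identity(x, name='initial_conv')  # same values, new named node
    print(y.name)  # initial_conv:0

    # The tensor can later be fetched by name, e.g. for debugging/logging:
    graph = tf.get_default_graph()
    same_y = graph.get_tensor_by_name('initial_conv:0')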
Next comes the pooling layer:

        inputs = tf.layers.max_pooling2d(
            inputs=inputs, pool_size=self.first_pool_size,
            strides=self.first_pool_stride, padding='SAME',
            data_format=self.data_format)
        inputs = tf.identity(inputs, 'initial_max_pool')

Again, the __init__ parameters correspond to the values in the paper (a quick shape check follows):

first_pool_size = 3,
first_pool_stride = 2,
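
To see why 224x224 inputs end up 56x56 after these two layers: with SAME-style padding the output spatial size is ceil(input / stride), regardless of kernel size (conv2d_fixed_padding pads explicitly for strides > 1, but the result matches). A quick check, my own sketch:

    import math

    def same_padding_out(size, stride):
        # Output spatial size under SAME-style padding.
        return math.ceil(size / stride)

    size = 224
    size = same_padding_out(size, 2)  # 7x7 conv, stride 2     -> 112
    size = same_padding_out(size, 2)  # 3x3 max pool, stride 2 -> 56
    print(size)  # 56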
  • 3.3 Building the stack of blocks.
    As the paper shows, the body of the residual network is a sequence of convolutional blocks, each holding a small stack of convolutions (two in the basic block, three in the bottleneck block used here) and ending in a shortcut connection. The first group has 3 blocks (hence three shortcuts), the second group has 4, the third 6, and the fourth 3 (see the figure and the sketch below).

[Figure: the residual network architecture from the paper]
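
The block counts come from _get_block_sizes(resnet_size) in imagenet_main.py, which maps the requested depth to the number of blocks per group. A simplified sketch of that mapping (the official version also raises a descriptive error for unsupported sizes); [3, 4, 6, 3] is the ResNet-50 configuration traced in this article:

    def _get_block_sizes(resnet_size):
      # Number of blocks in each of the four block layers, keyed by depth.
      choices = {
          18: [2, 2, 2, 2],
          34: [3, 4, 6, 3],
          50: [3, 4, 6, 3],
          101: [3, 4, 23, 3],
          152: [3, 8, 36, 3],
          200: [3, 24, 36, 3],
      }
      return choices[resnet_size]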

In the code, the first block_fn of each block_layer is computed separately from the rest, because only the first one applies projection_shortcut(). projection_shortcut() runs a single convolution (kernel size 1x1, 4x the block's filter count, and the block layer's stride).

  def projection_shortcut(inputs):
    return conv2d_fixed_padding(
        inputs=inputs, filters=filters_out, kernel_size=1, strides=strides,
        data_format=data_format)

  # Only the first block per block_layer uses projection_shortcut and strides
  inputs = block_fn(inputs, filters, training, projection_shortcut, strides,
                    data_format)

block_fn() here is _bottleneck_block_v1(), shown below. When projection_shortcut() is applied, the projected (and batch-normalized) shortcut is added to the block output; when it is not, the unmodified input is added back (an identity shortcut). The paper lists two kinds of blocks, an identity type and a bottleneck type; this code implements a "bottleneck" block. See figure.

def _bottleneck_block_v1(inputs, filters, training, projection_shortcut,
                         strides, data_format):
  """A single block for ResNet v1, with a bottleneck.

  Similar to _building_block_v1(), except using the "bottleneck" blocks
  described in:
    Convolution then batch normalization then ReLU as described by:
      Deep Residual Learning for Image Recognition
      https://arxiv.org/pdf/1512.03385.pdf
      by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Dec 2015.

  Args:
    inputs: A tensor of size [batch, channels, height_in, width_in] or
      [batch, height_in, width_in, channels] depending on data_format.
    filters: The number of filters for the convolutions.
    training: A Boolean for whether the model is in training or inference
      mode. Needed for batch normalization.
    projection_shortcut: The function to use for projection shortcuts
      (typically a 1x1 convolution when downsampling the input).
    strides: The block's stride. If greater than 1, this block will ultimately
      downsample the input.
    data_format: The input format ('channels_last' or 'channels_first').

  Returns:
    The output tensor of the block; shape should match inputs.
  """
  shortcut = inputs

  if projection_shortcut is not None:
    shortcut = projection_shortcut(inputs)
    shortcut = batch_norm(inputs=shortcut, training=training,
                          data_format=data_format)

  inputs = conv2d_fixed_padding(
      inputs=inputs, filters=filters, kernel_size=1, strides=1,
      data_format=data_format)
  inputs = batch_norm(inputs, training, data_format)
  inputs = tf.nn.relu(inputs)

  inputs = conv2d_fixed_padding(
      inputs=inputs, filters=filters, kernel_size=3, strides=strides,
      data_format=data_format)
  inputs = batch_norm(inputs, training, data_format)
  inputs = tf.nn.relu(inputs)

  inputs = conv2d_fixed_padding(
      inputs=inputs, filters=4 * filters, kernel_size=1, strides=1,
      data_format=data_format)
  inputs = batch_norm(inputs, training, data_format)
  inputs += shortcut
  inputs = tf.nn.relu(inputs)

  return inputs

Comparing the code with the figure below confirms that it performs one "bottleneck" block.

[Figure: the bottleneck block from the paper]
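
For comparison, the identity-style block mentioned above (_building_block_v1 in the same file) uses two 3x3 convolutions and no 4x channel expansion. Condensed from the official source, docstring omitted:

    def _building_block_v1(inputs, filters, training, projection_shortcut,
                           strides, data_format):
      shortcut = inputs

      if projection_shortcut is not None:
        shortcut = projection_shortcut(inputs)
        shortcut = batch_norm(inputs=shortcut, training=training,
                              data_format=data_format)

      inputs = conv2d_fixed_padding(
          inputs=inputs, filters=filters, kernel_size=3, strides=strides,
          data_format=data_format)
      inputs = batch_norm(inputs, training, data_format)
      inputs = tf.nn.relu(inputs)

      inputs = conv2d_fixed_padding(
          inputs=inputs, filters=filters, kernel_size=3, strides=1,
          data_format=data_format)
      inputs = batch_norm(inputs, training, data_format)
      inputs += shortcut
      inputs = tf.nn.relu(inputs)

      return inputs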

As the paper explains, within a block layer the input and output dimensions match (a solid-line shortcut in the paper's figure), but where two block layers meet, the dimensions change (a dashed-line shortcut). Therefore the first block_fn() of every block_layer() performs a projection_shortcut() with the configured stride (stride 1 for the first block_layer; stride 2 for the second, third, and fourth), while the remaining block_fn()s skip the projection and all use stride 1, as the code below shows. Keep this in mind; it is what makes the later dimension bookkeeping come out right.

  # Only the first block per block_layer uses projection_shortcut and strides
  inputs = block_fn(inputs, filters, training, projection_shortcut, strides,
                    data_format)

  for _ in range(1, blocks):
    inputs = block_fn(inputs, filters, training, None, 1, data_format)

The code above is each block_layer's computation flow: the first block_fn computes a projection_shortcut, the rest do not. In the first figure, each color-coded group of layers is one block_layer, and each row inside a group is one block_fn() (the full helper is sketched below).
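
For context, the surrounding block_layer() helper (condensed from resnet_model.py) is what wires up filters_out, the projection shortcut, and the loop:

    def block_layer(inputs, filters, bottleneck, block_fn, blocks, strides,
                    training, name, data_format):
      # Bottleneck blocks end with 4x the number of filters they start with.
      filters_out = filters * 4 if bottleneck else filters

      def projection_shortcut(inputs):
        return conv2d_fixed_padding(
            inputs=inputs, filters=filters_out, kernel_size=1, strides=strides,
            data_format=data_format)

      # Only the first block per block_layer uses projection_shortcut and strides
      inputs = block_fn(inputs, filters, training, projection_shortcut, strides,
                        data_format)

      for _ in range(1, blocks):
        inputs = block_fn(inputs, filters, training, None, 1, data_format)

      return tf.identity(inputs, name)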

With that, the main structure of the residual network is complete. Now trace how the tensor shapes evolve.
Input:

shape=(?, 3, 224, 224)

After the first convolution:

shape=(?, 64, 112, 112)

After the first pooling:

shape=(?, 64, 56, 56)

Next come the block (shortcut) stages:

lin:shortcut--------->  Tensor("resnet_model/batch_normalization/FusedBatchNorm:0", shape=(?, 256, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu:0", shape=(?, 64, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_1:0", shape=(?, 64, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_3/FusedBatchNorm:0", shape=(?, 256, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d1--------->  Tensor("resnet_model/Relu_3:0", shape=(?, 64, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_4:0", shape=(?, 64, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_6/FusedBatchNorm:0", shape=(?, 256, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 1 sub_layer--------->  Tensor("resnet_model/Relu_5:0", shape=(?, 256, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_6:0", shape=(?, 64, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_7:0", shape=(?, 64, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_9/FusedBatchNorm:0", shape=(?, 256, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 2 sub_layer--------->  Tensor("resnet_model/Relu_8:0", shape=(?, 256, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:the 0 blk_layer--------->  Tensor("resnet_model/block_layer1:0", shape=(?, 256, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:shortcut--------->  Tensor("resnet_model/batch_normalization_10/FusedBatchNorm:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_9:0", shape=(?, 128, 56, 56), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_10:0", shape=(?, 128, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 2
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_13/FusedBatchNorm:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d1--------->  Tensor("resnet_model/Relu_12:0", shape=(?, 128, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_13:0", shape=(?, 128, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_16/FusedBatchNorm:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 1 sub_layer--------->  Tensor("resnet_model/Relu_14:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_15:0", shape=(?, 128, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_16:0", shape=(?, 128, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_19/FusedBatchNorm:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 2 sub_layer--------->  Tensor("resnet_model/Relu_17:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_18:0", shape=(?, 128, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_19:0", shape=(?, 128, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_22/FusedBatchNorm:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 3 sub_layer--------->  Tensor("resnet_model/Relu_20:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:the 1 blk_layer--------->  Tensor("resnet_model/block_layer2:0", shape=(?, 512, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:shortcut--------->  Tensor("resnet_model/batch_normalization_23/FusedBatchNorm:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_21:0", shape=(?, 256, 28, 28), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_22:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 2
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_26/FusedBatchNorm:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d1--------->  Tensor("resnet_model/Relu_24:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_25:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_29/FusedBatchNorm:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 1 sub_layer--------->  Tensor("resnet_model/Relu_26:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_27:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_28:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_32/FusedBatchNorm:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 2 sub_layer--------->  Tensor("resnet_model/Relu_29:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_30:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_31:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_35/FusedBatchNorm:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 3 sub_layer--------->  Tensor("resnet_model/Relu_32:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_33:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_34:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_38/FusedBatchNorm:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 4 sub_layer--------->  Tensor("resnet_model/Relu_35:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_36:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_37:0", shape=(?, 256, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_41/FusedBatchNorm:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 5 sub_layer--------->  Tensor("resnet_model/Relu_38:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:the 2 blk_layer--------->  Tensor("resnet_model/block_layer3:0", shape=(?, 1024, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:shortcut--------->  Tensor("resnet_model/batch_normalization_42/FusedBatchNorm:0", shape=(?, 2048, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_39:0", shape=(?, 512, 14, 14), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_40:0", shape=(?, 512, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 2
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_45/FusedBatchNorm:0", shape=(?, 2048, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d1--------->  Tensor("resnet_model/Relu_42:0", shape=(?, 512, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_43:0", shape=(?, 512, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_48/FusedBatchNorm:0", shape=(?, 2048, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 1 sub_layer--------->  Tensor("resnet_model/Relu_44:0", shape=(?, 2048, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:conv2d1--------->  Tensor("resnet_model/Relu_45:0", shape=(?, 512, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:conv2d2--------->  Tensor("resnet_model/Relu_46:0", shape=(?, 512, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 3 strides: 1
lin:conv2d3--------->  Tensor("resnet_model/batch_normalization_51/FusedBatchNorm:0", shape=(?, 2048, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0) Kernel size: 1 strides: 1
lin:the 2 sub_layer--------->  Tensor("resnet_model/Relu_47:0", shape=(?, 2048, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0)
lin:the 3 blk_layer--------->  Tensor("resnet_model/block_layer4:0", shape=(?, 2048, 7, 7), dtype=float32, device=/replica:0/task:0/device:GPU:0)

The feature-map size shrinks from 56 to 28 to 14 to 7, while the channel count doubles at each stage: 256, 512, 1024, 2048. A quick sanity check follows.
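
These numbers follow directly from the strides and the 4x bottleneck expansion; a small verification, my own sketch:

    import math

    def resnet50_block_layer_shapes(size=224, num_filters=64,
                                    block_strides=(1, 2, 2, 2)):
        size = math.ceil(size / 2)  # 7x7 conv, stride 2:     224 -> 112
        size = math.ceil(size / 2)  # 3x3 max pool, stride 2: 112 -> 56
        shapes = []
        for i, stride in enumerate(block_strides):
            filters = num_filters * (2 ** i)          # 64, 128, 256, 512
            size = math.ceil(size / stride)           # 56, 28, 14, 7
            shapes.append((4 * filters, size, size))  # bottleneck: 4x filters
        return shapes

    print(resnet50_block_layer_shapes())
    # [(256, 56, 56), (512, 28, 28), (1024, 14, 14), (2048, 7, 7)]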
