DeepLab-v3+

Latest recommended article published on 2023-08-01 16:42:53

只星若晨

License: CC 4.0 BY-SA

Column: Deeplab series. Tags: convolutional networks, deep learning, convolutional neural networks, computer vision

Article link: https://blog.csdn.net/wei2013303336/article/details/118733383

I. Problems to Solve

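Consecutive downsampling and repeated pooling in the network leave the feature maps at low resolution.
Spatial invariance causes a large amount of detail to be lost.
Objects appear at multiple scales.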

II. Contributions

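An encoder-decoder structure is introduced: DeepLab v3 serves as the encoder, and a decoder is added as the output-processing stage to refine object boundaries.
Xception and depthwise separable convolution are introduced and applied to both the ASPP module and the decoder.
Xception is modified: every max-pooling layer is replaced by a depthwise separable convolution with stride=2.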
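
Both the stride-2 replacement for max-pooling and the atrous convolutions used in ASPP can be checked with the standard output-size formula for convolution layers. The following is a minimal sketch in plain Python; the helper name `conv_out_size`, the 3x3 kernel with padding 1, and the rates 6, 12 and 18 are illustrative assumptions rather than values taken from this post.

```python
def conv_out_size(n, kernel=3, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution (floor mode) for an input of size n."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (n + 2 * padding - effective_kernel) // stride + 1


# Assumed setting: a 3x3 depthwise convolution with stride 2 and padding 1
# halves the resolution, which is the downsampling a pooling layer would provide.
for size in (64, 128, 256):
    print("stride-2 conv:", size, "->", conv_out_size(size, kernel=3, stride=2, padding=1))

# Assumed setting: a 3x3 atrous convolution with padding equal to its rate
# keeps the resolution unchanged, whatever the rate is.
for rate in (6, 12, 18):
    print("rate", rate, ":", 33, "->", conv_out_size(33, kernel=3, padding=rate, dilation=rate))
```

The difference from pooling is that the strided convolution has learnable weights, while the spatial reduction is the same.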

III. Details

1. Encoder-Decoder Structure

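Figure (a) shows the DeepLab v3 network structure, figure (b) shows a typical encoder-decoder network, and figure (c) shows the network proposed in the paper: an encoder-decoder segmentation network that uses v3 as the encoder. The decoder takes two inputs: the ASPP output and an intermediate feature map from the encoder (an intermediate ResNet output).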

The detailed network structure is shown in the figure below:
[Figure: detailed DeepLab v3+ network structure]

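from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.functional as F

# Note: _ConvBnReLU, _ImagePool, _Stem and _ResLayer are helper modules
# that are not listed in this post.


class _ASPP(nn.Module):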
    """
    Atrous spatial pyramid pooling with image-level feature
    """

    def __init__(self, in_ch, out_ch, rates):
        super(_ASPP, self).__init__()
        self.stages = nn.Module()
        self.stages.add_module("c0", _ConvBnReLU(in_ch, out_ch, 1, 1, 0, 1))
        for i, rate in enumerate(rates):
            self.stages.add_module(
                "c{}".format(i + 1),
                _ConvBnReLU(in_ch, out_ch, 3, 1, padding=rate, dilation=rate),
            )
        self.stages.add_module("imagepool", _ImagePool(in_ch, out_ch))

    def forward(self, x):
        return torch.cat([stage(x) for stage in self.stages.children()], dim=1)


class DeepLabV3Plus(nn.Module):
    """
    DeepLab v3+: Dilated ResNet with multi-grid + improved ASPP + decoder
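                 The dilated ResNet with multi-grid and the improved ASPP are the same as in DeepLab v3.
                 The decoder applies two convolutions followed by 4x upsampling.
                 Note: the output of ResNet layer2 has its channel count reduced (256 -> 48) before it enters the decoder.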
    """

    def __init__(self, n_classes, n_blocks, atrous_rates, multi_grids, output_stride):
        super(DeepLabV3Plus, self).__init__()

        # Stride and dilation
        if output_stride == 8:
            s = [1, 2, 1, 1]
            d = [1, 1, 2, 4]
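        elif output_stride == 16:
            s = [1, 2, 2, 1]
            d = [1, 1, 1, 2]
        else:
            # Without this branch, s and d would be undefined below.
            raise ValueError("output_stride must be 8 or 16")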

        # Encoder
        ch = [64 * 2 ** p for p in range(6)]
        self.layer1 = _Stem(ch[0])
        self.layer2 = _ResLayer(n_blocks[0], ch[0], ch[2], s[0], d[0])
        self.layer3 = _ResLayer(n_blocks[1], ch[2], ch[3], s[1], d[1])
        self.layer4 = _ResLayer(n_blocks[2], ch[3], ch[4], s[2], d[2])
        self.layer5 = _ResLayer(n_blocks[3], ch[4], ch[5], s[3], d[3], multi_grids)
        self.aspp = _ASPP(ch[5], 256, atrous_rates)
        concat_ch = 256 * (len(atrous_rates) + 2)
        self.add_module("fc1", _ConvBnReLU(concat_ch, 256, 1, 1, 0, 1))

        # Decoder
        self.reduce = _ConvBnReLU(256, 48, 1, 1, 0, 1)
        self.fc2 = nn.Sequential(
            OrderedDict(
                [
                    ("conv1", _ConvBnReLU(304, 256, 3, 1, 1, 1)),
                    ("conv2", _ConvBnReLU(256, 256, 3, 1, 1, 1)),
                    ("conv3", nn.Conv2d(256, n_classes, kernel_size=1)),
                ]
            )
        )

    def forward(self, x):
        h = self.layer1(x)
        h = self.layer2(h)
        h_ = self.reduce(h)
        h = self.layer3(h)
        h = self.layer4(h)
        h = self.layer5(h)
        h = self.aspp(h)
        h = self.fc1(h)
        h = F.interpolate(h, size=h_.shape[2:], mode="bilinear", align_corners=False)
        h = torch.cat((h, h_), dim=1)
        h = self.fc2(h)
        h = F.interpolate(h, size=x.shape[2:], mode="bilinear", align_corners=False)
        return h
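
The strides and channel counts in the listing above can be checked by hand. The sketch below is plain Python bookkeeping, not part of the model: `trace_shapes` is a hypothetical helper, the stem is assumed to downsample by 4 (the post does not show `_Stem`), three atrous rates are assumed, and the input size is assumed to be divisible by the output stride so that integer division gives the feature-map size.

```python
def trace_shapes(input_size, output_stride, n_rates=3, stem_stride=4):
    """Trace strides and channel counts through the DeepLabV3Plus listing.

    stem_stride=4 and n_rates=3 are assumptions made for illustration.
    """
    # Per-layer strides copied from the "Stride and dilation" branch of the listing.
    strides = {8: [1, 2, 1, 1], 16: [1, 2, 2, 1]}[output_stride]

    # Cumulative stride after layer2, layer3, layer4 and layer5.
    total = stem_stride
    cumulative = []
    for s in strides:
        total *= s
        cumulative.append(total)

    low_level_size = input_size // cumulative[0]  # layer2 output, fed to self.reduce
    encoder_size = input_size // cumulative[-1]   # layer5 / ASPP output

    return {
        "encoder_stride": cumulative[-1],
        "low_level_size": low_level_size,
        "encoder_size": encoder_size,
        # 1x1 branch + one branch per atrous rate + image-pooling branch, 256 channels each.
        "aspp_concat_ch": 256 * (n_rates + 2),
        # fc1 output (256) concatenated with the reduced layer2 output (48).
        "decoder_in_ch": 256 + 48,
        # Factor of the first F.interpolate call in forward().
        "first_upsample": low_level_size // encoder_size,
    }


for os_ in (8, 16):
    print(os_, trace_shapes(512, os_))
```

Under these assumptions, a 512x512 input with output_stride=16 gives a 32x32 encoder output and a 128x128 low-level feature map, so the first interpolation upsamples by 4x, and the 256 + 48 = 304 concatenated channels match the input width of conv1 in fc2.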

2. Depthwise Separable Convolution (Atrous)

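DeepLab-v3+ also introduces the Xception architecture and depthwise separable convolution. Depthwise separable convolution splits a standard convolution into two steps: a depthwise convolution and a pointwise convolution. First, a K×K kernel is applied to each channel separately, which leaves the number of channels unchanged; then a 1×1 convolution fuses the resulting features and increases or decreases the number of channels.
For a standard convolution with kernel size K, the number of parameters (ignoring biases) is:
$C_{in} \times K \times K \times C_{out}$
With depthwise separable convolution, the number of parameters is:
$C_{in} \times K \times K + C_{in} \times C_{out} \times 1 \times 1$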
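
As a worked example of these two counts, the following sketch evaluates them for one layer size. It is plain Python arithmetic with biases ignored; the function names and the choice of 256 input channels, 256 output channels and a 3×3 kernel are assumptions made for illustration. The pointwise term counts one 1×1 filter over all input channels for every output channel, that is, C_in × C_out weights.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (no bias)."""
    return c_in * k * k * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights of a depthwise separable convolution (no bias)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


c_in, c_out, k = 256, 256, 3  # illustrative layer size
standard = standard_conv_params(c_in, c_out, k)
separable = separable_conv_params(c_in, c_out, k)
print("standard: ", standard)
print("separable:", separable)
print("reduction: {:.2f}x".format(standard / separable))
```

For this layer size the separable version needs 67,840 weights instead of 589,824, a reduction of a little under 9x.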

A PyTorch implementation of depthwise separable convolution:

class Sep_conv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(Sep_conv, self).__init__()
        self.depth_conv = nn.Conv2d(
            in_channels=in_ch,
            out_channels=in_ch,
            kernel_size=3,
            stride=1,
            padding=1,
            groups=in_ch  # each channel forms its own group
        )
        self.point_conv = nn.Conv2d(
            in_channels=in_ch,
            out_channels=out_ch,
            kernel_size=1,
            stride=1,
            padding=0,
            groups=1
        )

    def forward(self, input):
        out = self.depth_conv(input)
        out = self.point_conv(out)
        return out

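In addition, grouped convolution splits the feature map into n groups along the channel dimension and applies a standard convolution within each group, so the number of parameters of a grouped convolution is:
$\frac{C_{in} \times K \times K \times C_{out}}{n}$. In PyTorch, this only requires setting the groups argument of a standard convolution.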

class Group_conv(nn.Module):
    def __init__(self, in_ch, out_ch, groups):
        super(Group_conv, self).__init__()
        self.conv = nn.Conv2d(
            in_channels=in_ch,
            out_channels=out_ch,
            kernel_size=3,
            stride=1,
            padding=1,
            groups=groups
        )

    def forward(self, input):
        out = self.conv(input)
        return out
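
The effect of the groups argument on the parameter count can be sketched in plain Python. `grouped_conv_params` is a hypothetical helper, biases are ignored, and the channel counts are illustrative. The divisibility check mirrors the requirement of `nn.Conv2d` that both channel counts be divisible by `groups`.

```python
def grouped_conv_params(c_in, c_out, k, groups):
    """Weights of a k x k grouped convolution (no bias)."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("c_in and c_out must both be divisible by groups")
    # Each output channel only sees the c_in // groups input channels of its group.
    return (c_in // groups) * k * k * c_out


c_in, c_out, k = 64, 128, 3  # illustrative layer size
for groups in (1, 4, 64):
    print("groups =", groups, "->", grouped_conv_params(c_in, c_out, k, groups))
```

With groups=1 this is a standard convolution, and with groups equal to the number of input channels each filter sees a single channel, which is the depthwise case used in Sep_conv.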