PyTorch (3) -- VGG Block and ResNet Block

1. Introduction

    This post records two classic building blocks: the VGG block and the ResNet block.

2. VGG block

    A VGG block stacks two or three 3x3 convolutions. Two stacked 3x3 convolutions cover the same receptive field as a single 5x5 convolution, and three stacked 3x3 convolutions cover the same receptive field as a single 7x7, while using fewer parameters and less computation (and adding extra non-linearities). Different VGG block configurations are combined to build the VGG family of networks.
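As a rough sanity check on the savings, here is a small sketch (assuming C input and C output channels throughout; biases ignored):

C = 64
two_3x3   = 2 * (3 * 3 * C * C)  # 73,728 weights, receptive field 5x5
one_5x5   = 5 * 5 * C * C        # 102,400 weights
three_3x3 = 3 * (3 * 3 * C * C)  # 110,592 weights, receptive field 7x7
one_7x7   = 7 * 7 * C * C        # 204,800 weights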

The code below implements a VGG block with two 3x3 convolutions: two Conv-BN-ReLU (CBR) stages and no pooling layer.

import torch.nn as nn


class VGGBlock(nn.Module):
    def __init__(self, in_channels, middle_channels, out_channels, act_func=nn.ReLU(inplace=True)):
        super(VGGBlock, self).__init__()
        self.act_func = act_func
        # Two 3x3 convolutions with padding=1, so the spatial size is unchanged.
        self.conv1 = nn.Conv2d(in_channels, middle_channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(middle_channels)
        self.conv2 = nn.Conv2d(middle_channels, out_channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.act_func(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.act_func(out)

        return out
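A minimal shape check (the channel sizes here are illustrative, not from the original post):

import torch

block = VGGBlock(in_channels=3, middle_channels=64, out_channels=64)
x = torch.randn(1, 3, 224, 224)
print(block(x).shape)  # torch.Size([1, 64, 224, 224]) -- padding=1 preserves spatial size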

3. ResNet block

    ResNet (He et al., "Deep Residual Learning for Image Recognition", arXiv:1512.03385) uses two kinds of residual blocks: the BasicBlock and the Bottleneck.

(figures: the BasicBlock and Bottleneck residual structures)

The five ResNet variants of different depths are summarized in the standard table from the paper:

(figure: table of ResNet-18/34/50/101/152 layer configurations)
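For reference, the block type and per-stage block counts are as follows (these match the factory functions at the end of this post; the ResNet-34 row is not in the code below but follows the original paper):

Model        Block type   Blocks per stage   Depth
ResNet-18    BasicBlock   [2, 2, 2, 2]       18
ResNet-34    BasicBlock   [3, 4, 6, 3]       34
ResNet-50    Bottleneck   [3, 4, 6, 3]       50
ResNet-101   Bottleneck   [3, 4, 23, 3]      101
ResNet-152   Bottleneck   [3, 8, 36, 3]      152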

# Code adapted from the official PyTorch implementation: https://pytorch.org/docs/0.4.0/_modules/torchvision/models/resnet.html

import math

import torch
import torch.nn as nn
import torch.utils.model_zoo as model_zoo

# Official torchvision checkpoint URLs, referenced by the pretrained branches below.
model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
}


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


class BasicBlock(nn.Module):
    # Output channels = planes * expansion; the BasicBlock does not widen its output.
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        # When stride != 1 or the channel counts differ, the shortcut is projected
        # (1x1 conv + BN) so it can be added to the residual branch.
        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out


class Bottleneck(nn.Module):
    # The 1x1-3x3-1x1 stack first shrinks to `planes` channels, then expands 4x.
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out
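
# Example channel flow through a Bottleneck with planes=64 (a sketch of the
# steady-state case inside a stage, expansion=4):
#   input (256 ch) -> 1x1 conv -> 64 ch -> 3x3 conv -> 64 ch -> 1x1 conv -> 256 ch
# The shortcut is added after the final BN, before the last ReLU.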


class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes=1000):
        self.inplanes = 64
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        # AdaptiveAvgPool2d(1) replaces the original AvgPool2d(7, stride=1) so the
        # network also accepts input sizes other than 224x224.
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        # The first block of a stage may change the spatial size (stride) or the
        # channel count, so the identity shortcut needs a 1x1 projection.
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)

        return x


def resnet18(pretrained=False, num_classes=1000):
    """Constructs a ResNet-18 model.

    Args:
        pretrained (bool): If True, loads ImageNet-pretrained weights
        num_classes (int): size of the replacement classification head
    """
    model = ResNet(BasicBlock, [2, 2, 2, 2])
    if pretrained:
        # Assumes the official checkpoint has been downloaded to the working
        # directory; model_zoo.load_url(model_urls['resnet18']) also works.
        model.load_state_dict(torch.load("resnet18-5c106cde.pth"))
    # Replace the 1000-way ImageNet head after the pretrained weights are loaded.
    num_features = model.fc.in_features
    model.fc = nn.Linear(num_features, num_classes)

    return model

def resnet50(pretrained=False, **kwargs):
    """Constructs a ResNet-50 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet50']))
    return model



def resnet101(pretrained=False, **kwargs):
    """Constructs a ResNet-101 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(Bottleneck, [3, 4, 23, 3], **kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet101']))
    return model



def resnet152(pretrained=False, **kwargs):
    """Constructs a ResNet-152 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(Bottleneck, [3, 8, 36, 3], **kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet152']))
    return model


model = resnet18(pretrained=True, num_classes=7)  # e.g. fine-tuning on a 7-class dataset
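
A quick forward-pass sanity check (the input size is illustrative; thanks to the adaptive pooling, other reasonable sizes also work):

x = torch.randn(2, 3, 224, 224)
logits = model(x)
print(logits.shape)  # torch.Size([2, 7])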

