YOLOv10 Improvement Tutorial | Adding an Attention Mechanism to C2f-CIB



1. Introduction

        Paper link: https://arxiv.org/abs/2311.11587

        Code link: GitHub - CV-ZhangXin/AKConv

YOLOv10 training, validation, and inference tutorial


2. Adding an Attention Mechanism to C2f-CIB

2.1 Copy the code

        Open the ultralytics->nn->modules->block.py file, copy in the SE attention code (you can substitute a different attention mechanism if you prefer), and create the C2fCIBAttention class, as shown below:

class SE(nn.Module):
    """Squeeze-and-Excitation channel attention (Hu et al., CVPR 2018)."""

    def __init__(self, channel, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        return x * y.expand_as(x)


class C2fCIBAttention(nn.Module):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=False, lk=False, g=1, e=0.5):
        """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
        expansion.
        """
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(CIB(self.c, self.c, shortcut, e=1.0, lk=lk) for _ in range(n))
        self.atten = SE(c2)  # SE channel attention sized to the block's output channels

    def forward(self, x):
        """Forward pass through C2f layer."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.atten(self.cv2(torch.cat(y, 1)))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.atten(self.cv2(torch.cat(y, 1)))
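        As a quick sanity check (assuming you run it inside block.py, where torch, Conv, and CIB are already in scope), the block should preserve spatial size while producing c2 output channels:

m = C2fCIBAttention(256, 512, n=2, lk=True)
y = m(torch.randn(1, 256, 20, 20))
print(y.shape)  # expected: torch.Size([1, 512, 20, 20])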

        Then declare C2fCIBAttention at the top of block.py (add it to the __all__ tuple).

        Also declare C2fCIBAttention in ultralytics->nn->modules->__init__.py; a sketch of both declarations follows.
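        A minimal sketch of the two declarations (the surrounding entries are abbreviated; keep whatever your version of the file already contains):

# In ultralytics/nn/modules/block.py, add the new class to the existing __all__ tuple:
__all__ = (
    # ... keep the existing entries ...
    "C2fCIB",
    "C2fCIBAttention",
)

# In ultralytics/nn/modules/__init__.py, import and re-export it:
from .block import C2fCIB, C2fCIBAttention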

2.2 Modify tasks.py

       Open ultralytics->nn->tasks.py and register C2fCIBAttention everywhere C2fCIB is already handled; a sketch of the edit follows.
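        A sketch of the tasks.py edit, assuming your copy of parse_model matches the YOLOv10 fork of ultralytics (the membership tuples below are abbreviated and vary between versions; the point is to add C2fCIBAttention beside C2fCIB in both):

# At the top of ultralytics/nn/tasks.py, extend the modules import:
from ultralytics.nn.modules import C2f, C2fCIB, C2fCIBAttention  # add C2fCIBAttention

# Inside parse_model(), add the new class next to C2fCIB in both checks:
if m in (Conv, SCDown, C2f, C2fCIB, C2fCIBAttention):  # abbreviated; keep existing entries
    c1, c2 = ch[f], args[0]
    ...
    if m in (C2f, C2fCIB, C2fCIBAttention):  # blocks that take a repeat count
        args.insert(2, n)
        n = 1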

2.3 Modify yolov10n.yaml

        Replace C2fCIB with C2fCIBAttention in the yolov10n.yaml file.

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv10 object detection model. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]

backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, SCDown, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, SCDown, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 1, PSA, [1024]] # 10

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, C2f, [512]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 3, C2f, [256]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 3, C2f, [512]] # 19 (P4/16-medium)

  - [-1, 1, SCDown, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 3, C2fCIBAttention, [1024, True, True]] # 22 (P5/32-large)

  - [[16, 19, 22], 1, v10Detect, [nc]] # Detect(P3, P4, P5)
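When parse_model reads the modified entry, it inserts the repeat count as the third constructor argument, so `[-1, 3, C2fCIBAttention, [1024, True, True]]` resolves (before depth/width scaling) to roughly `C2fCIBAttention(c1, 1024, n=3, shortcut=True, lk=True)`; the channel count and repeats are then scaled down by the width and depth multipliers of the chosen model scale.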


2.4 Modify train.py

        Fill in the path to your modified yolov10n.yaml in the train.py script and run it to start training.
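        A minimal train.py along these lines (the YAML path, dataset, and hyperparameters are illustrative placeholders; the YOLOv10 fork of ultralytics exposes a YOLOv10 class):

from ultralytics import YOLOv10

# Build the modified model from the edited config (adjust the path to your checkout).
model = YOLOv10("ultralytics/cfg/models/v10/yolov10n.yaml")

# Train; data/epochs/imgsz/batch are example values, not recommendations.
model.train(data="coco128.yaml", epochs=100, imgsz=640, batch=16)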


3. The difference between C2f and C2fCIB in YOLOv10

In the YOLOv10 architecture, C2f and C2fCIB share the same split-process-concatenate skeleton; they differ in the inner block that gets repeated.

C2f, the CSP bottleneck with two convolutions inherited from YOLOv8, applies a 1x1 convolution, splits the result into two halves, runs one half through a chain of standard Bottleneck modules built from ordinary 3x3 convolutions, and concatenates all intermediate outputs before a final 1x1 convolution. This captures spatial features at several depths within a single stage.

C2fCIB keeps that skeleton but replaces each Bottleneck with a CIB (Compact Inverted Block). A CIB is composed almost entirely of cheap depthwise 3x3 and pointwise 1x1 convolutions, with an optional large-kernel depthwise branch (the lk flag in the constructor above). This sharply reduces the block's computation while preserving its receptive field, which is why YOLOv10 uses C2fCIB in the deeper, wider stages where dense standard convolutions are most expensive.

The choice between the two is an accuracy/efficiency trade-off: C2fCIB delivers nearly the same accuracy at noticeably lower cost, so it is preferred wherever the compute budget matters, while plain C2f remains the default in the shallower stages.
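For reference, a simplified sketch of the CIB module along the lines of the YOLOv10 source (Conv is the standard ultralytics Conv+BN+SiLU wrapper and RepVGGDW its large-kernel depthwise block; check block.py in your checkout for the authoritative version):

class CIB(nn.Module):
    """Compact Inverted Block: depthwise/pointwise convs with an optional large-kernel branch."""

    def __init__(self, c1, c2, shortcut=True, e=0.5, lk=False):
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = nn.Sequential(
            Conv(c1, c1, 3, g=c1),   # depthwise 3x3
            Conv(c1, 2 * c_, 1),     # pointwise expansion
            RepVGGDW(2 * c_) if lk else Conv(2 * c_, 2 * c_, 3, g=2 * c_),  # (large-kernel) depthwise
            Conv(2 * c_, c2, 1),     # pointwise projection
            Conv(c2, c2, 3, g=c2),   # depthwise 3x3
        )
        self.add = shortcut and c1 == c2  # residual only when shapes match

    def forward(self, x):
        """Apply the inverted-block stack, with a residual connection when possible."""
        return x + self.cv1(x) if self.add else self.cv1(x)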