Training and Inference with Co-DETR on the COCO Dataset

多喝开水少熬夜

Last modified: 2024-10-18 11:26:36

License: CC 4.0 BY-SA

Category: Large Models. Tags: python, image processing, co-detr, model training and inference

First published: 2024-06-28 19:59:53

Article link: https://blog.csdn.net/m0_52695557/article/details/140051810

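This post walks through training and inference with Co-DETR on the COCO dataset. Reference: "Co-DETR训练自己的数据集" (Training Co-DETR on your own dataset).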

Preface

Environment: PyTorch 1.11.0, Python 3.8 (Ubuntu 20.04), CUDA 11.3.
First, clone the Co-DETR repository from GitHub:

!git clone https://github.com/Sense-X/Co-DETR.git
%cd Co-DETR

Then install the required libraries:

!pip install -r requirements.txt

Install mmcv and the related packages (note that mmcv should be version 1.6.1 or later):

!pip install -U openmim

!mim install mmcv-full==1.6.1

!pip install timm==0.6.11 mmdet==2.25.3

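MMDetection then failed with TypeError: FormatCode() got an unexpected keyword argument 'verify'. The cause is that the installed yapf is too new (0.40.2 at the time of writing); reinstalling an older yapf fixes it. The -y flag keeps the uninstall from waiting for confirmation inside a notebook.

!pip uninstall -y yapf
!pip install yapf==0.40.1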
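
Before reinstalling, it can help to check which yapf version is actually installed. The snippet below is a minimal sketch: the helper names are my own, and the 0.40.2 threshold is simply the version at which I hit the error above, not something taken from the yapf documentation.

```python
import itertools
from importlib import metadata


def parse_version(version):
    """Turn '0.40.2' into (0, 40, 2), using only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def yapf_needs_downgrade(version):
    """True if the version is 0.40.2 or newer (assumed threshold, see text)."""
    return parse_version(version) >= (0, 40, 2)


if __name__ == "__main__":
    try:
        installed = metadata.version("yapf")
    except metadata.PackageNotFoundError:
        print("yapf is not installed")
    else:
        status = "too new, downgrade" if yapf_needs_downgrade(installed) else "ok"
        print(f"yapf {installed}: {status}")
```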

Then place the extracted COCO dataset under /Co-DETR/data/coco/, as in the screenshot below.

[Screenshot: contents of the data/coco directory]
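
A quick way to confirm the dataset landed in the right place is to check for the expected entries before starting a long run. This is a sketch that assumes the standard COCO 2017 file names that MMDetection-style configs usually point at (annotations/instances_train2017.json, train2017/, and so on); if your config uses other paths, adjust the EXPECTED tuple, which is my own name.

```python
from pathlib import Path

# Assumed layout: standard COCO 2017 names; verify against your config's data paths.
EXPECTED = (
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
)


def missing_coco_entries(data_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_coco_entries("data/coco")  # run from the Co-DETR directory
    if missing:
        print("Missing under data/coco:", ", ".join(missing))
    else:
        print("data/coco looks complete")
```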

If you hit ModuleNotFoundError: No module named 'projects', add the following to the relevant Python file (usually train.py):

import sys

sys.path.append('/absolute/path/to/your/project')

For example:

# Edit in /Co-DETR/tools/train.py
import sys

sys.path.append('/root/autodl-tmp/Co-DETR')
from projects import *
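
Hard-coding the absolute path works, but it breaks as soon as the repository is moved. A possible alternative, sketched below, derives the repository root from the location of the script itself. The helper names are hypothetical, and it assumes the script lives one level below the repository root, as tools/train.py does.

```python
import sys
from pathlib import Path


def repo_root_for(script_path):
    """Return the parent of the directory holding the script,
    e.g. .../Co-DETR/tools/train.py -> .../Co-DETR."""
    return Path(script_path).resolve().parents[1]


def add_repo_root_to_path(script_path):
    """Put the repository root at the front of sys.path (once) and return it."""
    root = str(repo_root_for(script_path))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Inside tools/train.py this would be called as add_repo_root_to_path(__file__) before the line from projects import *.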

Training

Multi-GPU training

From the /Co-DETR directory, run in a terminal:

bash tools/dist_train.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py 2 /root/autodl-tmp/Co-DETR

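Because this is distributed training, it needs two or more GPUs, for example two RTX 4090s. tools/dist_train.sh is the shell script that launches single-machine multi-GPU training through PyTorch's torch.distributed. Its first argument is the path to the model config (here co_deformable_detr_r50_1x_coco.py), the second is the number of GPUs (2 here), and the third is the directory where the trained weights are written.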
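
If you launch runs from Python rather than typing the command, the argument order described above can be captured in a small helper. This is only a sketch: build_dist_train_cmd is a name I made up, and it does nothing more than assemble the same command line.

```python
import shlex


def build_dist_train_cmd(config, num_gpus, work_dir, script="tools/dist_train.sh"):
    """Assemble the launch command: script, config, GPU count, work directory."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["bash", script, config, str(num_gpus), work_dir]


cmd = build_dist_train_cmd(
    "projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py",
    2,
    "/root/autodl-tmp/Co-DETR",
)
# Print a shell-safe version of the command for copy and paste
print(" ".join(shlex.quote(part) for part in cmd))
```

The list can be passed to subprocess.run(cmd, check=True) from the Co-DETR directory.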

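The run log looks like this:

[Screenshot: training log]

The next screenshot shows the terminal during a normal run, in the middle of the first epoch:

[Screenshot: terminal output during the first epoch]

After that, just wait for all 12 epochs to finish. On two RTX 4090s, one epoch takes about three to four hours:

[Screenshot: training output]

The number of epochs can presumably be changed by editing the value highlighted in red in the screenshot below:

[Screenshot: config file with the epoch setting highlighted in red]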
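
Since the screenshot is not reproduced here, the fragment below shows what such an edit usually looks like in an MMDetection 2.x config. It is a sketch under the assumption that co_deformable_detr_r50_1x_coco.py follows the usual runner / lr_config convention; the value 6 and the step at epoch 5 are purely illustrative, so check the actual file before changing anything.

```python
# Assumed MMDetection 2.x-style fields; the real values live in the config file.
# Train for 6 epochs instead of 12 (illustrative value).
runner = dict(type='EpochBasedRunner', max_epochs=6)
# With a step schedule, the learning-rate decay epoch should stay below max_epochs.
lr_config = dict(policy='step', step=[5])
```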

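Single-GPU training

Adapt tools/dist_train.sh for a single GPU, for example by creating tools/single_train.sh:

#!/usr/bin/env bash
# Single-GPU variant of tools/dist_train.sh
CONFIG=$1
WORKDIR=$2

# Make the repository root importable so that `projects` can be found
PYTHONPATH="$(dirname "$0")/..":$PYTHONPATH
export PYTHONPATH

python "$(dirname "$0")/train.py" "$CONFIG" --work-dir "$WORKDIR"

To choose which GPU is used, prefix the last line with CUDA_VISIBLE_DEVICES=X, for example: CUDA_VISIBLE_DEVICES=2 python "$(dirname "$0")/train.py" "$CONFIG" --work-dir "$WORKDIR"

From the /Co-DETR directory, run in a terminal: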

bash tools/single_train.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py /root/autodl-tmp/Co-DETR/checkpoints/
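
The same GPU selection can be done from Python instead of the shell, which is handy in a notebook. A minimal sketch follows; select_gpu is a hypothetical helper, and the variable has to be set before anything initialises CUDA (in practice, before torch touches the GPU).

```python
import os


def select_gpu(gpu_id):
    """Restrict this process to one physical GPU; call before CUDA is initialised."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return os.environ["CUDA_VISIBLE_DEVICES"]


select_gpu(2)
# Inside this process, physical GPU 2 is now addressed as cuda:0.
```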

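This runs normally. The Epoch [1][50/59144] in the output shows 59144 iterations per epoch, which is indeed twice the count of the dual-GPU run above. A full epoch takes a long time, so I did not pay for the GPU hours to wait for the final result; it should be similar to the output above.

[Screenshot: single-GPU training output]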
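
The factor of two follows from how the data is split across GPUs. The sketch below reproduces the numbers under two assumptions of mine: that all 118287 images of COCO train2017 are used (none filtered out), and that each GPU processes 2 images per iteration, which is the value that makes the single-GPU count come out to 59144.

```python
import math


def iters_per_epoch(num_images, samples_per_gpu, num_gpus):
    """Iterations per epoch when images are split evenly across GPUs."""
    per_gpu = math.ceil(num_images / num_gpus)  # images seen by each GPU
    return math.ceil(per_gpu / samples_per_gpu)  # batches on each GPU


NUM_TRAIN_IMAGES = 118287  # COCO train2017
SAMPLES_PER_GPU = 2        # assumption, consistent with the 59144 in the log

single = iters_per_epoch(NUM_TRAIN_IMAGES, SAMPLES_PER_GPU, num_gpus=1)
dual = iters_per_epoch(NUM_TRAIN_IMAGES, SAMPLES_PER_GPU, num_gpus=2)
print(single, dual)  # 59144 29572
```

By the same logic, a single GPU should need roughly twice the three to four hours per epoch seen with two cards, assuming the per-iteration time stays about the same.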

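Inference

You can use the weights trained above or, more simply, find the weights for the corresponding model in the official repository and download them to the GPU server.

[Screenshot: model weights listed in the official repository]

Command line

Use scp, wget, or a similar tool to copy a dashcam image into the /Co-DETR/demo directory, for example test.png, then run the command below. The annotated image out2.png will appear in the current directory.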

!python demo/image_demo.py demo/test.png \
projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py \
checkpoints/co_deformable_detr_r50_1x_coco.pth \
--device cuda \
--out-file out2.png

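The first positional argument is the image path, the second is the model config, and the third is the checkpoint. --device selects the device to run on, and --out-file is where the annotated image is written.
The image below is the result with the official weights:

[Image: detection result with the official weights]

The next image is the result with the latest.pth checkpoint from the first training epoch above:

[Image: detection result with latest.pth after one epoch]

After only one epoch the detections are clearly worse, and a truck in the distance is missed altogether.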

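Python code

Inference can also be wrapped in a Python function (or class). The code below saves the annotated image and returns the box coordinates and class indices.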

import cv2
import numpy as np
from mmdet.apis import inference_detector, init_detector, show_result_pyplot


def codetr_model_predict(image_path, config_path, checkpoint_path, device='cuda:0'):
    # Load the image. cv2.imread returns BGR, which is what inference_detector
    # expects for ndarray input, so no colour conversion is done here.
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not find image at {image_path}")

    # Initialise the Co-DETR model
    model = init_detector(config_path, checkpoint_path, device=device)

    # Run inference
    result = inference_detector(model, image)
    if isinstance(result, tuple):  # (bbox, segm) for models that also predict masks
        result = result[0]

    # Collect box coordinates and class indices
    bboxes = []
    labels = []
    for class_id, detections in enumerate(result):
        for detection in detections:
            if detection[-1] > 0.3:  # keep only detections with confidence > 0.3
                bboxes.append(detection[:4])
                labels.append(class_id)

    # Convert to NumPy arrays (shape (N, 4) even when nothing is detected)
    boxes = np.array(bboxes).reshape(-1, 4)
    categories = np.array(labels)

    # Save the annotated image to the current directory
    show_result_pyplot(model, image, result, score_thr=0.3, out_file='demo_test2.png')

    return boxes, categories


# Config path, checkpoint path, and test image path
config_path = 'projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py'
checkpoint_path = 'path/to/your/checkpoint.pth'
image_path = 'path/to/your/image.jpg'

boxes, categories = codetr_model_predict(image_path, config_path, checkpoint_path)

print("Bounding Boxes:", boxes)
print("Categories:", categories)
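
The function returns class indices rather than names. A small post-processing step can make the output easier to read, sketched below. describe_detections is a hypothetical helper, and the three class names in the demo are a toy subset; with a real model the names would typically come from model.CLASSES, which would mean also returning the model from the function above.

```python
def describe_detections(boxes, categories, class_names):
    """Pair each box with its class name and pixel area."""
    described = []
    for box, class_id in zip(boxes, categories):
        x1, y1, x2, y2 = (float(v) for v in box)
        described.append({
            "label": class_names[int(class_id)],
            "box": (x1, y1, x2, y2),
            "area": (x2 - x1) * (y2 - y1),
        })
    return described


# Toy input: one box with class index 2 in a three-class list
toy_names = ("person", "bicycle", "car")
for item in describe_detections([[10, 20, 110, 70]], [2], toy_names):
    print(item["label"], item["box"], item["area"])
```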

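Output:

[Screenshot: printed bounding boxes and categories]

I tried several weights and models and found that both co_dino_5scale_swin_large_16e_o365tococo-002.pth and co_dino_5scale_swin_large_3x_coco-005.pth give good results. You can look up the weights and their matching configs on GitHub and change the paths accordingly.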