
An Introduction to Image Segmentation Techniques and Simple Algorithm Implementations

The goal of image segmentation is to simplify or change the representation of an image so that it becomes easier to understand and analyze. At the application level, segmentation is critical to medical image analysis, video surveillance, image recognition, autonomous driving systems, and more. Segmentation techniques fall into several categories, such as thresholding, edge detection, region extraction, and clustering.

Knowledge point 1: Why image segmentation matters
Image segmentation is used widely across many fields. In medicine, for example, it can automatically delineate the boundaries of tumors or organs, which is essential for diagnosis and treatment planning. In autonomous vehicles, segmentation identifies vehicles, pedestrians, traffic signs, and other objects on the road, providing key information for autonomous navigation.

Knowledge point 2: Basic segmentation algorithms
By how they process an image, segmentation algorithms divide into boundary-based and region-based families (a runnable sketch of both follows at the end of this article).
1. Boundary-based methods include edge-detection algorithms such as the Sobel operator and Canny edge detection; they are typically used to find the contours of objects in an image.
2. Region-based methods include region growing, region extraction, and the watershed algorithm; they focus on grouping pixels with similar characteristics into the same region.

Knowledge point 3: How the technology has evolved
With the rise of deep learning, learning-based segmentation has become mainstream. Convolutional neural networks (CNNs), in particular fully convolutional networks (FCN) and later models such as U-Net and SegNet, have achieved strong results on segmentation tasks: by learning from large amounts of annotated data, they automatically extract useful features and segment complex images effectively.

Knowledge point 4: Open challenges
Despite substantial progress, segmentation still faces several challenges, such as recognizing the same object under different lighting conditions, accurately separating foreground objects from cluttered backgrounds, and keeping computation efficient on high-resolution images.

Knowledge point 5: Application examples
Image segmentation is applied very broadly. For example:
1. In satellite remote sensing, segmentation identifies surface features such as rivers, forests, and urban areas.
2. In industrial manufacturing, it is used for defect detection, improving product quality and safety.
3. In streaming video, it supports scene-change detection and automatic video summarization.
4. In games and virtual reality, it helps track a user's gestures and movements in real time, creating a more immersive experience.

Knowledge point 6: Learning resources
Many excellent online resources and academic papers cover segmentation. Recommended starting points include:
1. Online course platforms such as Coursera, edX, and Udacity, which offer computer-vision courses.
2. Classic textbooks on computer vision and image processing, such as "Digital Image Processing" and "Computer Vision: Algorithms and Applications".
3. Open-source projects and libraries such as OpenCV, TensorFlow, and PyTorch, which provide a wealth of image-processing and segmentation tools.

Knowledge point 7: Future trends
Segmentation is moving toward higher accuracy, faster processing, and lower computational cost. Semi-supervised and unsupervised methods are likely to become research hotspots, as they reduce dependence on large annotated datasets. Multimodal segmentation, which combines data from different sensors, is another important direction.

In summary, image segmentation is a core component of computer vision and plays an essential role in real applications. Deep learning and other new techniques have greatly improved its accuracy and efficiency, while also opening up new challenges and research questions.
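As a minimal illustration of the classical methods from knowledge point 2, here is a sketch using OpenCV; 'input.jpg' is a placeholder path and the Canny thresholds are illustrative defaults:

import cv2

# Read the image as grayscale; 'input.jpg' is a placeholder path.
img = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)

# Threshold segmentation: Otsu's method picks the threshold automatically.
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Boundary-based: Canny edge detection with hysteresis thresholds 100/200.
edges = cv2.Canny(img, 100, 200)

# Region-based: connected components group adjacent foreground pixels
# of the binary mask into labelled regions.
num_labels, labels = cv2.connectedComponents(binary)

cv2.imwrite('binary.png', binary)
cv2.imwrite('edges.png', edges)
print(f'{num_labels - 1} foreground regions found')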

Related recommendations


Provide code with the same functionality:

import os
import numpy as np
import nibabel as nib
import imageio
import cv2  # missing in the original, but cv2.threshold is used below
from PIL import Image


def read_niifile(niifilepath):  # read a .nii file
    img = nib.load(niifilepath)
    img_fdata = img.get_fdata(dtype='float32')  # extract the voxel data
    return img_fdata


def save_fig(niifilepath, savepath, num, name):  # save slices as images
    name = name.split('-')[1]
    filepath_seg = os.path.join(niifilepath, "segmentation", "segmentation-" + name)
    filepath_vol = os.path.join(niifilepath, "volume", "volume-" + name)
    savepath_seg = os.path.join(savepath, "segmentation")
    savepath_vol = os.path.join(savepath, "volume")
    if not os.path.exists(savepath_seg):
        os.makedirs(savepath_seg)
    if not os.path.exists(savepath_vol):
        os.makedirs(savepath_vol)
    fdata_vol = read_niifile(filepath_vol)
    fdata_seg = read_niifile(filepath_seg)
    (x, y, z) = fdata_seg.shape
    total = x * y
    for k in range(z):
        silce_seg = fdata_seg[:, :, k]
        if silce_seg.max() == 0:
            continue
        else:
            silce_seg = (silce_seg - silce_seg.min()) / (silce_seg.max() - silce_seg.min()) * 255
            silce_seg = np.uint8(Image.fromarray(silce_seg).convert('L'))
            silce_seg = cv2.threshold(silce_seg, 1, 255, cv2.THRESH_BINARY)[1]
            if (np.sum(silce_seg == 255) / total) > 0.015:
                silce_vol = fdata_vol[:, :, k]
                silce_vol = (silce_vol - silce_vol.min()) / (silce_vol.max() - silce_vol.min()) * 255
                silce_vol = np.uint8(Image.fromarray(silce_vol).convert('L'))
                imageio.imwrite(os.path.join(savepath_seg, '{}.png'.format(num)), silce_seg)
                imageio.imwrite(os.path.join(savepath_vol, '{}.png'.format(num)), silce_vol)
                num += 1
    return num


if __name__ == '__main__':  # was `if name == 'main':`, which never runs
    path = r'C:\Users\Administrator\Desktop\LiTS2017'
    savepath = r'C:\Users\Administrator\Desktop\2D-LiTS2017'
    filenames = os.listdir(os.path.join(path, "segmentation"))
    num = 0
    for filename in filenames:
        num = save_fig(path, savepath, num, filename)


Explain each line of this code:

import os
import numpy as np
import nibabel as nib
import imageio
import cv2


def read_niifile(niifilepath):  # read a .nii file
    img = nib.load(niifilepath)  # load the NIfTI volume
    img_fdata = img.get_fdata(dtype='float32')  # extract the voxel data as float32
    return img_fdata


def save_fig(niifilepath, savepath, num, name):  # save slices as images
    name = name.split('-')[1]  # keep the index part of e.g. "segmentation-0.nii"
    filepath_seg = niifilepath + "segmentation\\" + "segmentation-" + name
    filepath_vol = niifilepath + "volume\\" + "volume-" + name
    savepath_seg = savepath + "segmentation\\"
    savepath_vol = savepath + "volume\\"
    if not os.path.exists(savepath_seg):  # create the output folders if missing
        os.makedirs(savepath_seg)
    if not os.path.exists(savepath_vol):
        os.makedirs(savepath_vol)
    fdata_vol = read_niifile(filepath_vol)
    fdata_seg = read_niifile(filepath_seg)
    (x, y, z) = fdata_seg.shape
    total = x * y  # pixels per slice
    for k in range(z):
        silce_seg = fdata_seg[:, :, k]  # take the k-th slice along the third axis
        if silce_seg.max() == 0:  # skip slices with no labelled pixels
            continue
        else:
            # min-max normalize to [0, 255], then binarize the mask
            silce_seg = (silce_seg - silce_seg.min()) / (silce_seg.max() - silce_seg.min()) * 255
            silce_seg = cv2.threshold(silce_seg, 1, 255, cv2.THRESH_BINARY)[1]
            # keep only slices where the mask covers more than 1.5% of the area
            if (np.sum(silce_seg == 255) / total) > 0.015:
                silce_vol = fdata_vol[:, :, k]
                silce_vol = (silce_vol - silce_vol.min()) / (silce_vol.max() - silce_vol.min()) * 255
                # save the mask/volume slice pair as PNG and advance the counter
                imageio.imwrite(os.path.join(savepath_seg, '{}.png'.format(num)), silce_seg)
                imageio.imwrite(os.path.join(savepath_vol, '{}.png'.format(num)), silce_vol)
                num += 1
    return num


if __name__ == '__main__':
    path = 'E:\\dataset\\LiTS17\\'
    savepath = 'E:\\dataset\\LiTS17\\2d\\'
    filenames = os.listdir(path + "segmentation")  # list all segmentation files
    num = 0
    for filename in filenames:
        num = save_fig(path, savepath, num, filename)


import os
import json
import torch
from PIL import Image
from torch.utils.data import Dataset


class Nutrition5kDataset(Dataset):
    def __init__(self, dish_metadata, seg_dir, remove_list, transform=None, mode='seg'):
        self.seg_dir = seg_dir
        self.transform = transform
        self.mode = mode
        # load metadata for all dishes (Fig. 3)
        self.all_dish_metadata = self._load_all_dish_metadata(dish_metadata)
        # filter out invalid entries
        self.data = self._filter_data(remove_list, self.all_dish_metadata)
        # map each dish_id to its image path
        self.dish_img_paths = self._create_img_path_map()

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        dish_id, meta = list(self.data.items())[idx]
        # resolve the image path
        img_path = self.dish_img_paths.get(dish_id, None)
        if not img_path:
            raise FileNotFoundError(f"Image for dish_id {dish_id} not found")
        img = Image.open(img_path).convert('RGB')
        if self.mode == 'seg':
            # load the segmentation label
            seg_path = os.path.join(self.seg_dir, f"{dish_id}.json")
            seg_data = json.load(open(seg_path))
            mask = self._decode_mask(seg_data, img.size)
            if self.transform:
                img = self.transform(img)
                mask = self.transform(mask)
            return img, mask
        else:  # regression mode
            # apply the segmentation
            seg_path = os.path.join(self.seg_dir, f"{dish_id}.json")
            seg_data = json.load(open(seg_path))
            seg_img = self._apply_segmentation(img, seg_data)
            # extract the nutrition targets
            nutrition = torch.tensor([
                meta.get('mass', 0),
                meta.get('calories', 0),
                meta.get('fat', 0),
                meta.get('carb', 0),
                meta.get('protein', 0)
            ], dtype=torch.float32)
            if self.transform:
                seg_img = self.transform(seg_img)
            return seg_img, nutrition

# (helper methods _load_all_dish_metadata, _filter_data, _create_img_path_map,
#  _decode_mask, and _apply_segmentation are defined elsewhere)

This is the code before the changes; please help me modify it.


After the first change it now throws an error:

(covid_seg) (base) liulicheng@ailab-MS-7B79:~/MultiModal_MedSeg_2025$ /home/liulicheng/anaconda3/envs/covid_seg/bin/python /home/liulicheng/MultiModal_MedSeg_2025/train/train_swinunetr_clipfusion.py
使用尺寸: Resized=(128, 128, 64), Crop=(64, 64, 32)
/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/monai/utils/deprecate_utils.py:221: FutureWarning: monai.networks.nets.swin_unetr SwinUNETR.__init__:img_size: Argument `img_size` has been deprecated since version 1.3. It will be removed in version 1.5. The img_size argument is not required anymore and checks on the input size are run during forward().
  warn_deprecated(argname, msg, warning_category)
/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/monai/utils/deprecate_utils.py:221: FutureWarning: monai.losses.dice DiceCELoss.__init__:ce_weight: Argument `ce_weight` has been deprecated since version 1.2. It will be removed in version 1.4. please use `weight` instead.
  warn_deprecated(argname, msg, warning_category)
Epoch 1/200
训练 Epoch 1:   0%|          | 0/104 [00:00<?, ?it/s]
enc_out_list 通道数: [12, 24, 48, 96, 192]
enc_out_list 各层特征图尺寸: [torch.Size([1, 12, 32, 32, 16]), torch.Size([1, 24, 16, 16, 8]), torch.Size([1, 48, 8, 8, 4]), torch.Size([1, 96, 4, 4, 2]), torch.Size([1, 192, 2, 2, 1])]
训练 Epoch 1:   0%|          | 0/104 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/home/liulicheng/MultiModal_MedSeg_2025/train/train_swinunetr_clipfusion.py", line 283, in <module>
    outputs = model(inputs, text_feat)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/liulicheng/MultiModal_MedSeg_2025/train/train_swinunetr_clipfusion.py", line 149, in forward
    residual = self.residual_conv(enc_out)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 610, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 605, in _conv_forward
    return F.conv3d(
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/monai/data/meta_tensor.py", line 282, in __torch_function__
    ret = super().__torch_function__(func, types, args, kwargs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/_tensor.py", line 1386, in __torch_function__
    ret = func(*args, **kwargs)
RuntimeError: Given groups=1, weight of size [12, 192, 1, 1, 1], expected input[1, 12, 2, 2, 1] to have 192 channels, but got 12 channels instead
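The last line says a 1x1x1 residual conv built with in_channels=192 received a 12-channel feature map, so the per-level residual convs and the entries of enc_out_list are being paired in the wrong order. A minimal sketch of one way to keep them aligned; residual_convs and enc_channels are illustrative names, not taken from the original script:

import torch.nn as nn

# Channel counts printed by the script for enc_out_list.
enc_channels = [12, 24, 48, 96, 192]

# One 1x1x1 projection per encoder level, indexed the same way as
# enc_out_list, so each conv's in_channels matches its input tensor.
residual_convs = nn.ModuleList(
    nn.Conv3d(c, c, kernel_size=1) for c in enc_channels
)

# Inside forward():
# for enc_out, conv in zip(enc_out_list, residual_convs):
#     residual = conv(enc_out)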


# Import the required packages
import csv
import os
from itertools import islice
import mxnet as mx
from mxnet import image, gpu
import gluoncv
from gluoncv.data.transforms.presets.segmentation import test_transform
from gluoncv.utils.viz import get_color_pallete, plot_image
import matplotlib
matplotlib.use('Agg')  # select the Agg backend before pyplot is imported
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import pandas as pd
import cv2

file_path = './pic'
out_path = './out'
filelist_out = os.listdir(out_path)

# Image-resizing parameters: if an image's width exceeds max_size it is
# resized; reshape_size is the target width after resizing.
max_size = 2200
reshape_size = 2000

# Suppress warnings
import warnings
warnings.filterwarnings(action='once')
warnings.filterwarnings("ignore")

# Select GPU or CPU; use CPU if the GPU build of MXNet is not installed
# (keep only one of the two lines below)
ctx = mx.gpu(0)
ctx = mx.cpu(0)

col_map = {0: 'road', 1: 'sidewalk', 2: 'building', 3: 'wall', 4: 'fence',
           5: 'pole', 6: 'traffic light', 7: 'traffic sign', 8: 'vegetation',
           9: 'terrain', 10: 'sky', 11: 'person', 12: 'rider', 13: 'car',
           14: 'truck', 15: 'bus', 16: 'train', 17: 'motorcycle', 18: 'bicycle'}

# Segment one image and store the per-class pixel shares as a pd.Series
def get_seg(file, model):
    img = image.imread(file)
    img = test_transform(img, ctx=ctx)
    output = model.predict(img)
    predict = mx.nd.squeeze(mx.nd.argmax(output, 1)).asnumpy()
    pred = []
    for i in range(19):  # 19 Cityscapes classes; the original range(150) did not match this model
        pred.append(len(predict[predict == i]) / (predict.shape[0] * predict.shape[1]))
    pred = pd.Series(pred).rename(col_map)
    return pred

model = gluoncv.model_zoo.get_model('deeplab_resnet101_citys', ctx=ctx, pretrained=True)

filelist = os.listdir(file_path)
df = pd.DataFrame(columns=['id', 'lng', 'lat', 'pixels', 'road', 'sidewalk', 'building',
                           'wall', 'fence', 'pole', 'traffic light', 'traffic sign',
                           'vegetation', 'terrain', 'sky', 'person', 'rider', 'car',
                           'truck', 'bus', 'train', 'motorcycle', 'bicycle'])
print(df)

# Loop over all images, run semantic segmentation, and collect the results
# in a pd.DataFrame; i is the image file name.
for i in filelist:
    if i not in filelist_out:
        img_path = os.path.join(file_path, i)
        img_id = i
        # Longitude/latitude and heading can be parsed from the file name i
        # if they are encoded there.
        lng = 0
        lat = 0
        img = cv2.imread(img_path)
        # After reading, switch the path to a temporary location
        img_path = img_path.replace('pic', 'image_processed')
        # Original image size
        ori_size = [img.shape[1], img.shape[0]]
        # Resize the image if it is too large
        if ori_size[0] > max_size:
            Scale_Factor = ori_size[0] / reshape_size
            img_size = (int(ori_size[0] / Scale_Factor), int(ori_size[1] / Scale_Factor))
            print(i, ori_size, 'Resize to:', img_size)
        else:
            img_size = (int(ori_size[0]), int(ori_size[1]))
        img2 = cv2.resize(img, img_size, interpolation=cv2.INTER_CUBIC)
        cv2.imwrite(img_path, img2)
        pixels = img_size[0] * img_size[1]
        # Series.append was removed in pandas 2.0; pd.concat is equivalent here
        data_i = pd.concat([pd.Series({'id': img_id, 'lng': lng, 'lat': lat, 'pixels': pixels}),
                            get_seg(img_path, model)])
        new_col = pd.DataFrame(data_i).T
        df = pd.concat([df, new_col], axis=0, join='outer', ignore_index=True)
        img = image.imread(img_path)
        img = test_transform(img, ctx=ctx)
        output = model.predict(img)
        predict = mx.nd.squeeze(mx.nd.argmax(output, 1)).asnumpy()
        # Visualization: map predicted labels to a color palette
        mask = get_color_pallete(predict, 'citys')
        print(i + ' seg has saved!')
        base = mpimg.imread(img_path)
        plt.figure(figsize=(10, 5))
        # Show the original image
        plt.imshow(base)
        # Overlay the label colors
        plt.imshow(mask, alpha=0.8)
        plt.axis('off')
        plt.savefig(out_path + '/' + i, dpi=300, bbox_inches='tight')
        plt.close('all')
        # Write the results to a csv file
        df.to_csv("./img_seg_deeplab_cityscape.csv")
    else:
        print(i, 'already exists!')

This is the current code; how can it be migrated to the PyTorch framework? Please explain in detail.
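On the migration question: a minimal sketch of the inference core ported to PyTorch. It assumes torchvision's COCO-pretrained DeepLabV3 as a stand-in; torchvision does not ship a Cityscapes-pretrained checkpoint, so reproducing the 19 Cityscapes classes would require loading weights from elsewhere:

import torch
import torchvision.transforms as T
from torchvision.models.segmentation import deeplabv3_resnet101, DeepLabV3_ResNet101_Weights
from PIL import Image

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = deeplabv3_resnet101(weights=DeepLabV3_ResNet101_Weights.DEFAULT).to(device).eval()

# ImageNet normalization, the PyTorch counterpart of test_transform
preprocess = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def get_seg(file, model, num_classes=21):  # this checkpoint predicts 21 COCO/VOC classes
    img = Image.open(file).convert('RGB')
    x = preprocess(img).unsqueeze(0).to(device)   # (1, 3, H, W)
    with torch.no_grad():
        out = model(x)['out']                     # (1, C, H, W) logits
    predict = out.argmax(1).squeeze(0).cpu().numpy()
    total = predict.size
    # share of pixels per class, as in the MXNet version
    return [(predict == i).sum() / total for i in range(num_classes)]

The surrounding pandas/OpenCV bookkeeping carries over unchanged; only the model loading, the transform, and the predict call differ.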


# Imports needed to run this excerpt (assuming the MMDetection 3.x layout)
import copy
import os.path as osp
from typing import List, Union

from mmengine.fileio import get_local_path
from mmdet.datasets.api_wrappers import COCO
from mmdet.datasets.base_det_dataset import BaseDetDataset


class CocoDataset(BaseDetDataset):
    """Dataset for COCO."""

    METAINFO = {
        'classes': ('gas', ),
        # palette is a list of color tuples, which is used for visualization.
        'palette': [(220, 20, 60), ]
    }
    COCOAPI = COCO
    # ann_id is unique in coco dataset.
    ANN_ID_UNIQUE = True

    def load_data_list(self) -> List[dict]:
        """Load annotations from an annotation file named as ``self.ann_file``

        Returns:
            List[dict]: A list of annotation.
        """  # noqa: E501
        with get_local_path(
                self.ann_file, backend_args=self.backend_args) as local_path:
            self.coco = self.COCOAPI(local_path)
        # The order of returned `cat_ids` will not
        # change with the order of the `classes`
        self.cat_ids = self.coco.get_cat_ids(
            cat_names=self.metainfo['classes'])
        self.cat2label = {cat_id: i for i, cat_id in enumerate(self.cat_ids)}
        self.cat_img_map = copy.deepcopy(self.coco.cat_img_map)

        img_ids = self.coco.get_img_ids()
        data_list = []
        total_ann_ids = []
        for img_id in img_ids:
            raw_img_info = self.coco.load_imgs([img_id])[0]
            raw_img_info['img_id'] = img_id

            ann_ids = self.coco.get_ann_ids(img_ids=[img_id])
            raw_ann_info = self.coco.load_anns(ann_ids)
            total_ann_ids.extend(ann_ids)

            parsed_data_info = self.parse_data_info({
                'raw_ann_info': raw_ann_info,
                'raw_img_info': raw_img_info
            })
            data_list.append(parsed_data_info)
        if self.ANN_ID_UNIQUE:
            assert len(set(total_ann_ids)) == len(
                total_ann_ids
            ), f"Annotation ids in '{self.ann_file}' are not unique!"

        del self.coco

        return data_list

    def parse_data_info(self, raw_data_info: dict) -> Union[dict, List[dict]]:
        """Parse raw annotation to target format.

        Args:
            raw_data_info (dict): Raw data information load from ``ann_file``

        Returns:
            Union[dict, List[dict]]: Parsed annotation.
        """
        img_info = raw_data_info['raw_img_info']
        ann_info = raw_data_info['raw_ann_info']

        data_info = {}

        # TODO: need to change data_prefix['img'] to data_prefix['img_path']
        img_path = osp.join(self.data_prefix['img'], img_info['file_name'])
        if self.data_prefix.get('seg', None):
            seg_map_path = osp.join(
                self.data_prefix['seg'],
                img_info['file_name'].rsplit('.', 1)[0] + self.seg_map_suffix)
        else:
            seg_map_path = None
        data_info['img_path'] = img_path
        data_info['img_id'] = img_info['img_id']
        data_info['seg_map_path'] = seg_map_path
        data_info['height'] = img_info['height']
        data_info['width'] = img_info['width']

        if self.return_classes:
            data_info['text'] = self.metainfo['classes']
            data_info['caption_prompt'] = self.caption_prompt
            data_info['custom_entities'] = True

        instances = []
        for i, ann in enumerate(ann_info):
            instance = {}

            if ann.get('ignore', False):
                continue
            x1, y1, w, h = ann['bbox']
            inter_w = max(0, min(x1 + w, img_info['width']) - max(x1, 0))
            inter_h = max(0, min(y1 + h, img_info['height']) - max(y1, 0))
            if inter_w * inter_h == 0:
                continue
            if ann['area'] <= 0 or w < 1 or h < 1:
                continue
            if ann['category_id'] not in self.cat_ids:
                continue
            bbox = [x1, y1, x1 + w, y1 + h]

            if ann.get('iscrowd', False):
                instance['ignore_flag'] = 1
            else:
                instance['ignore_flag'] = 0
            instance['bbox'] = bbox
            instance['bbox_label'] = self.cat2label[ann['category_id']]

            if ann.get('segmentation', None):
                instance['mask'] = ann['segmentation']

            instances.append(instance)
        data_info['instances'] = instances
        return data_info

    def filter_data(self) -> List[dict]:
        """Filter annotations according to filter_cfg.

        Returns:
            List[dict]: Filtered results.
        """
        if self.test_mode:
            return self.data_list

        if self.filter_cfg is None:
            return self.data_list

        filter_empty_gt = self.filter_cfg.get('filter_empty_gt', False)
        min_size = self.filter_cfg.get('min_size', 0)

        # obtain images that contain annotation
        ids_with_ann = set(data_info['img_id'] for data_info in self.data_list)
        # obtain images that contain annotations of the required categories
        ids_in_cat = set()
        for i, class_id in enumerate(self.cat_ids):
            ids_in_cat |= set(self.cat_img_map[class_id])
        # merge the image id sets of the two conditions and use the merged set
        # to filter out images if self.filter_empty_gt=True
        ids_in_cat &= ids_with_ann

        valid_data_infos = []
        for i, data_info in enumerate(self.data_list):
            img_id = data_info['img_id']
            width = data_info['width']
            height = data_info['height']
            if filter_empty_gt and img_id not in ids_in_cat:
                continue
            if min(width, height) >= min_size:
                valid_data_infos.append(data_info)

        return valid_data_infos


import SimpleITK as sitk
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Read the target image
target_img = cv2.imread('./plot_abdomenreg_arrow.jpg')
target_img = cv2.cvtColor(target_img, cv2.COLOR_BGR2RGB)  # convert the color space

# Read the segmentation image
seg_data = sitk.GetArrayFromImage(sitk.ReadImage('./logs/abdomenreg/SegReg_edt/0029_0030_reg/seg_fixed.nii.gz'))

# Extract slice 40 along the second dimension (mind the axis order):
# given the seg_image.nii.gz size of (96, 80, 128), take the middle axis
seg_slice = seg_data[:, 40, :]  # a (96, 128) 2-D array
seg_slice = np.flipud(seg_slice)  # flip the image vertically

# Resize the segmentation slice to match the target image
target_height, target_width, _ = target_img.shape
seg_slice_resized = cv2.resize(seg_slice, (target_width, target_height), interpolation=cv2.INTER_NEAREST)

# Create a blank contour image
contour_img = np.zeros_like(target_img)

# Set up the color map
labels = np.unique(seg_slice_resized)
colors = plt.cm.jet(np.linspace(0, 1, len(labels)))

# Draw the contours
for label, color in zip(labels, colors):
    if label == 0:
        continue
    mask = (seg_slice_resized == label).astype(np.uint8)
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cv2.drawContours(
        contour_img, contours, -1,
        (int(color[0] * 255), int(color[1] * 255), int(color[2] * 255)), 1
    )

# Blend the overlay onto the target image
combined = cv2.addWeighted(target_img, 1, contour_img, 1, 0)

# Save the results
cv2.imwrite('contour_overlay.png', cv2.cvtColor(combined, cv2.COLOR_RGB2BGR))
cv2.imwrite('contour_seg.png', cv2.cvtColor(combined, cv2.COLOR_RGB2BGR))

Please change this code to draw with plt instead.
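A minimal sketch of the same overlay drawn with matplotlib instead of cv2.drawContours; the paths, slice index, and jet colormap are taken from the original code:

import SimpleITK as sitk
import cv2
import numpy as np
import matplotlib.pyplot as plt

target_img = cv2.cvtColor(cv2.imread('./plot_abdomenreg_arrow.jpg'), cv2.COLOR_BGR2RGB)
seg_data = sitk.GetArrayFromImage(sitk.ReadImage('./logs/abdomenreg/SegReg_edt/0029_0030_reg/seg_fixed.nii.gz'))
seg_slice = np.flipud(seg_data[:, 40, :])

h, w, _ = target_img.shape
seg_resized = cv2.resize(seg_slice, (w, h), interpolation=cv2.INTER_NEAREST)

labels = [l for l in np.unique(seg_resized) if l != 0]
colors = plt.cm.jet(np.linspace(0, 1, len(labels)))

fig, ax = plt.subplots(figsize=(8, 8))
ax.imshow(target_img)
for lab, color in zip(labels, colors):
    # contouring the binary mask at level 0.5 traces each region's boundary
    ax.contour(seg_resized == lab, levels=[0.5], colors=[color], linewidths=1)
ax.axis('off')
fig.savefig('contour_overlay.png', dpi=300, bbox_inches='tight')
plt.close(fig)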


There's an error now:

Epoch 1/200
训练 Epoch 1:   0%|          | 0/52 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/home/liulicheng/MultiModal_MedSeg_2025/train/lits_swinunetr_dynunet_advanced.py", line 235, in <module>
    outputs = model(inputs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/monai/networks/nets/swin_unetr.py", line 325, in forward
    hidden_states_out = self.swinViT(x_in, self.normalize)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/monai/networks/nets/swin_unetr.py", line 1064, in forward
    x0_out = self.proj_out(x0, normalize)
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/monai/networks/nets/swin_unetr.py", line 1051, in proj_out
    x = rearrange(x, "n c d h w -> n d h w c")
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/monai/utils/module.py", line 448, in __call__
    raise self._exception
  File "/home/liulicheng/anaconda3/envs/covid_seg/lib/python3.8/site-packages/monai/utils/module.py", line 399, in optional_import
    pkg = __import__(module)  # top level module
monai.utils.module.OptionalImportError: from einops import rearrange (No module named 'einops'). For details about installing the optional dependencies, please visit: https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

Could you take a look?
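The final line of the traceback identifies the cause: MONAI's SwinUNETR uses the optional einops package for its rearrange calls, so installing it into the covid_seg environment with `pip install einops` should clear this error.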


dataset_type = 'PascalVOCDataset'
data_root = 'data/VOCdevkit/VOC2012'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    # seg_pad_val=255 marks padded label pixels as "ignore"
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=4,  # batch size per GPU
    workers_per_gpu=4,  # dataloader workers per GPU
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='JPEGImages',
        ann_dir='SegmentationClass',
        split='ImageSets/Segmentation/train.txt',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='JPEGImages',
        ann_dir='SegmentationClass',
        split='ImageSets/Segmentation/val.txt',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='JPEGImages',
        ann_dir='SegmentationClass',
        split='ImageSets/Segmentation/val.txt',
        pipeline=test_pipeline))

Please explain this code.


The model is a yolov8n segmentation model, trained as follows:

from ultralytics import YOLO
import torch

# Load a model
model = YOLO('yolov8-seg.yaml')  # build a new model from YAML
model = YOLO('yolov8n-seg.pt')  # load a pretrained model (recommended for training)
model = YOLO('yolov8-seg.yaml').load('yolov8n.pt')  # build from YAML and transfer weights

results = model.train(pose=True,
                      data='./datasets/Dataset_A2C_2025-08-05-1/Dataset_A2C_2025-08-05-1.yaml',
                      epochs=200, imgsz=256)

Now I want to run prediction with the code below, but it errors; please help me fix it. The version below removes the stray imports (numpy.array_api, sympy, triton), which shadowed uint8/int32/dtype, and fixes the preprocessing and the main block:

import numpy as np
import cv2
from skimage.measure import label, regionprops
from openvino.runtime import Core


class YiAtrium():
    def __init__(self, device="CPU"):
        self.core = Core()
        self.model = self.core.read_model("/Work/zhangxin/ultralytics/runs/segment/train/weights/best_openvino_model/best.xml")
        self.compiled_model = self.core.compile_model(self.model, device)

    # Change 1: read data with OpenCV to fit the existing code.
    # Change 2: the input is a cv::Mat-style image array.
    def preprocess(self, image):
        img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        img_resized = cv2.resize(img, (256, 256))
        # HWC uint8 -> CHW float32 in [0, 1] with a leading batch dimension;
        # the original permuted a 4-D tensor with 3 indices, which raises
        img_ndarray = img_resized.astype(np.float32) / 255.0
        img_ndarray = img_ndarray.transpose(2, 0, 1)[np.newaxis, ...]  # (1, 3, 256, 256)
        return img_ndarray

    @staticmethod
    def keep_largest_region(mask):
        if mask.max() == 0:
            return mask
        labeled_mask = label(mask, connectivity=1)
        regions = regionprops(labeled_mask)
        if not regions:
            return mask
        largest_region = max(regions, key=lambda r: r.area)
        result_mask = np.zeros_like(mask)
        result_mask[labeled_mask == largest_region.label] = 1
        return result_mask

    @staticmethod
    def fill_hole(image):
        """Fill holes in a binary image."""
        # make sure the image is binary
        if len(np.unique(image)) > 2:
            _, image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
        src = image.copy()
        mask = np.zeros([src.shape[0] + 2, src.shape[1] + 2], np.uint8)
        # find a background seed point
        isbreak = False
        for i in range(src.shape[0]):
            for j in range(src.shape[1]):
                if src[i, j] == 0:
                    seedpoint = (j, i)  # note the (x, y) coordinate order
                    isbreak = True
                    break
            if isbreak:
                break
        cv2.floodFill(src, mask, seedpoint, 255)
        img_floofill_inv = cv2.bitwise_not(src)
        im_out = image | img_floofill_inv
        return im_out

    @staticmethod
    def get_edge_points(mask, scalex, scaley):
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        edge_points = []
        for contour in contours:
            epsilon = 0.005 * cv2.arcLength(contour, True)
            approx = cv2.approxPolyDP(contour, epsilon, True)
            for point in approx:
                x, y = point[0]
                up_x = int(x * scaley)
                up_y = int(y * scalex)
                edge_points.append((up_x, up_y))
        final_result = {"detection": {}, "segmentation": {}}
        point_dict = {"1-LVIntima_pylogon": edge_points.copy()}
        final_result["segmentation"] = point_dict
        return final_result

    # The original Yi_Segment took an image path; this version takes the
    # already-loaded image array instead.
    def Yi_Segment(self, image):
        input_tensor = self.preprocess(image)
        predict_tensor = self.compiled_model(input_tensor)
        output_ndarray = predict_tensor[0].argmax(1).squeeze(0)
        prediction = output_ndarray.astype(np.uint8)
        prediction = prediction * 255
        output_mask = self.fill_hole(self.keep_largest_region(prediction))
        img_h, img_w = image.shape[:2]
        scalex = img_h / 256.0
        scaley = img_w / 256.0
        final_output = self.get_edge_points(output_mask, scalex, scaley)
        return final_output


if __name__ == "__main__":
    image_path = "/Work/zhangxin/ultralytics/runs/test/Heart_A2C_0000003_20240416_Comen_cropped_105.bmp"
    segmenter = YiAtrium()  # this line was commented out, leaving segmenter undefined
    image = cv2.imread(image_path)
    final_result = segmenter.Yi_Segment(image)
    img = cv2.imread(image_path)
    new_img = np.zeros((image.shape[0], image.shape[1]), dtype=np.uint8)
    cv2.drawContours(new_img,
                     [np.asarray(final_result["segmentation"]["1-LVIntima_pylogon"], dtype=np.int32)],
                     -1, 255, thickness=cv2.FILLED)  # single-channel image, so a scalar color
    cv2.imshow("1", new_img)
    cv2.imshow("2", img)
    cv2.waitKey(0)
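Beyond the Python-level fixes above, note that this post-processing treats the network output as a dense per-pixel class map, while an ultralytics YOLOv8-seg export actually produces detection boxes plus mask prototypes. A hedged alternative that sidesteps manual decoding: ultralytics can load the OpenVINO export directory directly and handle pre- and post-processing itself:

from ultralytics import YOLO

# YOLO() accepts an OpenVINO export directory and runs letterboxing,
# NMS, and prototype-mask assembly internally.
model = YOLO("/Work/zhangxin/ultralytics/runs/segment/train/weights/best_openvino_model/")
results = model("/Work/zhangxin/ultralytics/runs/test/Heart_A2C_0000003_20240416_Comen_cropped_105.bmp", imgsz=256)
masks = results[0].masks  # per-instance masks, or None if nothing was detected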
