Advanced Island, Level 4 (L2G4000): InternVL Multimodal Model Deployment and Fine-Tuning Practice

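1. Environment Setup

My shell prompt showed both the virtual environment and the base environment at the same time. The following command fixed it by stopping conda from activating base automatically.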

conda config --set auto_activate_base False

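1.1 Training Environment Setup

Much of this was already installed together with xtuner in the basic levels.

Create a new virtual environment and activate it: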

conda create --name xtuner-env python=3.10 -y
conda activate xtuner-env

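"xtuner-env" is the name of the training environment; pick any name you like. In the rest of this tutorial, "the training environment" always means "xtuner-env".

Install xtuner with deepspeed integration, plus the related packages: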

pip install xtuner==0.1.23 timm==1.0.9
pip install 'xtuner[deepspeed]'
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121
pip install transformers==4.39.0 tokenizers==0.15.2 peft==0.13.2 datasets==3.1.0 accelerate==1.2.0 huggingface-hub==0.26.5

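The training environment is now installed.

1.2 Inference Environment Setup

Set up the environment needed for inference (I kept using my existing lagent environment instead):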
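Before moving on, it can help to confirm that the pins above actually took effect, since version drift turns out to matter later in these notes. The following is a minimal sketch, not part of the original tutorial: `EXPECTED`, `installed_version`, and `find_mismatches` are hypothetical names, and torch is left out on purpose because its installed version string normally carries a build suffix such as `+cu121`.

```python
from importlib import metadata

# Pins copied from the pip commands above (torch omitted, see the note above).
EXPECTED = {
    "xtuner": "0.1.23",
    "timm": "1.0.9",
    "transformers": "4.39.0",
    "peft": "0.13.2",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (wanted, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Run it inside the activated training environment; no output means every listed package matches its pin.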

conda create -n lmdeploy python=3.10 -y
conda activate lmdeploy
pip install lmdeploy==0.6.1 gradio==4.44.1 timm==1.0.9

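"lmdeploy" is the name of the environment used for inference.

2. LMDeploy Deployment

2.1 Basic LMDeploy Usage

Use the pipeline.chat interface to build a multi-turn conversation pipeline. The core code is: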

## 1. Import the required packages
from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
from lmdeploy.vl import load_image

## 2. Initialize the inference pipeline with your model
model_path = "your_model_path"
pipe = pipeline(model_path,
                backend_config=TurbomindEngineConfig(session_len=8192))

## 3. Load the image (reading it with PIL also works)
image = load_image('your_image_path')

## 4. Configure the generation parameters
gen_config = GenerationConfig(top_p=0.8, temperature=0.8)
## 5. Chat through the pipeline.chat interface, passing in the generation parameters
sess = pipe.chat(('describe this image', image), gen_config=gen_config)
print(sess.response.text)
## 6. Later turns must pass in the previous session so the model knows the conversation history
sess = pipe.chat('What is the woman doing?', session=sess, gen_config=gen_config)
print(sess.response.text)
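The session-threading pattern in steps 5 and 6 generalizes to any number of turns. Here is a small sketch of that idea; `run_conversation` is a hypothetical helper, and it assumes only what the snippet above shows: that `pipe.chat` accepts `session=` and `gen_config=` and returns a session whose `.response.text` holds the latest reply.

```python
def run_conversation(pipe, first_turn, follow_ups, gen_config=None):
    """Run a multi-turn chat, threading the session through every turn.

    `pipe` is assumed to expose chat(prompt, session=..., gen_config=...) and
    to return a session object with a `.response.text` attribute.
    """
    replies = []
    # First turn: no previous session, so the conversation starts fresh.
    session = pipe.chat(first_turn, gen_config=gen_config)
    replies.append(session.response.text)
    # Later turns: hand the returned session back in so history is preserved.
    for prompt in follow_ups:
        session = pipe.chat(prompt, session=session, gen_config=gen_config)
        replies.append(session.response.text)
    return replies
```

With the objects from the snippet above, a call might look like `run_conversation(pipe, ('describe this image', image), ['What is the woman doing?'], gen_config)`.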

2.2 Web Application Demo

First, try chatting with InternVL through a web page.

Clone this tutorial's GitHub repository, Control-derek/InternVL2-Tutorial:

git clone https://github.com/Control-derek/InternVL2-Tutorial.git
cd InternVL2-Tutorial

In demo.py, set MODEL_PATH to the path of InternVL2-2B. On an InternStudio dev machine no change is needed; otherwise change it to your own model path.

Start the demo:

conda activate lmdeploy
python demo.py

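It threw an error. Perhaps some package versions in my environment differed from the ones the tutorial specifies?

It really was a version problem; once that was fixed, the demo ran successfully.

Oddly, multi-turn conversation worked for me without any errors.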

3. XTuner Fine-Tuning Practice

3.1 Prepare the Base Config File

Activate my own training environment:

conda activate xtuner-env

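The original InternVL fine-tuning config files live under ./xtuner/configs/internvl/v2. Assuming the repository cloned above is at /root/InternVL2-Tutorial, copy the config file into the target directory: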

cp /root/InternVL2-Tutorial/xtuner_config/internvl_v2_internlm2_2b_lora_finetune_food.py /root/finetune/xtuner/xtuner/configs/internvl/v2/internvl_v2_internlm2_2b_lora_finetune_food.py
/root/finetune/xtuner/xtuner/configs/internvl/v2 # note: my xtuner is installed under the finetune directory!

/root/finetune/xtuner/xtuner/configs/internvl/v1_5/convert_to_official.py

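3.2 Config File Parameters Explained

The first part of the settings contains the following parameters:

path: path of the model to fine-tune; no change needed in the InternStudio environment.
data_root: directory containing the dataset.
data_path: path of the training data file.
image_folder: root directory of the training images.
prompt_template: the chat template, system prompt, and so on used during training. Use the one that matches the model; no change needed here.
max_length: maximum number of tokens per training sample.
batch_size: training batch size; adjust it to fit your GPU memory.
accumulative_counts: number of gradient accumulation steps, used to simulate a larger batch_size and improve training stability when GPU memory is limited.
dataloader_num_workers: number of subprocesses used for loading the dataset.
max_epochs: number of training epochs.
optim_type: optimizer type.
lr: learning rate.
betas: beta1 and beta2 of the Adam optimizer.
weight_decay: weight decay, used to reduce overfitting.
max_norm: maximum gradient norm used for gradient clipping.
warmup_ratio: warmup ratio, the fraction of training at the start during which the learning rate gradually increases.
save_steps: number of steps between checkpoint saves.
save_total_limit: maximum number of checkpoints to keep; -1 means no limit.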
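To make the interaction between batch_size, accumulative_counts, and warmup_ratio concrete, here is a rough sketch of the arithmetic. It is illustrative only: the function names are hypothetical, the numbers in the usage note are made up rather than taken from the course config, and the warmup is modeled as a plain linear ramp, which may not match the scheduler the config really uses.

```python
def effective_batch_size(batch_size, accumulative_counts, num_gpus=1):
    """Samples contributing to one optimizer update under gradient accumulation."""
    return batch_size * accumulative_counts * num_gpus


def linear_warmup_lr(step, total_steps, lr, warmup_ratio, start_factor=0.0):
    """Learning rate at `step`, assuming a linear ramp over the warmup phase.

    Only the warmup phase is modeled; after it ends this returns the peak `lr`
    (a real schedule would typically decay from there).
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if warmup_steps == 0 or step >= warmup_steps:
        return lr
    progress = step / warmup_steps
    return lr * (start_factor + (1.0 - start_factor) * progress)
```

For example, with hypothetical values batch_size=4 and accumulative_counts=2 on one GPU, each optimizer update sees 8 samples; halving batch_size and doubling accumulative_counts keeps that number unchanged while lowering peak memory use.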

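LoRA-related parameters:

r: rank of the low-rank matrices; it determines their dimensions.
lora_alpha: scaling factor that adjusts the weight of the low-rank update.
lora_dropout: dropout probability, used to reduce overfitting.

To resume training from a checkpoint, pass the following parameters at the very bottom of the config:

Set load_from to the checkpoint you want to load and set resume=True to resume training from that point.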
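A quick calculation shows why r and lora_alpha matter. The sketch below assumes the standard LoRA formulation, in which a frozen weight of shape d_out x d_in receives an update B·A scaled by lora_alpha / r, with A of shape r x d_in and B of shape d_out x r. The helper names and the numbers in the usage note are hypothetical, not values from the course config.

```python
def lora_trainable_params(d_in, d_out, r):
    """Parameter count of the two low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Multiplier applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / r


def trainable_fraction(d_in, d_out, r):
    """Share of a full d_out x d_in weight matrix that LoRA trains instead."""
    return lora_trainable_params(d_in, d_out, r) / (d_in * d_out)
```

For a hypothetical 2048 x 2048 projection with r=128, LoRA trains 524,288 parameters instead of 4,194,304, one eighth of the full matrix; with lora_alpha=256 the update would be scaled by 2.0.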
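When resuming, the checkpoint to load is usually the most recent one. Assuming checkpoints follow the iter_<N>.pth naming that appears later in these notes, a hypothetical helper like the one below could pick the latest entry from a work directory; the function name and the directory layout are assumptions.

```python
import re
from pathlib import Path


def latest_checkpoint(work_dir):
    """Return the iter_<N>.pth entry with the largest N in work_dir, or None."""
    pattern = re.compile(r"iter_(\d+)\.pth$")
    best = None
    best_iter = -1
    for path in Path(work_dir).glob("iter_*.pth"):
        match = pattern.match(path.name)
        # Compare numerically so iter_640 beats iter_64 and iter_128.
        if match and int(match.group(1)) > best_iter:
            best_iter = int(match.group(1))
            best = path
    return best
```

Its result, converted with `str(...)`, could then be assigned to load_from in the config alongside resume = True.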

3.3 Dataset Download

This uses the FoodieQA dataset; the paper was accepted to the EMNLP 2024 main conference. Its citation is:

@article{li2024foodieqa,
  title={FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture},
  author={Li, Wenyan and Zhang, Xinyu and Li, Jiaang and Peng, Qiwei and Tang, Raphael and Zhou, Li and Zhang, Weijia and Hu, Guimin and Yuan, Yifei and S{\o}gaard, Anders and others},
  journal={arXiv preprint arXiv:2406.11030},
  year={2024}
}

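FoodieQA is a dataset designed specifically for studying the food culture of China's regions. It contains a large number of food images and questions that help multimodal large models better understand the eating habits and cultural characteristics of different regions. Its release lets us explore the cultural meaning behind food in more depth.

Use the preprocessed dataset in the share directory directly.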

export PYTHONPATH=/root/internvl_course:$PYTHONPATH  # let Python find the packages installed under another path in the first step
export PATH=/root/internvl_course/bin:$PATH  # let the system find the command-line tools you installed
xtuner train /root/finetune/xtuner/xtuner/configs/internvl/v2/internvl_v2_internlm2_2b_lora_finetune_food.py --deepspeed deepspeed_zero2

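Not enough memory? I was still on the 30% A100 tier, so I had to unlock the 50% A100 tier.

After switching to the 50% A100, fine-tuning started successfully.

After fine-tuning, convert the model checkpoint into a format that is easier to test: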

python /root/finetune/xtuner/xtuner/configs/internvl/v1_5/convert_to_official.py /root/finetune/xtuner/xtuner/configs/internvl/v2/internvl_v2_internlm2_2b_lora_finetune_food.py ./work_dirs/internvl_v2_internlm2_2b_lora_finetune_food/iter_640.pth ./work_dirs/internvl_v2_internlm2_2b_lora_finetune_food/lr35_ep10/ # the output directory name is up to you

If you changed the hyperparameters, replace iter_xxx.pth with the checkpoint you actually want to convert. ./work_dirs/internvl_v2_internlm2_2b_lora_finetune_food/lr35_ep10/ is where the converted model checkpoint is saved; change it as you like.