Part 8: Tesla P40 + ChatGLM2 + LoRA

木卫二号Coding

Last modified 2023-08-05 11:30:32

License: CC 4.0 BY-SA

First published 2023-08-02 21:45:50

Article link: https://blog.csdn.net/hai4321/article/details/132072097

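This article shows how to fine-tune the ChatGLM2 model with LoRA on a Tesla P40 GPU under CentOS 7. It covers environment setup, data preparation, and parameter settings, then checks the initial results and outlines a plan to move to a larger dataset.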

Deployment environment

  OS: CentOS 7
  CPU: 14C28T
  GPU: Tesla P40 24 GB
  Driver: 515
  CUDA: 11.7
  cuDNN: 8.9.2.26

Goal

Verify that the P40 is a workable deployment target. This is a validation exercise only, done to learn LoRA fine-tuning.

Create the environment

conda create --name glm-tuning python=3.10
conda activate glm-tuning

Clone the project

git clone https://github.com/hiyouga/ChatGLM-Efficient-Tuning
cd ChatGLM-Efficient-Tuning

Install dependencies

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
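
After the install finishes, a quick import check can confirm that the environment is usable before starting a training run. The sketch below uses only the standard library. The package list is an assumption for illustration and is not taken from the article; requirements.txt in the repository is the authoritative list.

```python
import importlib.util

# Assumed package names for illustration; requirements.txt is authoritative.
ASSUMED_PACKAGES = ["torch", "transformers", "datasets", "accelerate", "peft"]


def missing_packages(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    print("missing:", missing if missing else "none")
```

Run it inside the activated glm-tuning environment; an empty result only means the packages are importable, not that their versions match what the project expects.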

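Prepare data (small test run; the project already provides tokenized data)

Extract the downloaded dataset into the data folder. After extraction the directory looks like this:
data/
├── dataset_info.json
└── self_cognition/
    ├── dev.json
    └── train.json
Next, edit dataset_info.json and add the following two entries so that the training framework can recognize the custom dataset. The leading comma continues from the last entry already in the file.
For this test, dev.json is identical to train.json. In production the two must be separate.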

,
"self_cognition_train": {
    "file_name": "self_cognition/train.json",
    "columns": {
        "prompt": "content",
        "query": "",
        "response": "summary",
        "history": ""
    }
},
"self_cognition_dev": {
    "file_name": "self_cognition/dev.json",
    "columns": {
        "prompt": "content",
        "query": "",
        "response": "summary",
        "history": ""
    }
}
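
The same two entries can also be added with a short script instead of editing the file by hand, which avoids comma mistakes. This is a minimal sketch, not part of the project: the helper names `register_datasets` and `check_records` are hypothetical, and it assumes that train.json and dev.json each hold a JSON list of objects with `content` and `summary` fields, as the column mapping above implies.

```python
import json
from pathlib import Path

# Column mapping copied from the entries shown above.
COLUMNS = {"prompt": "content", "query": "", "response": "summary", "history": ""}


def register_datasets(info_path, names_to_files):
    """Add one entry per dataset to dataset_info.json, keeping existing entries."""
    info_path = Path(info_path)
    info = json.loads(info_path.read_text(encoding="utf-8"))
    for name, file_name in names_to_files.items():
        info[name] = {"file_name": file_name, "columns": dict(COLUMNS)}
    info_path.write_text(
        json.dumps(info, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return info


def check_records(records):
    """Return the indexes of records that lack a 'content' or 'summary' field."""
    required = {"content", "summary"}
    return [i for i, rec in enumerate(records) if not required.issubset(rec)]


if __name__ == "__main__":
    register_datasets(
        "data/dataset_info.json",
        {
            "self_cognition_train": "self_cognition/train.json",
            "self_cognition_dev": "self_cognition/dev.json",
        },
    )
    records = json.loads(
        Path("data/self_cognition/train.json").read_text(encoding="utf-8")
    )
    print("records with missing fields:", check_records(records))
```

Note that the script rewrites dataset_info.json with its own indentation, so keep a backup if the original layout matters.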

Fine-tuning command (adjusted)

accelerate launch src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path  /models/chatglm2-6b \
    --dataset self_cognition_train \
    --finetuning_type lora \
    --output_dir self_cognition_lora \
    --overwrite_cache \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 2 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 1e-3 \
    --num_train_epochs 2.0 \
    --lora_rank 32 \
    --ddp_find_unused_parameters False \
    --source_prefix 你现在是一名销售员，根据以下商品标签生成一段有吸引力的商品广告词。 \
    --plot_loss \
    --fp16
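
To get a feel for what `--lr_scheduler_type cosine` does with `--learning_rate 1e-3`, the decay can be sketched in a few lines. This is an illustration of the standard half-cosine formula, not the trainer's own code. It assumes no warmup (the command sets none) and 40 total optimizer steps, the count that appears in the training log of this run.

```python
import math


def cosine_lr(step, total_steps, base_lr=1e-3):
    """Learning rate after `step` optimizer updates: half-cosine decay, no warmup."""
    progress = min(step / total_steps, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # 40 total steps is taken from this run's log; other runs will differ.
    for step in (0, 10, 20, 30, 40):
        print(step, f"{cosine_lr(step, 40):.2e}")
```

The rate starts at the configured value, passes through half of it at the midpoint, and reaches zero at the final step, so with a run this short most updates happen at a fairly high learning rate.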

If you change the dataset, clear the cache first. The cache directory is:
/root/.cache/huggingface/datasets
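
Clearing that directory can be scripted so it is easy to repeat between runs. A minimal sketch, assuming the default cache location shown above (for the root user, `~` expands to /root); the function name `clear_dataset_cache` is hypothetical. If the cache has been relocated, for example through the `HF_DATASETS_CACHE` environment variable, pass that path instead.

```python
import shutil
from pathlib import Path


def clear_dataset_cache(cache_dir="~/.cache/huggingface/datasets"):
    """Delete the cache directory if it exists; return True if it was removed."""
    path = Path(cache_dir).expanduser()
    if not path.is_dir():
        return False
    shutil.rmtree(path)  # removes every cached dataset, not only this project's
    return True


if __name__ == "__main__":
    print("cache removed:", clear_dataset_cache())
```

This deletes the cache for all datasets on the machine, so the next run of any project will rebuild its own cache from scratch.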

Tue Aug  1 10:45:02 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P40           Off  | 00000000:03:00.0 Off |                    0 |
| N/A   61C    P0   184W / 250W |  13503MiB / 23040MiB |     94%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
{'train_runtime': 73.3871, 'train_samples_per_second': 2.18, 'train_steps_per_second': 0.545, 'train_loss': 1.7150115966796875, 'epoch': 2.0}                                    
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [01:13<00:00,  1.83s/it]***** train metrics *****
  epoch                    =        2.0
  train_loss               =      1.715
  train_runtime            = 0:01:13.38
  train_samples_per_second =       2.18
  train_steps_per_second   =      0.545
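
The step count in this log can be cross-checked against the command-line settings. The sketch below is a simplified model of how a trainer turns batch settings into optimizer steps, not the framework's actual code. The sample count of 80 is not stated in the article; it is inferred from the log (2.18 samples/s × 73.39 s ≈ 160 samples over 2 epochs), and a single GPU is assumed because nvidia-smi lists only one.

```python
import math


def optimizer_steps(num_samples, per_device_batch, grad_accum, epochs, num_gpus=1):
    """Return (effective batch size, total optimizer steps) for one training run."""
    effective_batch = per_device_batch * grad_accum * num_gpus
    # Number of mini-batches the data loader yields per epoch.
    batches_per_epoch = math.ceil(num_samples / (per_device_batch * num_gpus))
    # One optimizer step is taken every `grad_accum` mini-batches.
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    return effective_batch, math.ceil(epochs * steps_per_epoch)


if __name__ == "__main__":
    # 80 samples is inferred from the log, not given by the author.
    print(optimizer_steps(80, per_device_batch=2, grad_accum=2, epochs=2.0))
```

With batch size 2 and 2 accumulation steps the effective batch is 4, which gives 20 steps per epoch and 40 steps over 2 epochs, matching the `40/40` in the progress bar.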

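Parameters: adjust them to fit your own hardware.
Temperature: I fitted the P40 with a DIY air cooler. It does not cool well, and the card heads toward 80 °C under load.
VRAM: about 14 GB in use (13503 MiB of 23040 MiB in the nvidia-smi output above).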
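
A back-of-the-envelope estimate helps explain the VRAM figure. The sketch below counts the trainable LoRA parameters for `--lora_rank 32` and the memory taken by the frozen base weights in half precision. Several inputs are assumptions, not values from the article: a hidden size of 4096, a fused query/key/value projection with 4608 outputs, 28 layers, roughly 6.2 billion base parameters, and LoRA applied only to that one projection per layer.

```python
def lora_trainable_params(rank, d_in, d_out, num_layers):
    """LoRA adds A (rank x d_in) and B (d_out x rank) per adapted weight matrix."""
    return num_layers * rank * (d_in + d_out)


def fp16_weight_gib(num_params):
    """Memory for the weights alone at 2 bytes per parameter, in GiB."""
    return num_params * 2 / 1024**3


# Assumed ChatGLM2-6B shapes; none of these numbers appear in the article.
HIDDEN_SIZE = 4096
QKV_OUT = 4608
NUM_LAYERS = 28
BASE_PARAMS = 6.2e9  # approximate

if __name__ == "__main__":
    trainable = lora_trainable_params(32, HIDDEN_SIZE, QKV_OUT, NUM_LAYERS)
    print(f"trainable LoRA parameters: {trainable:,}")
    print(f"frozen weights in fp16: {fp16_weight_gib(BASE_PARAMS):.1f} GiB")
```

Under these assumptions the adapter has under 8 million trainable parameters, a tiny fraction of the base model, while the frozen weights alone take about 11.5 GiB. The rest of the observed usage would then come from activations, adapter gradients and optimizer state, and the CUDA context, which is why the run fits comfortably in 24 GB.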

Model test

CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
    --model_name_or_path  /models/chatglm2-6b \
    --checkpoint_dir self_cognition_lora

python src/web_demo.py --checkpoint_dir self_cognition_lora --model_name_or_path  /models/chatglm2-6b

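Input: 你是谁 (Who are you?)
ChatGLM-6B: The dtype of attention mask (torch.int64) is not bool
I am AI小木, an AI assistant developed by 小吕. I can answer all kinds of questions, provide information, and even make small talk.

Input: 你是谁开发的 (Who developed you?)
ChatGLM-6B: I am not developed, I am an AI assistant developed by 小吕, designed to give users useful answers and help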
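
The two spot checks above can be turned into a repeatable test with simple keyword matching. This is only a sketch of one possible check, not something the project provides: the helper names, the required keywords, and the forbidden keyword are hypothetical choices, and `ask` stands for any callable that sends a prompt to the fine-tuned model and returns its reply as a string.

```python
def passes(answer, required, forbidden=()):
    """True if the answer contains every required keyword and no forbidden one."""
    has_required = all(keyword in answer for keyword in required)
    has_forbidden = any(keyword in answer for keyword in forbidden)
    return has_required and not has_forbidden


# Prompt -> keywords the reply should contain (hypothetical test cases).
CASES = [
    ("你是谁", ("AI小木", "小吕")),
    ("你是谁开发的", ("小吕",)),
]


def evaluate(ask, cases=CASES, forbidden=("ChatGLM",)):
    """Run every prompt through `ask` and report which replies pass."""
    return {
        prompt: passes(ask(prompt), required, forbidden)
        for prompt, required in cases
    }
```

Keyword matching only confirms that the new identity shows up in the reply. It says nothing about fluency, so the second reply in the transcript would pass even though its wording is awkward.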

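Summary

The result is acceptable. I kept all the parameters fairly small, so the run was quick, about 2 minutes (the log reports a train_runtime of roughly 73 s). After fine-tuning, the model's self-identity has changed as intended.
Next I plan to move to a larger dataset and then run a proper evaluation.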
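
Because dev.json was a copy of train.json in this test, a larger follow-up run would need a real held-out split before any evaluation means much. A minimal sketch of such a split follows. The function name, the 10% dev fraction, and the fixed seed are illustrative choices, and the records are assumed to be a list loaded from a single JSON file.

```python
import random


def split_records(records, dev_fraction=0.1, seed=42):
    """Shuffle a copy of the records and return (train, dev) with no overlap."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_dev = max(1, int(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]


if __name__ == "__main__":
    train, dev = split_records(list(range(80)))
    print(len(train), "train records,", len(dev), "dev records")
```

The two lists can then be written to train.json and dev.json with `json.dump`, so that the dev set measures generalization rather than memorization.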

Note: remember to specify --model_name_or_path /models/chatglm2-6b.

Reference

https://hub.nuaa.cf/hiyouga/ChatGLM-Efficient-Tuning/blob/main/examples/ads_generation.md