Deploying a PyTorch Model in a Docker Container with GPU Acceleration

Original article, last modified 2023-05-09 10:58:44


License: CC 4.0 BY-SA

Tags: #docker #pytorch #containers

First published 2022-01-24 19:57:39

Reference articles
https://www.zhihu.com/search?type=content&q=Docker%EF%BC%8C%E6%95%91%E4%BD%A0%E4%BA%8E%E3%80%8C%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E7%8E%AF%E5%A2%83%E9%85%8D%E7%BD%AE%E3%80%8D%E7%9A%84%E8%8B%A6%E6%B5%B7
https://blog.csdn.net/zhouchen1998/article/details/110679750

Step 1: Install Docker

https://docs.docker.com/engine/install/centos/

Step 2: Install nvidia-docker

Docker on its own cannot give containers access to the GPU, so NVIDIA provides nvidia-docker (the NVIDIA Container Toolkit) to let Docker containers use NVIDIA GPUs.
GitHub: https://github.com/NVIDIA/nvidia-docker
Installation guide: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

Step 3: Move Docker's default image storage location to avoid running out of disk space

https://www.cnblogs.com/bigberg/p/8057807.html

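cd /etc/systemd/system/multi-user.target.wants
vim docker.service

In docker.service, change the ExecStart line to:

ExecStart=/usr/bin/dockerd --graph=/data/docker --storage-driver=overlay --registry-mirror=https://jxus37ad.mirror.aliyuncs.com

--graph=/data/docker: the new storage location for Docker data (recent Docker releases replace this flag with --data-root)
--storage-driver=overlay: the storage driver Docker uses
--registry-mirror: a registry mirror used to speed up image pulls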

Restart Docker, then use docker info to confirm the new settings:

systemctl daemon-reload
systemctl restart docker
docker info
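
To check the result without reading the whole `docker info` dump, the two relevant fields can be pulled out of its JSON form. This is a minimal sketch, assuming `docker info --format '{{json .}}'` is available and reports the `DockerRootDir` and `Driver` keys. The helper names `storage_settings` and `read_docker_info` and the sample values are illustrative, not part of the original setup.

```python
import json
import subprocess


def storage_settings(info_json: str) -> tuple:
    """Return (root_dir, storage_driver) from `docker info` JSON text."""
    info = json.loads(info_json)
    return info.get("DockerRootDir"), info.get("Driver")


def read_docker_info() -> str:
    """Run `docker info` and return its JSON output (needs a running daemon)."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{json .}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample, trimmed to the two fields of interest.
sample = '{"Driver": "overlay", "DockerRootDir": "/data/docker"}'
root_dir, driver = storage_settings(sample)
print(root_dir, driver)
```

On the server, `storage_settings(read_docker_info())` should report `/data/docker` as the root directory once the restart has picked up the new ExecStart line.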

Step 4: Check the server's GPU model and driver version

nvidia-smi
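
The same information can be collected in a script-friendly form. The sketch below assumes the `--query-gpu` and `--format=csv,noheader` options of `nvidia-smi`; the function names and the sample line (GPU name, driver version, memory size) are made up for illustration.

```python
import subprocess

# Ask nvidia-smi for one CSV line per GPU instead of the full table.
QUERY = ["nvidia-smi", "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"]


def parse_gpu_csv(text: str) -> list:
    """Turn the CSV lines into a list of dicts, one per GPU."""
    gpus = []
    for line in text.strip().splitlines():
        # rsplit keeps a GPU name intact even if it contains a comma.
        name, driver, memory = [f.strip() for f in line.rsplit(",", 2)]
        gpus.append({"name": name, "driver": driver, "memory": memory})
    return gpus


def query_gpus() -> list:
    """Run nvidia-smi on this machine and parse its output."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


# Hypothetical output line, used here so the parser can be tried without a GPU.
sample = "Tesla T4, 418.87.00, 15079 MiB\n"
print(parse_gpu_csv(sample))
```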

Step 5: Pull a CUDA-enabled PyTorch base image that matches the GPU model and driver

https://hub.docker.com/r/pytorch/pytorch/tags?page=1&ordering=last_updated&name=10.1
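
One way to make "matching" concrete: the `CUDA Version` shown in the `nvidia-smi` header is the highest CUDA version the installed driver supports, so the CUDA version in the image tag should not exceed it. The sketch below encodes that comparison. The function names and the sample header line are assumptions for illustration, and very old `nvidia-smi` versions that do not print a CUDA version are treated as "no match".

```python
import re


def driver_cuda_version(nvidia_smi_output: str):
    """Extract the 'CUDA Version: X.Y' value from the nvidia-smi header."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def image_cuda_version(tag: str):
    """Extract the CUDA version embedded in an image tag such as ...-cuda10.1-..."""
    match = re.search(r"cuda(\d+)\.(\d+)", tag)
    return (int(match.group(1)), int(match.group(2))) if match else None


def image_fits_driver(tag: str, nvidia_smi_output: str) -> bool:
    """True when the image's CUDA version is not newer than the driver supports."""
    driver = driver_cuda_version(nvidia_smi_output)
    image = image_cuda_version(tag)
    if driver is None or image is None:
        return False
    return image <= driver


# Hypothetical header line as printed at the top of the nvidia-smi table.
header = "| NVIDIA-SMI 418.87.00   Driver Version: 418.87.00   CUDA Version: 10.1 |"
print(image_fits_driver("pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime", header))
```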


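Step 6: Build the image from a Dockerfile

xx is the image name.
xx_xx is the entry-point module, i.e. the Python module that defines the app object served by gunicorn.

Run the build from the directory that contains the Dockerfile shown below:

docker build -t xx:1.0.0 .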

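FROM pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime

# Copy the project (code, requirements.txt, gunicorn_cfg.py) into the image
COPY . /deploy
WORKDIR /deploy

# Switch apt and pip to mirrors, then install the Python dependencies.
# The sed pattern only rewrites deb.debian.org entries in sources.list.
RUN sed -i s@/deb.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list \
    && apt-get clean \
    && apt-get update \
    && pip config set global.index-url https://mirror.baidu.com/pypi/simple \
    && pip install --upgrade setuptools \
    && pip install --upgrade pip \
    && pip install -r requirements.txt

# Document the port used by the service
EXPOSE 9535
# Start gunicorn with the app object from module xx_xx
ENTRYPOINT ["gunicorn", "-c", "gunicorn_cfg.py", "xx_xx:app"]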
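
The ENTRYPOINT expects a module `xx_xx` that exposes a WSGI object named `app`. The article does not show that module, so the following is only a placeholder sketch of its shape: a plain WSGI callable with a hypothetical `/health` route that reports whether PyTorch sees a GPU. A real service would load the model and expose a prediction route instead.

```python
# xx_xx.py (hypothetical entry module; "xx_xx" is the placeholder name used above)
import json


def gpu_status() -> dict:
    """Report whether PyTorch can see a GPU; degrades gracefully without torch."""
    try:
        import torch
    except ImportError:
        return {"torch": None, "cuda_available": False}
    return {"torch": str(torch.__version__),
            "cuda_available": bool(torch.cuda.is_available())}


def app(environ, start_response):
    """Minimal WSGI callable that gunicorn can serve as xx_xx:app."""
    if environ.get("PATH_INFO") == "/health":
        status, payload = "200 OK", gpu_status()
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]
```

A matching `gunicorn_cfg.py` could be as small as `bind = "0.0.0.0:9535"` and `workers = 1`; both values are assumptions chosen to line up with the EXPOSE line, and gunicorn itself has to be listed in requirements.txt.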

Step 7: Run the image to create a container, making sure GPU access is enabled

xxx is the container name and xx is the image name.

docker run --gpus all --name xxx --net host -d xx:1.0.0
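
When the container is started from a deployment script rather than by hand, it helps to assemble the command as an argument list. This is a small sketch with a hypothetical helper `docker_run_command`; it only builds the command shown above and does not run it.

```python
import shlex


def docker_run_command(image, name, gpus="all", host_network=True):
    """Build the `docker run` argument list for a detached GPU container."""
    cmd = ["docker", "run", "--gpus", gpus, "--name", name]
    if host_network:
        # Share the host's network stack, as in the command above.
        cmd += ["--net", "host"]
    cmd += ["-d", image]
    return cmd


cmd = docker_run_command("xx:1.0.0", "xxx")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would start the container. Using `gpus="device=0"` instead of `"all"` restricts the container to the first GPU. With `--net host` no `-p` port mapping is needed, so the service is reachable on the host at whatever port gunicorn binds.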

Step 8: Enter the container and verify that the GPU is visible

docker exec -it xxx /bin/bash

nvidia-smi

If nvidia-smi prints the usual GPU table inside the container, the configuration succeeded.

Step 9: Quickly verify that the GPU build of PyTorch works

If CUDA is reported as unavailable, another application may be occupying the GPU.

python
import torch
print(torch.__version__)
print(torch.version.cuda)
print(torch.backends.cudnn.version())
torch.cuda.is_available()
torch.cuda.device_count()
torch.cuda.get_device_name(0)
torch.cuda.current_device()
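
For a non-interactive check, the same calls can be wrapped in a script. The sketch below assumes a hypothetical file name such as `check_gpu.py` inside the container. Beyond the calls above it also allocates a small tensor on the GPU, since `is_available()` alone does not prove that memory can actually be allocated.

```python
def cuda_report() -> dict:
    """Collect the PyTorch/CUDA facts checked above into one dict."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False}

    report = {
        "torch_installed": True,
        "torch_version": str(torch.__version__),
        "cuda_build": torch.version.cuda,          # None on CPU-only builds
        "cuda_available": bool(torch.cuda.is_available()),
        "device_count": torch.cuda.device_count(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        # Allocate on the GPU and reduce; this raises if the device is unusable.
        x = torch.ones(1024, device="cuda")
        report["gpu_sum"] = float(x.sum().item())
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```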


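Reference notes

A Dockerfile is used to build an image, and containers are created from that image. The image can also be pushed to a remote registry and pulled from there.

Docker Hub is the official image registry and hosts base images for many applications.

The PyTorch images can be pulled directly and already have Python installed:
https://hub.docker.com/r/pytorch/pytorch/tags
Register an account first and log in with docker login before pulling, otherwise the pull may fail with a permission error such as: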

unauthorized: authentication required
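
In an automated pull, this error is easy to miss among other output. The sketch below wraps `docker pull` and turns that specific message into an explicit hint to log in. The function name `pull_image` and the injectable `run` parameter are illustrative choices so the logic can be exercised without Docker.

```python
import subprocess

AUTH_ERROR = "unauthorized: authentication required"


def pull_image(image, run=subprocess.run):
    """Pull an image; raise a clear error when the registry demands a login."""
    result = run(["docker", "pull", image], capture_output=True, text=True)
    if result.returncode == 0:
        return True
    stderr = result.stderr or ""
    if AUTH_ERROR in stderr:
        raise PermissionError(
            f"Pull of {image} was rejected; run `docker login` first, then retry."
        )
    raise RuntimeError(stderr.strip() or "docker pull failed")
```

Calling `pull_image("pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime")` on the server uses the real `docker` CLI; any other failure is surfaced with the original error text.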


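Annotated version of the PyTorch GPU check

import torch

# PyTorch version
print(torch.__version__)
# CUDA version this PyTorch build was compiled against
print(torch.version.cuda)
# cuDNN version
print(torch.backends.cudnn.version())

# Whether CUDA is available
print(torch.cuda.is_available())

# Number of GPUs
print(torch.cuda.device_count())

# Name of the GPU; device indices start at 0
print(torch.cuda.get_device_name(0))

# Index of the current device
print(torch.cuda.current_device())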