[Solved] RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR Caught RuntimeError in replica 2 on devi

Problem description

The error appeared when I ran this training command:

python train.py --batch_size 100 --max_epochs 60 --runname train --wm_batch_size 2 --wmtrain

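The error was raised during training, right as epoch 0 started. The full output is below. I will point out the parts that bear directly on the fix, and following that thread is enough to solve the problem: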

==> Preparing data..
Using CIFAR10 dataset.
Files already downloaded and verified
Files already downloaded and verified
Loading watermark images
==> Building model..
Using CUDA
Parallel training on 3 GPUs.

/home/visionx/anaconda3/envs/waterknn/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py:32: UserWarning: 
    There is an imbalance between your GPUs. You may want to exclude GPU 2 which
    has less than 75% of the memory or cores of GPU 0. You can do so by setting
    the device_ids argument to DataParallel, or by setting the CUDA_VISIBLE_DEVICES
    environment variable.
  warnings.warn(imbalance_warn.format(device_ids[min_pos], device_ids[max_pos]))
WM acc:
 [=========================== 50/50 =============================>.]  Step: 17ms | Tot: 969ms | Loss: 2.304 | Acc: 8.000% (8/100)       

Epoch: 0
Traceback (most recent call last):
  File "train.py", line 96, in <module>
    trainloader, device, wmloader)
  File "/home/visionx/project/WatermarkNN/trainer.py", line 48, in train
    outputs = net(inputs)
  File "/home/visionx/anaconda3/envs/waterknn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/visionx/anaconda3/envs/waterknn/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 171, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/visionx/anaconda3/envs/waterknn/lib/python3.7/site-packages/torch/nn/parallel/data_parallel
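The UserWarning in the log above suggests excluding the weaker GPU, either through the device_ids argument to DataParallel or through CUDA_VISIBLE_DEVICES. The following is a minimal sketch of the device_ids route, not necessarily the fix used for this project. The helper names pick_balanced_devices and build_parallel_model are hypothetical. The 0.75 ratio is taken from the warning text. The sketch compares memory only, while the warning also mentions cores.

```python
def pick_balanced_devices(total_memory, ratio=0.75):
    """Return indices of GPUs whose memory is at least `ratio` of the largest.

    `total_memory` is a list of per-GPU memory sizes (any unit), indexed by
    device id. The 0.75 default mirrors the threshold quoted in the warning.
    This helper is hypothetical and is not part of PyTorch.
    """
    if not total_memory:
        return []
    largest = max(total_memory)
    return [i for i, mem in enumerate(total_memory) if mem >= ratio * largest]


def build_parallel_model(net):
    """Wrap `net` in DataParallel on the balanced GPUs only (needs PyTorch)."""
    import torch  # imported lazily so the helper above works without PyTorch

    count = torch.cuda.device_count()
    memory = [torch.cuda.get_device_properties(i).total_memory
              for i in range(count)]
    device_ids = pick_balanced_devices(memory)
    if not device_ids:
        return net  # no CUDA devices found; leave the model untouched
    # DataParallel expects the module to live on the first listed device.
    net = net.to(f"cuda:{device_ids[0]}")
    return torch.nn.DataParallel(net, device_ids=device_ids)
```

The other route named in the warning needs no code change: prefix the training command with CUDA_VISIBLE_DEVICES=0,1 so that only GPUs 0 and 1 are visible to the process. Whether either route clears the cuDNN error in replica 2 is an assumption to verify on the affected machine.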
