Attention mechanism code study notes: pitfalls, part 1 (2025-08-21)

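1. Installing ortools can change the numpy version in the environment. If errors caused by this appear after the installation, downgrade numpy. (Note: the downgrade may cause other compatibility problems; none have shown up as of 2025-08-21.)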
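One way to see what the installation changed is to record the installed versions before and after installing ortools. The sketch below uses only the standard library; the function name installed_versions and the default package list are illustrative assumptions, not part of the project code.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("numpy", "ortools", "torch")):
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Run once before and once after installing ortools, then compare the output.
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing the two printouts shows which numpy version to go back to if a downgrade turns out to be necessary.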

2. After the installation you are prompted to restart the kernel. Close PyCharm and start it again.

3. Error: TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

numpy can only work with tensors that live in CPU (host) memory, so the call has to be changed as follows:

Before: t, p = ttest_rel(rewards.numpy(), bl_vals.numpy())
After: t, p = ttest_rel(rewards.cpu().numpy(), bl_vals.cpu().numpy())
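If the same conversion is needed in several places, it can be wrapped in a small helper. The sketch below is an assumption, not code from the project: to_host_array is a hypothetical name, and it is duck-typed so that it does not need to import torch. It relies only on the Tensor.detach(), Tensor.cpu() and Tensor.numpy() methods; detach() is included because numpy() also refuses tensors that require gradients.

```python
def to_host_array(value):
    """Convert a tensor on any device to a NumPy array; pass other values through.

    Hypothetical helper: works on any object that offers detach(), cpu()
    and numpy(), which is the case for torch.Tensor.
    """
    if hasattr(value, "detach"):
        # Drop the autograd graph so that numpy() is allowed.
        value = value.detach()
    if hasattr(value, "cpu"):
        # Copy from GPU to host memory (returns the same tensor if already on CPU).
        value = value.cpu()
    if hasattr(value, "numpy"):
        return value.numpy()
    return value


# Usage with the call from these notes (rewards and bl_vals are the
# tensors from the training script):
# t, p = ttest_rel(to_host_array(rewards), to_host_array(bl_vals))
```

Because the helper also accepts values that are already on the CPU, the same call works whether training runs on the GPU or not.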

4. Output of the first run:
 

Ep.#  1/20 : 100%|██████████| 1000/1000 [01:37<00:00, 10.22it/s, l=89.42 p=  0.01327 val= -2967 bl=-16.49 |g|=101.1]
Ep.#  2/20 : 100%|██████████| 1000/1000 [01:36<00:00, 10.34it/s, l=103.7 p= 0.003209 val= -2985 bl=-17.68 |g|=92.83]
Ep.#  3/20 : 100%|██████████| 1000/1000 [01:36<00:00, 10.32it/s, l=91.95 p= 0.009845 val= -2916 bl=-18.12 |g|=126.7]
Ep.#  4/20 : 100%|██████████| 1000/1000 [01:36<00:00, 10.39it/s, l=100.9 p= 0.005806 val= -2856 bl=   -20 |g|=131.7]
Ep.#  5/20 : 100%|██████████| 1000/1000 [01:35<00:00, 10.42it/s, l=89.84 p=  0.01151 val= -2871 bl=   -17 |g|=120.7]
Ep.#  6/20 : 100%|██████████| 1000/1000 [01:35<00:00, 10.42it/s, l=89.44 p= 0.009129 val= -2889 bl=-17.54 |g|=171.3]
Ep.#  7/20 : 100%|██████████| 1000/1000 [01:37<00:00, 10.29it/s, l=98.99 p=  0.01476 val= -2969 bl=-19.08 |g|=138]
Ep.#  8/20 : 100%|██████████| 1000/1000 [01:37<00:00, 10.23it/s, l=107.8 p= 0.009697 val= -3064 bl= -19.3 |g|=165.5]
Ep.#  9/20 : 100%|██████████| 1000/1000 [01:39<00:00, 10.08it/s, l=92.96 p= 0.007804 val= -2872 bl=-19.48 |g|=126.1]
Ep.# 10/20 : 100%|██████████| 1000/1000 [01:41<00:00,  9.81it/s, l=90.63 p= 0.006603 val= -3014 bl=-19.52 |g|=119.8]
Ep.# 11/20 : 100%|██████████| 1000/1000 [01:41<00:00,  9.82it/s, l=92.18 p= 0.009259 val= -2945 bl=-18.77 |g|=152.1]
Ep.# 12/20 : 100%|██████████| 1000/1000 [01:39<00:00, 10.08it/s, l=113.5 p= 0.002143 val= -3051 bl=-18.64 |g|=182.8]
Ep.# 13/20 : 100%|██████████| 1000/1000 [01:36<00:00, 10.38it/s, l=103.7 p=  0.01128 val= -3032 bl=-17.91 |g|=177]
Ep.# 14/20 : 100%|██████████| 1000/1000 [01:37<00:00, 10.30it/s, l=115.1 p= 0.004059 val= -2999 bl=-18.97 |g|=145.4]
Ep.# 15/20 : 100%|██████████| 1000/1000 [01:40<00:00,  9.94it/s, l=128.6 p=0.0007742 val= -3113 bl=-19.28 |g|=114.1]
Ep.# 16/20 : 100%|██████████| 1000/1000 [01:38<00:00, 10.15it/s, l=112.8 p= 0.002212 val= -3010 bl=-19.14 |g|=143.5]
Ep.# 17/20 : 100%|██████████| 1000/1000 [01:35<00:00, 10.47it/s, l=126.4 p= 0.001533 val= -3011 bl=-17.95 |g|=164.2]
Ep.# 18/20 : 100%|██████████| 1000/1000 [01:36<00:00, 10.39it/s, l=98.85 p= 0.006564 val= -3054 bl=   -20 |g|=146.4]
Ep.# 19/20 : 100%|██████████| 1000/1000 [01:35<00:00, 10.45it/s, l=94.07 p= 0.009258 val= -3075 bl=-18.17 |g|=86.24]
Ep.# 20/20 : 100%|██████████| 1000/1000 [01:35<00:00, 10.45it/s, l=93.41 p= 0.006309 val= -2901 bl=   -18 |g|=92.53]
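To compare runs (for example before and after changing the model size), the progress lines above can be parsed into numbers. This is a minimal sketch under the assumption that every line has the layout shown in this log. The field names l, p, val, bl and |g| are copied from the log as-is; what they mean is not stated in these notes. The names parse_progress_line and parse_log are hypothetical.

```python
import re

# A signed decimal number, optionally with an exponent.
_NUM = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"

_LINE = re.compile(
    rf"Ep\.#\s*(?P<epoch>\d+)/(?P<total>\d+).*?"
    rf"\bl=\s*(?P<l>{_NUM})\s+p=\s*(?P<p>{_NUM})\s+"
    rf"val=\s*(?P<val>{_NUM})\s+bl=\s*(?P<bl>{_NUM})\s+"
    rf"\|g\|=\s*(?P<g>{_NUM})"
)


def parse_progress_line(line):
    """Return the fields of one progress line as a dict, or None if it does not match."""
    match = _LINE.search(line)
    if match is None:
        return None
    record = {"epoch": int(match["epoch"]), "total": int(match["total"])}
    for key in ("l", "p", "val", "bl", "g"):
        record[key] = float(match[key])
    return record


def parse_log(text):
    """Parse every matching line of a multi-line log into a list of dicts."""
    records = (parse_progress_line(line) for line in text.splitlines())
    return [record for record in records if record is not None]


# Usage, assuming the log was saved to a text file of your choice:
# records = parse_log(open("train.log", encoding="utf-8").read())
# vals = [record["val"] for record in records]
# print(min(vals), max(vals), sum(vals) / len(vals))
```

Looking at the minimum, maximum and mean of val over the epochs makes it easier to judge whether a change such as a larger model actually moves the result.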

Improvement: 1. Increase the model size (modelsize).
