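1. System environment
Hardware environment (Ascend/GPU/CPU): Ascend
Execution mode: static graph, MindSpore 2.1
Python version: 3.7
Operating system: Linux
2. Error information
2.1 Problem description
Developing a model with MindSpore raises the following error: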
TypeError: Multiply values for specific argument: query_embeds
2.2 Error message
Traceback (most recent call last):
File "/a.py", line 95, in <module>
output = model(img_tensor)
File "/root/miniconda3/envs/MindSpore2.1.1py39/lib/python3.9/site-packages/mindspore/nn/cell.py", line 637, in __call__
out = self.compile_and_run(*args, **kwargs)
File "/root/miniconda3/envs/MindSpore2.1.1py39/lib/python3.9/site-packages/mindspore/nn/cell.py", line 961, in compile_and_run
self.compile(*args, **kwargs)
File "/root/miniconda3/envs/MindSpore2.1.1py39/lib/python3.9/site-packages/mindspore/nn/cell.py", line 938, in compile
_cell_graph_executor.compile(self, phase=self.phase,
File "/root/miniconda3/envs/MindSpore2.1.1py39/lib/python3.9/site-packages/mindspore/common/api.py", line 1623, in compile
result = self._graph_executor.compile(obj, args, kwargs, phase, self._use_vm_mode())
TypeError: Multiply values for specific argument: query_embeds
----------------------------------------------------
- Ascend Warning Message:
----------------------------------------------------
W29999: [Init][LoadAscendCustomOppPath] ASCEND_CUSTOM_OPP_PATH[/root/miniconda3/envs/MindSpore2.1.1py39/lib/python3.9/site-packages/mindspore/run_check/../lib/plugin/ascend/custom_aicpu_ops] is not exist or not abstract path.[FUNC:ResolveOpImplPath][FILE:configuration.cc][LINE:683]
----------------------------------------------------
- The Traceback of Net Construct Code:
----------------------------------------------------
The function call stack (See file '/rank_0/om/analyze_fail.ir' for more details. Get instructions about `analyze_fail.ir` at https://www.mindspore.cn/search?inputValue=analyze_fail.ir):
# 0 In file /a.py:75
output = self.qformer(
----------------------------------------------------
- C++ Call Stack: (For framework developers)
----------------------------------------------------
mindspore/core/ir/func_graph_extends.cc:172 GenerateKwParams
The code snippet that triggers the error:
def construct(self, img_tensor: ms.Tensor):
    """
    img_tensor: [bs, c, h, w]
    """
    img_embeds = self.vmodel(img_tensor)
    img_atts = ms.Tensor(np.ones(img_embeds.shape[:-1]), dtype=ms.float32)
    output = self.qformer(
        query_embeds=self.query_tokens,
        encoder_hidden_states=img_embeds,
        encoder_attention_mask=img_atts
    )
    output = self.pangu_proj(output)
    return output
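3. Root cause analysis
The error message says that query_embeds received multiple values, although it needs only one. The first thing to check is therefore the argument itself, self.query_tokens. Printing it shows that self.query_tokens is a single Tensor, so no multiple values are being passed.
At this point the investigation should not stay fixed on the reported error, because a different problem may be triggering this misleading message. Reading through the script, we find the following statement just above the failing call: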
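For comparison, this is what a "multiple values" error normally means in plain Python: one parameter is bound twice, once positionally and once by keyword. The sketch below is illustration only. The function `qformer` here is a hypothetical stand-in, not the real Q-Former, and the MindSpore graph compiler produces its own wording for the message.

```python
def qformer(query_embeds, encoder_hidden_states=None):
    """Hypothetical stand-in that only mimics the keyword names used above."""
    return query_embeds


# Passing the argument once, by keyword, is fine.
result = qformer(query_embeds="tokens")

# Binding the same parameter positionally and by keyword raises TypeError.
try:
    qformer("tokens", query_embeds="tokens")
except TypeError as err:
    message = str(err)

print(message)
```

The call in the failing script passes query_embeds only once, by keyword, which is why the message does not match the code at the reported line.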
img_atts = ms.Tensor(np.ones(img_embeds.shape[:-1]), dtype=ms.float32)
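The likely cause is that, in graph mode, the Cell's construct method calls NumPy (np.ones) to build the tensor.
4. Solution
Modify the code as follows.
Original code: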
img_atts = ms.Tensor(np.ones(img_embeds.shape[:-1]), dtype=ms.float32)
Modified code:
img_atts = ms.ops.ones(img_embeds.shape[:-1], ms.float32)
After this change, running the script again succeeds and the model runs end to end.