[DIPU] The _amp_update_scale_ operator does not handle 0-dim (dim=0) tensors #535

Open
Reinerzhou opened this issue Dec 15, 2023 · 0 comments
Labels
DIPU DIPU related

Comments

@Reinerzhou
Member

Background

export DIPU_MOCK_CUDA=True
With this environment variable set, running llama_finetune fails with an error when it reaches the _amp_update_scale_ operator.

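Problem description

With export DIPU_MOCK_CUDA=True set, run the following code:

```python
import torch
import torch_dipu  # registers the DIPU backend; 'cuda' is mocked via DIPU_MOCK_CUDA

from torch import tensor

# All three state tensors are 0-dim (scalar) tensors, as created by GradScaler.
_scale = tensor(65536., device='cuda')
found_inf_combined = tensor(0., device='cuda')
_growth_tracker = tensor(0, device='cuda', dtype=torch.int32)

_growth_factor = 2.0
_backoff_factor = 0.5
_growth_interval = 2000

# In-place update of _scale and _growth_tracker.
torch._amp_update_scale_(_scale, _growth_tracker, found_inf_combined, _growth_factor, _backoff_factor, _growth_interval)
```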

This raises the following error:
[Screenshot: 企业微信截图_17026103414188]

Preliminary diagnosis: the logic at the following location does not handle 0-dim (dim=0) input tensors:

https://github.com/DeepLink-org/deeplink.framework/blob/16e155d65f2a5e56d703b3e6acf3d9036b5acb1b/dipu/torch_dipu/csrc_dipu/aten/ops/CustomFallbackFunctionsForAmpGradScaler.cpp#L74C1-L103C2
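For reference, here is a minimal sketch of the result the repro should produce on a working backend, written with plain Python scalars so it runs without any device. It follows my understanding of the upstream PyTorch update rule (back off when an inf/NaN was found, otherwise grow after `growth_interval` consecutive clean steps). The function name `update_scale` and the `FLOAT32_MAX` guard are illustrative assumptions, not part of DIPU or PyTorch.

```python
import math

# Assumption: the scale is stored as float32, so growth is skipped on overflow.
FLOAT32_MAX = 3.4028234663852886e38  # largest finite float32 value


def update_scale(scale, growth_tracker, found_inf,
                 growth_factor, backoff_factor, growth_interval):
    """Hypothetical scalar model of one _amp_update_scale_ step.

    Returns (new_scale, new_growth_tracker).
    """
    if found_inf != 0:
        # An inf/NaN gradient was seen: shrink the scale and reset the streak.
        return scale * backoff_factor, 0

    successful = growth_tracker + 1
    if successful == growth_interval:
        candidate = scale * growth_factor
        # Grow only if the new scale is still representable as a finite float32.
        if math.isfinite(candidate) and abs(candidate) <= FLOAT32_MAX:
            scale = candidate
        return scale, 0

    # Clean step, but the growth interval has not been reached yet.
    return scale, successful


# Same inputs as the repro above.
result = update_scale(65536.0, 0, 0.0, 2.0, 0.5, 2000)
print(result)  # (65536.0, 1)
```

With the inputs from the repro, the scale should stay at 65536 and the growth tracker should become 1, so any fix for the 0-dim case can be checked against these values.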

@Reinerzhou Reinerzhou added the DIPU DIPU related label Dec 15, 2023
@Reinerzhou Reinerzhou changed the title on Jan 8, 2024 (added a space after "[DIPU]"): [DIPU]_amp_update_scale_算子未对dim=0的tensor做判断处理 → [DIPU] _amp_update_scale_算子未对dim=0的tensor做判断处理
NeosZhang pushed a commit to DeepLink-org/deeplink.framework.dev that referenced this issue Jan 18, 2024