NewCclCommMgr on runtime #10617
base: master
Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.
View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10617/
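Problem description:
Running import oneflow creates a number of global objects. The section of code linked below creates the NewCclCommMgr object: https://github.com/Oneflow-Inc/oneflow/blob/master/oneflow/core/job/env_global_objects_scope.cpp#L200-L208
However, when a non-CUDA device is used, the NewCclCommMgr object cannot be created at that point, which causes problems at the runtime stage. This change makes the runtime stage also try to create the NewCclCommMgr object.
Note: creating the NewCclCommMgr object depends on RegisterEagerCclCommMgrType.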
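To illustrate the idea of retrying creation at runtime, here is a minimal, self-contained sketch. It is not OneFlow's actual code: `CommMgr`, `Registry`, `RegisterCommMgrType`, `GlobalCommMgr`, `TryCreateCommMgr`, and `DemoCommMgr` are all hypothetical names chosen for illustration, and the assumption is simply that a manager can only be built once a factory for the device type has been registered.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a collective-communication manager interface.
struct CommMgr {
  virtual ~CommMgr() = default;
  virtual std::string Name() const = 0;
};

using Factory = std::function<std::unique_ptr<CommMgr>()>;

// Hypothetical registry mapping a device type to a manager factory.
std::map<std::string, Factory>& Registry() {
  static std::map<std::string, Factory> registry;
  return registry;
}

// Hypothetical registration hook (plays the role a registration call
// would play for a real device backend).
void RegisterCommMgrType(const std::string& device_type, Factory factory) {
  Registry()[device_type] = std::move(factory);
}

// Hypothetical global slot holding the single manager instance.
std::unique_ptr<CommMgr>& GlobalCommMgr() {
  static std::unique_ptr<CommMgr> mgr;
  return mgr;
}

// Tries to create the global manager. Safe to call both at environment
// initialization and again at runtime: it is a no-op if the manager
// already exists, and it fails softly if no factory is registered yet.
// Returns true if a manager exists after the call.
bool TryCreateCommMgr(const std::string& device_type) {
  if (GlobalCommMgr()) { return true; }
  auto it = Registry().find(device_type);
  if (it == Registry().end()) { return false; }
  GlobalCommMgr() = it->second();
  return GlobalCommMgr() != nullptr;
}

// Hypothetical manager for a non-CUDA device, used only for the demo.
struct DemoCommMgr : CommMgr {
  std::string Name() const override { return "demo"; }
};
```

In this sketch, a first call made before the device backend has registered its factory returns false and leaves the global slot empty; a later call at runtime, after registration, fills it. Whether the real fix needs additional synchronization or ordering guarantees is outside what this sketch covers.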
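Unresolved issue: before runtime, the NewCclCommMgr object is used only by InsertNcclLogicalOpPass, which I don't think strictly requires it. Creation of the NewCclCommMgr object could perhaps be deferred to a later point altogether, but that would touch a fairly wide range of code, and I'm concerned about other side effects.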