Alternatively, is there any way (or are there particular places that would need to be modified) to remove the dependency on deepspeed?
If you don't need model parallelism, the ZeRO optimizer, or similar techniques, the model constructed by sat can be used as an ordinary PyTorch module:
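    import torch
    from sat import AutoModel

    # Load the pretrained model; `args` holds the model's configuration.
    model, args = AutoModel.from_pretrained("bert-base-uncased")
    model = model.cuda()

    # A single three-token example, built by hand.
    inputs = {
        'input_ids': torch.LongTensor([[1, 2, 3]]).cuda(),
        'position_ids': torch.LongTensor([[0, 1, 2]]).cuda(),
        'token_type_ids': torch.LongTensor([[0, 0, 0]]).cuda(),
        'attention_mask': torch.LongTensor([[[[1]]]]).cuda(),
    }

    # Plain PyTorch forward and backward pass, no deepspeed engine involved.
    output = model(**inputs)[0]
    loss = output.sum()
    loss.backward()
    print(loss)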
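To make "an ordinary PyTorch module" concrete, here is a minimal sketch of a full training step driven by a stock `torch.optim` optimizer instead of a deepspeed engine. `TinyModel` and `train_step` are hypothetical names introduced for illustration; `TinyModel` is only a stand-in so the snippet runs without sat installed. With the real model you would presumably pass the `inputs` dict from the snippet above, but that has not been verified here.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the module returned by AutoModel.from_pretrained."""

    def __init__(self, vocab_size=100, hidden=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.proj = nn.Linear(hidden, hidden)

    def forward(self, input_ids):
        # Return a tuple so that `model(...)[0]` works as in the snippet above.
        return (self.proj(self.embed(input_ids)),)


def train_step(model, optimizer, input_ids):
    """One forward/backward/update cycle using only plain PyTorch."""
    optimizer.zero_grad()
    output = model(input_ids)[0]
    loss = output.sum()
    loss.backward()
    optimizer.step()
    return loss.item()


# Fall back to CPU so the sketch also runs on machines without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)

model = TinyModel().to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
input_ids = torch.tensor([[1, 2, 3]], device=device)

before = model.proj.weight.detach().clone()
loss = train_step(model, optimizer, input_ids)
```

The point of the sketch is that nothing in the loop is deepspeed-specific: the optimizer, `backward()`, and `step()` are all standard PyTorch calls.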
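Sorry to bother you, but I have a follow-up question. I want to place the model's parameters on different GPUs, for example the first layer on GPU 0 and another layer on GPU 1. Does SAT support this? A single card does not have enough memory, so the model has to be split across devices.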
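For reference, the placement described in this question can be sketched in plain PyTorch as follows. This is only an illustration of the general technique (manual layer-wise device placement) on a toy module; the thread does not say whether SAT offers this, and `TwoStageModel` and `pick_devices` are hypothetical names, not part of sat.

```python
import torch
from torch import nn


def pick_devices():
    """Use two GPUs when available; otherwise fall back to CPU for both stages."""
    if torch.cuda.device_count() >= 2:
        return torch.device("cuda:0"), torch.device("cuda:1")
    return torch.device("cpu"), torch.device("cpu")


class TwoStageModel(nn.Module):
    """Toy model whose two layers live on (possibly) different devices."""

    def __init__(self, dev0, dev1, hidden=16):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        self.stage0 = nn.Linear(hidden, hidden).to(dev0)
        self.stage1 = nn.Linear(hidden, hidden).to(dev1)

    def forward(self, x):
        # Move activations to each stage's device before applying that stage.
        x = torch.relu(self.stage0(x.to(self.dev0)))
        return self.stage1(x.to(self.dev1))


dev0, dev1 = pick_devices()
model = TwoStageModel(dev0, dev1)

out = model(torch.randn(4, 16))
out.sum().backward()  # autograd carries gradients back across the device boundary
```

Each stage's parameters occupy memory only on its own device, which is what lowers the per-card footprint. The stages still run one after another, so this saves memory but does not speed anything up.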