
MUSA: Use Monkey Patching to Automatically Convert CUDA Backend to MUSA #583

Status: Open. yeahdongcn wants to merge 1 commit into main from the musa-py branch.
Conversation

@yeahdongcn (Contributor) commented on Feb 21, 2025:

This PR introduces monkey patches to automatically convert the CUDA backend to the MUSA backend.

Key Updates

  1. util.torch_auto_backend.py
    • Added global variables CUDA and CUDA0 to replace the hardcoded "cuda" and "cuda:0" strings.
    • Implemented monkey patching to automatically convert the CUDA backend to MUSA (see the sketch after this list).
    • Added test cases for MUSA.
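
A minimal sketch of what such a patch can look like is shown below. This is illustrative only, not the PR's actual code: it assumes torch_musa mirrors the torch.cuda API under the torch.musa namespace, and the specific set of patched entry points is an assumption.

```python
# Illustrative sketch only (not the PR's code). Assumes torch_musa mirrors the
# torch.cuda API under torch.musa; the patched entry points below are examples.
import torch

CUDA, CUDA0 = "cuda", "cuda:0"  # fall back to the stock CUDA backend

try:
    import torch_musa  # registers the "musa" device type and the torch.musa namespace

    if torch.musa.is_available():
        CUDA, CUDA0 = "musa", "musa:0"
        # Redirect commonly used torch.cuda entry points to their MUSA counterparts,
        # so existing torch.cuda.* calls transparently target the MUSA backend.
        torch.cuda.is_available = torch.musa.is_available
        torch.cuda.device_count = torch.musa.device_count
        torch.cuda.current_device = torch.musa.current_device
        torch.cuda.set_device = torch.musa.set_device
        torch.cuda.synchronize = torch.musa.synchronize
        torch.cuda.empty_cache = torch.musa.empty_cache
except ImportError:
    pass  # torch_musa not installed: keep "cuda" / "cuda:0"

print(f"Torch backend loaded: CUDA={CUDA}, CUDA0={CUDA0}")
```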

Testing Done

  • make dev_install
  • python ./tests/torch_auto_backend_test.py, which produced the output below (a sketch of this kind of check follows this list):
    Torch backend loaded: CUDA=musa, CUDA0=musa:0
    musa musa:0
    tensor([1.2000, 2.3000], device='musa:0')
    tensor([1.2000, 2.3000], device='musa:0')
    True
    8
    <torch_musa.core.device.Device object at 0x7ff626246320>
    tensor([1.2000, 2.3000], device='musa:0')
    0
  • Ran inference with the following command (the output tokens still have some issues):
    numactl -N 1 -m 1 python ./ktransformers/local_chat.py --cpu_infer 33 \
      --model_path deepseek-ai/DeepSeek-R1 \
      --gguf_path /models/hub/models--unsloth--DeepSeek-R1-GGUF/snapshots/02bcc0a0f68146dae57942804d82bdf0cc636003/DeepSeek-R1-Q4_K_M \
      --optimize_rule_path /ws/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml
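
For reference, here is a hedged sketch of the kind of check the test script could perform, consistent with the output shown above. The import path ktransformers.util.torch_auto_backend and the exact calls are assumptions, not taken from the PR.

```python
# Hypothetical smoke test (not the PR's test file). Assumes the patched module
# exposes CUDA / CUDA0 and that torch_musa provides the torch.musa namespace.
import torch
import torch_musa  # noqa: F401  (registers the "musa" device type)

from ktransformers.util.torch_auto_backend import CUDA, CUDA0  # assumed import path

print(f"Torch backend loaded: CUDA={CUDA}, CUDA0={CUDA0}")  # e.g. CUDA=musa, CUDA0=musa:0

x = torch.tensor([1.2, 2.3], device=CUDA0)  # lands on musa:0 when MUSA is active
print(x)

print(torch.musa.is_available())    # True on MUSA hardware
print(torch.musa.device_count())    # e.g. 8
print(torch.musa.current_device())  # e.g. 0
```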

Review thread on the following diff:

- generate_device: str = "cuda",
- prefill_device: str = "cuda",
+ generate_device: str = CUDA,
+ prefill_device: str = CUDA,
Contributor commented:

prefill_device and generate_device do not need to be changed.
Change them in your custom YAML file instead. The same applies elsewhere.

@yeahdongcn (Contributor, author) replied on Feb 22, 2025:

I just wanted to ensure consistency throughout the code; otherwise both CUDA and "cuda" will be present.
Would you like me to revert this change?

@Atream (Contributor) commented on Feb 22, 2025:

Thank you for your contribution.
We will merge this after planning and testing a unified architecture that is compatible with various GPUs. Until then, please work from your own branch, and remember to merge the main branch frequently to stay in sync with us and get better performance.

yeahdongcn force-pushed the musa-py branch 2 times, most recently from 89435df to 4792b81 on February 25, 2025.