MPS support #48

Draft
wants to merge 7 commits into
base: main

Conversation

BlackSamorez
Collaborator

No description provided.

@BlackSamorez
Collaborator Author

Note to self: matmul doesn't work for more than one vector at a time for some reason
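One way to pin down this kind of failure is to compare a batched call against the same inputs fed one vector at a time. Below is a minimal sketch of such a check, not code from this PR: `reference_matmul`, `batched_matches_per_vector`, and `broken_matmul` are hypothetical names, and the pure-Python kernel is only a stand-in for whatever MPS kernel wrapper would actually be tested.

```python
from typing import Callable, List

Matrix = List[List[float]]
Kernel = Callable[[Matrix, Matrix], Matrix]


def reference_matmul(batch: Matrix, weight: Matrix) -> Matrix:
    """Plain-Python (n, k) @ (k, m) product; a stand-in for the real kernel."""
    inner, cols = len(weight), len(weight[0])
    return [
        [sum(row[i] * weight[i][j] for i in range(inner)) for j in range(cols)]
        for row in batch
    ]


def batched_matches_per_vector(
    matmul: Kernel, batch: Matrix, weight: Matrix, tol: float = 1e-6
) -> bool:
    """Return True if one batched call agrees with row-by-row calls."""
    batched = matmul(batch, weight)
    # Feed each input vector separately as a batch of size one.
    one_by_one = [matmul([row], weight)[0] for row in batch]
    if len(batched) != len(one_by_one):
        return False
    return all(
        len(got) == len(want)
        and all(abs(g - w) <= tol for g, w in zip(got, want))
        for got, want in zip(batched, one_by_one)
    )


def broken_matmul(batch: Matrix, weight: Matrix) -> Matrix:
    """Hypothetical buggy kernel: only the first input vector is correct."""
    out = reference_matmul(batch, weight)
    return [out[0]] + [[0.0] * len(out[0]) for _ in out[1:]]


if __name__ == "__main__":
    weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    batch = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    print(batched_matches_per_vector(reference_matmul, batch, weight))
    print(batched_matches_per_vector(broken_matmul, batch, weight))
```

If the check fails only for batches larger than one, looping over the vectors could serve as a temporary workaround, assuming the single-vector path is trustworthy.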

@BuildBackBuehler
Contributor

Yessss!!! I'm excited about this. I was just looking into what the steps would entail. I'm too green (not enough experience/knowledge) to know the ins and outs of building from source or the granular details à la C++, but I'm wondering how much could be lifted from other MPS backends.

TVM has MPS support and its own matmul definition. I'd also suggest using as little Torch as possible, since it tends to be slow. TVM has its own nn, ops, and most of the important operators, like 1D, 2D, and 3D convolutions.

Have you been able to install aqlm[gpu,cpu] or just aqlm[cpu]? Right now Triton's extra module "GPU" is holding me back from the whole shebang. So I was figuring that, if I can, I'd work on building out a Triton MPS backend. Triton has support for adding your own backend, but once again, I have no clue in hell what I'm doing.

Personally, depending on what is gained/lost, I'll probably just use the aqlm models with MLC-LLM, but that also requires some additional mods to get it working.

2 participants