
ARM github workflow #66

Closed
wants to merge 0 commits into from

Conversation

danikhan632
Contributor

I have been working on triton-shared on ubuntu-arm64, so I wanted to add a workflow to the repo that runs the ARM variant, since potential ARM-specific optimizations can be made.

A few issues with this current PR that would have to be addressed:

  1. GitHub doesn't have an arm64 workflow runner, so that's an issue. In the integration test I'm self-hosting, but it would be better if this was hosted.
     This is going to be an issue for where this job will run:

     build_and_test_triton_shared_arm:
       runs-on: [self-hosted] # going to need a self-hosted ubuntu-arm64 action runner

       steps:
       - name: Force Failure
       ...
  2. I'm currently hosting the llvm tar bundle on my own GCP bucket. It's probably a good idea that I don't self-host the llvm tarball; it should probably be hosted by someone else.

     rev = "49af6502"
     name = f"llvm-{rev}-{system_suffix}"
     if system_suffix == 'ubuntu-arm64':  # this is probably not a good idea to merge, but it does work
         url = "https://storage.googleapis.com/compiled-blob/llvm-49af6502-ubuntu-arm64.tar.gz"
     else:
         url = f"https://tritonlang.blob.core.windows.net/llvm-builds/{name}.tar.gz"
     return Package("llvm", name, url, "LLVM_INCLUDE_DIRS", "LLVM_LIBRARY_DIR", "LLVM_SYSPATH")
  3. I had to do a bit of an odd workaround to change the setup.py for Triton. This is really hacky on my part, but it seemed like a temporary solution for the runner until the LLVM binaries are built and integrated into setup.py.

     - name: Build/Install Triton
       run: |
         export TRITON_CODEGEN_TRITON_SHARED=1
         cd triton/python
         python3 -m pip install --upgrade pip
         python3 -m pip install cmake==3.24
         python3 -m pip install ninja
         python3 -m pip uninstall -y triton
         rm setup.py
         mv ../third_party/triton_shared/.github/arm-setup.py ./setup.py
         python3 setup.py build

I'm working on utilizing Arm MLIR dialects for faster GEMMs, so it would be appreciated if this could be integrated in some form.

@manbearian
Collaborator

@danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.

@danikhan632
Contributor Author

> @danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.

Thank you so much. Also wanted to ask whether pursuing Arm Neon in matmul and other arithmetic operations is a good idea; I was hoping to add this in for the optimization passes for Triton CPU.

@manbearian
Collaborator

> > @danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.
>
> Thank you so much. Also wanted to ask whether pursuing Arm Neon in matmul and other arithmetic operations is a good idea; I was hoping to add this in for the optimization passes for Triton CPU.

On this, I believe one of my colleagues has actually been looking into this with ARM. He offered to sync with you on this, so stand by.

@manbearian
Collaborator

manbearian commented Nov 29, 2023

> @danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.

@danikhan632 ,

First, ARM-hosted LLVM:

Please take a look at triton/.github/workflows/llvm-build.yml at main · openai/triton. If you add ARM64 for Ubuntu to the builds here, then a small change to the default setup.py script should get things working. Do you mind giving that a try? I talked to Phil earlier today and he was on board with adding ARM support if we can make it work.

Second, regarding ARM64 runners:

I'm working on getting some ARM64 VMs set up under my team's Azure account and will make them available to this triton-shared GitHub project when they're ready. Not sure how long this will take, but hopefully not more than a few days.

@danikhan632
Contributor Author

> > @danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.
>
> @danikhan632 ,
>
> First, ARM-hosted LLVM:
>
> Please take a look at triton/.github/workflows/llvm-build.yml at main · openai/triton. If you add ARM64 for Ubuntu to the builds here, then a small change to the default setup.py script should get things working. Do you mind giving that a try? I talked to Phil earlier today and he was on board with adding ARM support if we can make it work.
>
> Second, regarding ARM64 runners:
>
> I'm working on getting some ARM64 VMs set up under my team's Azure account and will make them available to this triton-shared GitHub project when they're ready. Not sure how long this will take, but hopefully not more than a few days.

Will give it a shot and submit a PR; changing the LLVM build workflow to add ARM64 should be pretty easy. The setup.py change should be even easier. Will have this PR done pretty quickly. Thanks for the workflow runners; looking forward to those.

@danikhan632
Contributor Author

> > > @danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.
> >
> > Thank you so much. Also wanted to ask whether pursuing Arm Neon in matmul and other arithmetic operations is a good idea; I was hoping to add this in for the optimization passes for Triton CPU.
>
> On this, I believe one of my colleagues has actually been looking into this with ARM. He offered to sync with you on this, so stand by.

Would love to sync up on this and hear more from your colleagues.

@danikhan632
Contributor Author


> > > @danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.
> >
> > @danikhan632 ,
> > First, ARM-hosted LLVM:
> > Please take a look at triton/.github/workflows/llvm-build.yml at main · openai/triton. If you add ARM64 for Ubuntu to the builds here, then a small change to the default setup.py script should get things working. Do you mind giving that a try? I talked to Phil earlier today and he was on board with adding ARM support if we can make it work.
> > Second, regarding ARM64 runners:
> > I'm working on getting some ARM64 VMs set up under my team's Azure account and will make them available to this triton-shared GitHub project when they're ready. Not sure how long this will take, but hopefully not more than a few days.
>
> Will give it a shot and submit a PR; changing the LLVM build workflow to add ARM64 should be pretty easy. The setup.py change should be even easier. Will have this PR done pretty quickly. Thanks for the workflow runners; looking forward to those.

Also wanted to ask: is cross-compiling arm64 for the LLVM workflow a good idea, or should the workflow runner just run on Ubuntu arm64 to avoid any cross-compiling issues?

      matrix:
        config:
        - {runner: 'Ubuntu 20.04', runs_on: 'ubuntu-20.04', target-os: 'ubuntu', arch: 'x64'}
        - {runner: 'Ubuntu 20.04', runs_on: 'ubuntu-20.04', target-os: 'ubuntu', arch: 'arm64'} #should work
        - {runner: 'CentOS 7', runs_on: ['self-hosted', 'CPU'], target-os: 'centos', arch: 'x64'}
        - {runner: 'MacOS X64', runs_on: 'macos-12', target-os: 'macos', arch: 'x64'}
        - {runner: 'MacOS ARM64', runs_on: 'macos-12', target-os: 'macos', arch: 'arm64'}

@manbearian
Collaborator

> > > > @danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.
> > >
> > > @danikhan632 ,
> > > First, ARM-hosted LLVM:
> > > Please take a look at triton/.github/workflows/llvm-build.yml at main · openai/triton. If you add ARM64 for Ubuntu to the builds here, then a small change to the default setup.py script should get things working. Do you mind giving that a try? I talked to Phil earlier today and he was on board with adding ARM support if we can make it work.
> > > Second, regarding ARM64 runners:
> > > I'm working on getting some ARM64 VMs set up under my team's Azure account and will make them available to this triton-shared GitHub project when they're ready. Not sure how long this will take, but hopefully not more than a few days.
> >
> > Will give it a shot and submit a PR; changing the LLVM build workflow to add ARM64 should be pretty easy. The setup.py change should be even easier. Will have this PR done pretty quickly. Thanks for the workflow runners; looking forward to those.
>
> Also wanted to ask: is cross-compiling arm64 for the LLVM workflow a good idea, or should the workflow runner just run on Ubuntu arm64 to avoid any cross-compiling issues?
>
>       matrix:
>         config:
>         - {runner: 'Ubuntu 20.04', runs_on: 'ubuntu-20.04', target-os: 'ubuntu', arch: 'x64'}
>         - {runner: 'Ubuntu 20.04', runs_on: 'ubuntu-20.04', target-os: 'ubuntu', arch: 'arm64'} # should work
>         - {runner: 'CentOS 7', runs_on: ['self-hosted', 'CPU'], target-os: 'centos', arch: 'x64'}
>         - {runner: 'MacOS X64', runs_on: 'macos-12', target-os: 'macos', arch: 'x64'}
>         - {runner: 'MacOS ARM64', runs_on: 'macos-12', target-os: 'macos', arch: 'arm64'}

I believe cross-compiling is the way to go, as that's what we're doing on macOS from what I could figure out.
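
For reference, a rough sketch of what a cross-compiled arm64 Ubuntu build step could look like; the step name, matrix fields, package names, and CMake flags here are illustrative assumptions rather than the actual llvm-build.yml contents:

      # Illustrative only: cross-compiling LLVM for arm64 on an x64 Ubuntu runner.
      - name: Configure and build LLVM (Ubuntu arm64 cross-compile)
        if: matrix.config.target-os == 'ubuntu' && matrix.config.arch == 'arm64'
        run: |
          sudo apt-get update
          sudo apt-get install -y gcc-aarch64-linux-gnu g++-aarch64-linux-gnu
          cmake -G Ninja -S llvm-project/llvm -B build \
            -DCMAKE_BUILD_TYPE=Release \
            -DCMAKE_SYSTEM_NAME=Linux \
            -DCMAKE_SYSTEM_PROCESSOR=aarch64 \
            -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
            -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \
            -DLLVM_HOST_TRIPLE=aarch64-linux-gnu \
            -DLLVM_USE_HOST_TOOLS=ON \
            -DLLVM_ENABLE_PROJECTS="mlir" \
            -DLLVM_TARGETS_TO_BUILD="AArch64"
          ninja -C build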

@manbearian
Collaborator

> I believe cross-compiling is the way to go, as that's what we're doing on macOS from what I could figure out.

Also, I think you'll need changes below, as the following steps are conditional and I believe ubuntu-arm64 won't match any of them as-is.
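
For example (illustrative only; the real step names and conditions in llvm-build.yml may differ), the existing Ubuntu steps are gated on conditions that an arm64 entry would fall through, so a widened or additional condition would be needed:

      # Illustrative only: a condition keyed to x64 won't match an arm64 matrix entry.
      - name: Configure, build, and install LLVM (Ubuntu x64)
        if: matrix.config.target-os == 'ubuntu' && matrix.config.arch == 'x64'
        run: echo "existing x64 build steps"

      # A new (or widened) condition would be needed for the arm64 case:
      - name: Configure, build, and install LLVM (Ubuntu arm64)
        if: matrix.config.target-os == 'ubuntu' && matrix.config.arch == 'arm64'
        run: echo "arm64 build steps (see the cross-compile sketch above)"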

@danikhan632
Contributor Author

> > I believe cross-compiling is the way to go, as that's what we're doing on macOS from what I could figure out.
>
> Also, I think you'll need changes below, as the following steps are conditional and I believe ubuntu-arm64 won't match any of them as-is.

triton-lang/triton#2726

@NathanielMcVicar

Hi @danikhan632 , this is great to see! Please take a look at #71 and see if you can get your pipelines working on that pool. Once you do, feel free to close that PR (@manbearian will have to manually delete the workflow it created I believe, but for now it's worth keeping up for testing).

@danikhan632
Contributor Author

danikhan632 commented Dec 1, 2023

> Hi @danikhan632 , this is great to see! Please take a look at #71 and see if you can get your pipelines working on that pool. Once you do, feel free to close that PR (@manbearian will have to manually delete the workflow it created I believe, but for now it's worth keeping up for testing).

Just committed again with the 1ES ARM workflow runner.
(EDIT) Also just fixed that mistake in the YAML, so it might need a re-run.

@danikhan632
Contributor Author

> > > @danikhan632 this is really amazing to see. Nice work! I'll work with my folks to figure out how we can support ARM in our workflows.
> >
> > Thank you so much. Also wanted to ask whether pursuing Arm Neon in matmul and other arithmetic operations is a good idea; I was hoping to add this in for the optimization passes for Triton CPU.
>
> On this, I believe one of my colleagues has actually been looking into this with ARM. He offered to sync with you on this, so stand by.

Hey, hope you had a good weekend. Just wanted to ask if there is a way to trigger the workflows manually; it's just a bit tedious. Also, I have some very rudimentary code for lowering linalg.matmul to ArmSVE/Arm dialects and wanted to get in contact with that colleague you mentioned.

@manbearian
Collaborator

> Hey, hope you had a good weekend. Just wanted to ask if there is a way to trigger the workflows manually; it's just a bit tedious. Also, I have some very rudimentary code for lowering linalg.matmul to ArmSVE/Arm dialects and wanted to get in contact with that colleague you mentioned.

Hi @danikhan632 ,

First, there's a convergence of two things impacting the MS team's bandwidth right now: December vacations plus a big internal presentation this week. So please bear with us.

Second, I'm sorry about the annoyance around running the workflow. Since the PR is updating the workflow, extra permissions are required to run it after you update it. How much are you changing the workflow with each submission? Can we get a version checked in that you can use to test?

@danikhan632
Contributor Author

> > Hey, hope you had a good weekend. Just wanted to ask if there is a way to trigger the workflows manually; it's just a bit tedious. Also, I have some very rudimentary code for lowering linalg.matmul to ArmSVE/Arm dialects and wanted to get in contact with that colleague you mentioned.
>
> Hi @danikhan632 ,
>
> First, there's a convergence of two things impacting the MS team's bandwidth right now: December vacations plus a big internal presentation this week. So please bear with us.
>
> Second, I'm sorry about the annoyance around running the workflow. Since the PR is updating the workflow, extra permissions are required to run it after you update it. How much are you changing the workflow with each submission? Can we get a version checked in that you can use to test?

No worries, everyone is out for the holidays, so it's understandable. Right now the workflow kept failing because the self-hosted runner that was set up doesn't have pip installed, and I created a bit of an issue. Not sure about the version check-in part; it's up to date with the current build.

  run: |
    rm -rf ~/.triton

- name: install pip
Collaborator

I presume this is an ARM thing... I'll follow up to see why this is missing.

Collaborator

@NathanielMcVicar following up :)
How does this look to you?


I don't think it's an ARM thing; it's that the self-hosted runners come with a bunch of software pre-installed that's not in the base image, so having to install things as part of the run when using the Marketplace image is expected. We could also use Docker if the install time got out of hand, but for now I think just installing it in the yml (or a setup .sh script) is the best approach.

Contributor Author

I think since the runner is self-hosted, pip perhaps wasn't installed, so I added that step; I was probably going to delete it after it ran once.


These runners aren't self-hosted in the sense you mean here. They are hosted by the Microsoft 1ES infrastructure (see https://github.com/apps/1es-resource-management) and work mostly just like the GitHub-hosted runners. In particular, they are stateless, so anything like installing pip you will have to do on every run.

Contributor Author
https://github.com/microsoft/triton-shared/actions/runs/7067334608/job/19308128182#step:10:91

Looks like it doesn't have a C++ compiler either; is it just build-essential that needs to be installed?
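
A minimal sketch of the kind of setup step this would need on a stateless runner image; the step name and package list are assumptions based on the missing pip and C++ compiler noted above:

      # Illustrative setup step; packages assumed from the failures noted above.
      - name: Install build prerequisites
        run: |
          sudo apt-get update
          sudo apt-get install -y python3-pip build-essential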


@@ -5,11 +5,21 @@ on:
    branches: [ "main" ]
  push:
    branches: [ "main" ]
  workflow_dispatch:
Collaborator

I think the intent is to have test-plugin.yml be the dispatch point... what's this change for?

Contributor Author

I think @NathanielMcVicar might have suggested doing that so I could manually trigger the workflows (#71), but I could have misconstrued that.


That's right, but since it seems like you may not have the permissions, it probably doesn't help much. Regardless, it seems worth allowing for people making changes to these in general (or wanting to test in their own forks, where the permission issues won't exist).

Collaborator

My point is that this script just dispatches to the other script, so why not just use that one?

python3 -m pip install cmake==3.24
python3 -m pip install ninja
python3 -m pip uninstall -y triton
rm setup.py
Collaborator

Is the plan to get setup.py updated to handle ARM?

Contributor Author

This was a workaround for when my Triton didn't support ARM; it can be removed once this repo is up to date with the latest build of Triton.

Collaborator

manbearian left a comment

I left a lot of comments. I'm mostly just trying to learn things for now.




build_and_test_triton_shared_arm:
Collaborator

Once you have this nailed down, can we refactor and reuse the same steps as above?

Contributor Author

Absolutely, this is a pretty hacky workaround, but I believe with LLVM build 66886578 all of this should be easy to get rid of. Also, I've been working on lowering linalg.matmul to ArmSVE, so I've really been enjoying some of that work.

@aaronsm
Contributor

aaronsm commented Jan 12, 2024

Great to see some Arm64 builds :) Here's the workflow I'm using right now.

On macOS, I follow the instructions on the Triton GitHub for "Install from source" and comment out the X86, NVPTX, and AMDGPU libs in triton/CMakeLists.txt (triton-lang/triton#2922).

For Ubuntu 20.04 and 22.04, I do the same as on macOS and also change triton/python/setup.py to use a native Arm64 build of LLVM (triton-lang/triton#2921).

@danikhan632 shoot me an email aasmith@microsoft to talk more about Arm.

Contributor

aaronsm left a comment

Trying to do some cleanup on this PR and run it :)
