
use end-to-end DGL scripts run featGraph #13

Open
Ed-gong opened this issue May 23, 2022 · 17 comments

Ed-gong commented May 23, 2022

Hi, I want to run FeatGraph end-to-end.
I have already built DGL (with FeatGraph) and run the test.py file successfully using the instructions posted at https://github.com/dmlc/dgl/tree/master/featgraph.

  • If I want to run end-to-end GCN training on the Pubmed or Reddit dataset, can I just use the DGL GCN benchmark script I already have, without changing any kernel names? In other words, which parts of the DGL Python script do I need to change so that I can run FeatGraph (not DGL) end-to-end? Thank you.
yzh119 commented May 23, 2022

You might check out this branch of DGL:

https://github.com/kira-lin/dgl/tree/tvm_integration

Ed-gong commented May 24, 2022

Thanks for your reply. I just clarified my question by re-editing the post above. Can you respond again? Thank you.

Ed-gong commented Jun 1, 2022

I used the DGL test scripts to run GCN on the PubMed and Cora datasets with one extra line of code: dgl.sparse._CAPI_FG_LoadModule("../build/featgraph/libfeatgraph_kernels.so"). The Python script runs fine without any error, but the training time with FeatGraph is the same as with DGL. It seems like FeatGraph does not improve training time at all.
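
For reference, the only change to the script was the load call; everything else is the standard DGL GCN example. A minimal sketch (the library path is relative to my build directory):

import dgl
import dgl.sparse

# load the FeatGraph kernel library built under dgl/build/featgraph
dgl.sparse._CAPI_FG_LoadModule("../build/featgraph/libfeatgraph_kernels.so")

# ... the rest (dataset loading, GraphConv model, optimizer, training loop)
# is the unmodified DGL GCN example ...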

yzh119 commented Jun 1, 2022

I don't think FeatGraph has better performance than cuSPARSE for GCN on GPU (see Table IV in the paper). Since DGL uses cuSPARSE, it's normal that you don't observe any acceleration here.

Ed-gong commented Jun 2, 2022

Thank you very much for your response. I am closing this issue.

Ed-gong closed this as completed Jun 2, 2022
yzh119 commented Jun 7, 2022

Sorry, I just noticed that you were using dgl.sparse._CAPI_FG_LoadModule("../build/featgraph/libfeatgraph_kernels.so") to use FeatGraph as the backend. That integration was actually abandoned because TVM does not have native sparse support and we might encounter several issues when using it in production, so in most cases you will still be using DGL's native backend even if you load the module.

Only the branch I mentioned (https://github.com/kira-lin/dgl/tree/tvm_integration) contains the complete code that uses the FeatGraph backend. Regarding the question in #14, yes, GAT is also supported (it was mentioned in the paper), and you can use it by compiling the tvm_integration branch.

yzh119 commented Jun 7, 2022

If you are interested in native sparse support in TVM, our work is coming soon; please stay tuned.

Ed-gong reopened this Jun 9, 2022
Ed-gong commented Jun 10, 2022

Hi, thank you for the kind response. For the branch https://github.com/kira-lin/dgl/tree/tvm_integration, if I want to use the FeatGraph backend, what specific Python code do I need to write? For example, if I only write dgl.sparse._CAPI_FG_LoadModule("../build/featgraph/libfeatgraph_kernels.so"), will the FeatGraph backend be used automatically? If not, which Python code do I need so that I can use the FeatGraph GCN and GAT backends?

The README file at https://github.com/kira-lin/dgl/tree/tvm_integration/featgraph only shows how to run test.py to verify correctness. However, test.py only contains one test-case kernel, dgl.sparse._CAPI_FG_SDDMMTreeReduction(gidx, u, v, e), for the SDDMM kernels. It is a little hard for me to figure out how to run the other FeatGraph kernel backends. Could you provide more detailed instructions about which Python code I need to write so that I can use the FeatGraph GCN and GAT backend kernels? Thank you.

Ed-gong commented Jun 13, 2022

These are the steps we followed:

(base) ygong07@mira0:~/dgl_src/dgl_tvm/dgl/featgraph$ git branch
  master
* tvm_integration
(base) ygong07@mira0:~/dgl_src/dgl_tvm/dgl/build$ pwd
/home/ygong07/dgl_src/dgl_tvm/dgl/build
(base) ygong07@mira0:~/dgl_src/dgl_tvm/dgl/build$ cmake -DUSE_CUDA=ON -DUSE_TVM=ON ..
-- Start configuring project dgl
-- Build with CUDA support
-- Found CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.2
-- Found CUDA_CUDART_LIBRARY=/usr/local/cuda-11.2/lib64/libcudart.so
-- Found CUDA_CUBLAS_LIBRARY=/usr/lib/x86_64-linux-gnu/libcublas.so
-- Found OpenMP_C: -fopenmp  
-- Found OpenMP_CXX: -fopenmp  
-- -fopenmp -O2 -Wall -fPIC -std=c++11  -DUSE_AVX -DIDXTYPEWIDTH=64 -DREALTYPEWIDTH=32
-- Running GPU architecture autodetection
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
-- Found CUDA arch 8.0
-- CUDA flags: -Xcompiler ,-fopenmp,-O2,-Wall,-fPIC,,,-DUSE_AVX,-DIDXTYPEWIDTH=64,-DREALTYPEWIDTH=32;-gencode;arch=compute_80,code=sm_80;--expt-extended-lambda;-Wno-deprecated-declarations;-std=c++14
-- Found OpenMP_C: -fopenmp  
-- Found OpenMP_CXX: -fopenmp  
-- /home/ygong07/dgl_src/dgl_tvm/dgl/third_party/dmlc-core/cmake/build_config.h.in -> include/dmlc/build_config.h
-- Start configuring project featgraph
-- Found CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.2
-- Found CUDA_CUDART_LIBRARY=/usr/local/cuda-11.2/lib64/libcudart.so
-- Found CUDA_CUBLAS_LIBRARY=/usr/lib/x86_64-linux-gnu/libcublas.so
-- /usr/local/cuda-11.2/include
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ygong07/dgl_src/dgl_tvm/dgl/build

(base) ygong07@mira0:~/dgl_src/dgl_tvm/dgl/build$ make -j4
[  1%] Creating featgraph kernels...
[  6%] Built target dmlc
[ 34%] Built target metis
/home/ygong07/tvm/python/tvm/driver/build_module.py:242: UserWarning: target_host parameter is going to be deprecated. Please pass in tvm.target.Target(target, host=target_host) instead.
  warnings.warn(
[ 34%] Built target featgraph_kernel
[ 35%] Built target featgraph_runtime
[ 35%] Linking CXX shared library libdgl.so
[100%] Built target dgl

(base) ygong07@mira0:~/dgl_src/dgl_tvm/dgl/featgraph$ python3 test.py 
Using backend: pytorch
tensor([[[1.5832],
         [1.8842]],

        [[1.1876],
         [2.5858]],

        [[1.5149],
         [0.9924]],
         ...
[[2.2963],
         [1.3279]],

        [[1.7643],
         [1.2339]],

        [[2.3274],
         [1.7878]]], device='cuda:0')

[[[1.5831739]
  [1.8842214]]

 [[1.1875974]
  [2.5857563]]

 [[1.5148897]
  [0.9924001]]
....
[[2.2962904]
  [1.3278971]]

 [[1.7643319]
  [1.233911 ]]

 [[2.3274217]
  [1.7877729]]]

  • We ran the GCN and GAT scripts using dgl.sparse._CAPI_FG_LoadModule("/home/ygong07/dgl_src/dgl_tvm/dgl/build/featgraph/libfeatgraph_kernels.so").
  • The training times are the same as the DGL training times.
  • Please let us know if you see any issues, as these numbers will be reported in a research paper.

Thank you very much for your help.

yzh119 commented Jun 20, 2022

Oh sorry, what I meant is the tvm-kernel branch (https://github.com/kira-lin/dgl/tree/tvm-kernel).

Ed-gong commented Jun 23, 2022

Hi, the tvm-kernel branch you mentioned does not include the 'featgraph' folder, so I am not sure how to compile it specifically for FeatGraph or how to verify whether FeatGraph is installed correctly. Could you provide more instructions? Thank you.

yzh119 commented Jun 27, 2022

The tvm-kernel branch is fully Python based, and the featgraph kernels are triggered when you set the environment variable DGLENGINE to tvm.

See https://github.com/kira-lin/dgl/blob/tvm-kernel/python/dgl/sparse.py#L13-L16
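
Roughly, that means setting the variable before dgl is imported, since sparse.py reads it at import time. A minimal sketch (assuming the tvm-kernel branch and a compatible TVM are installed):

import os
os.environ['DGLENGINE'] = 'tvm'  # must be set before importing dgl

import dgl  # dgl.sparse now routes gspmm/gsddmm through the TVM-generated kernels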

yzh119 commented Jun 27, 2022

Btw, I don't think you should expect a speedup from featgraph over DGL 0.8, because most of the optimized kernels have already been merged into DGL.

Ed-gong commented Jul 7, 2022

13 use_tvm = True if 'DGLENGINE' in os.environ and os.getenv('DGLENGINE') == 'tvm' else False
14 if use_tvm:
15     import tvm
16     from .tvm import gsddmm, gspmm

Based on line 13, we made sure use_tvm is True; unfortunately, it crashes. When use_tvm is False it does run, but I suspect it is then calling the DGL kernels.

We are still interested in running FeatGraph end-to-end. Do let us know if there are any other instructions.

yzh119 commented Jul 10, 2022

Would you mind sharing the error message so that we can debug why it crashes?

Ed-gong commented Jul 23, 2022

Here is the error I got:


(base) ygong07@mira0:~/compare_graphPy/GraphPy_GPU/build$ python3 GCN_pubmed_dgl.py
Using backend: pytorch
use_tvm True
Output of Read function is 
/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/base.py:45: DGLWarning: Recommend creating graphs by `dgl.graph(data)` instead of `dgl.DGLGraph(data)`.
  return warnings.warn(message, category=category, stacklevel=1)
graph creation time is: 0:00:00.029156
Traceback (most recent call last):
  File "GCN_pubmed_dgl.py", line 244, in <module>
    logits = net(graph, feature)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "GCN_pubmed_dgl.py", line 193, in forward
    h = self.conv1(g, inputs)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/nn/pytorch/conv/graphconv.py", line 269, in forward
    graph.update_all(fn.copy_src(src='h', out='m'),
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/heterograph.py", line 4499, in update_all
    ndata = core.message_passing(g, message_func, reduce_func, apply_node_func)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/core.py", line 283, in message_passing
    ndata = invoke_gspmm(g, mfunc, rfunc)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/core.py", line 255, in invoke_gspmm
    z = op(graph, x)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/ops/spmm.py", line 171, in func
    return gspmm(g, 'copy_lhs', reduce_op, x, None)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/ops/spmm.py", line 62, in gspmm
    ret = gspmm_internal(g._graph, op,
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/backend/pytorch/sparse.py", line 235, in gspmm
    return GSpMM.apply(gidx, op, reduce_op, lhs_data, rhs_data)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/backend/pytorch/sparse.py", line 64, in forward
    out, (argX, argY) = _gspmm(gidx, op, reduce_op, X, Y)
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/sparse.py", line 87, in _gspmm
    return _gspmm_tvm(gidx, op, reduce_op, u, e) if use_tvm \
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/sparse.py", line 373, in _gspmm_tvm
    mod = gspmm.spmm(
  File "/home/ygong07/anaconda3/lib/python3.8/site-packages/dgl-0.6-py3.8-linux-x86_64.egg/dgl/tvm/gspmm.py", line 301, in spmm
    if topi.util.get_const_int(topi.util.prod(out.shape[1:])) < 16:
AttributeError: module 'tvm.topi' has no attribute 'util'

yzh119 commented Jul 24, 2022

This is due to the TVM version; you should use TVM 0.7.
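
As a quick check of the environment, a minimal sketch (assuming the problem is that later TVM releases renamed topi.util to topi.utils, which is why dgl/tvm/gspmm.py fails with the AttributeError above):

import tvm
import tvm.topi as topi

print(tvm.__version__)  # expect something like '0.7.0'
# the tvm-kernel branch calls topi.util.get_const_int / topi.util.prod,
# which only exist under this name in TVM 0.7
assert hasattr(topi, 'util'), "this TVM build is too new for the tvm-kernel branch"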
