how to train or ft? #23

dongjicheng · 2024-06-18T16:57:40Z

No description provided.

dongjicheng · 2024-06-18T17:04:44Z

0%| | 0/1250 [00:00<?, ?it/s]
loc("/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/mmfreelm-0.1-py3.10.egg/mmfreelm/ops/hgrn/recurrent_fuse.py":105:22): error: 'arith.addf' op requires the same encoding for all operands and results
Traceback (most recent call last):
File "/mnt/jicheng/uniem-main/mmfree/match_entity_number_mmfree.py", line 325, in <module>
loss.backward()
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/torch/autograd/__init__.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/torch/autograd/function.py", line 288, in apply
return user_fn(self, *args)
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/mmfreelm-0.1-py3.10.egg/mmfreelm/utils.py", line 9, in wrapper
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/mmfreelm-0.1-py3.10.egg/mmfreelm/ops/hgrn/recurrent_fuse.py", line 167, in backward
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 100, in run
timings = {config: self._bench(*args, config=config, **kwargs)
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 100, in <dictcomp>
timings = {config: self._bench(*args, config=config, **kwargs)
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 83, in _bench
return do_bench(kernel_call, warmup=self.warmup, rep=self.rep, quantiles=(0.5, 0.2, 0.8))
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/triton/testing.py", line 104, in do_bench
fn()
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 81, in kernel_call
self.fn.run(*args, num_warps=config.num_warps, num_stages=config.num_stages, **current)
File "<string>", line 63, in fused_recurrent_hgrn_bwd_kernel
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/triton/compiler/compiler.py", line 476, in compile
next_module = compile_kernel(module)
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/triton/compiler/compiler.py", line 383, in <lambda>
lambda src: optimize_ttgir(ttir_to_ttgir(src, num_warps), num_stages, arch))
File "/mnt/anaconda3/envs/tf2/lib/python3.10/site-packages/triton/compiler/compiler.py", line 91, in optimize_ttgir
pm.run(mod)
RuntimeError: PassManager::run failed
0%|

ridgerchu · 2024-06-18T19:31:07Z

Hi, it looks like the Triton compilation step failed. Are you running this on a CUDA device?
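
For anyone hitting the same error, a quick way to answer that question is to print the installed versions and the visible GPU. This is only a minimal sketch, assuming a standard PyTorch install; `describe_environment` is a hypothetical helper name and is not part of mmfreelm.

```python
import importlib
import platform


def describe_environment():
    """Collect versions and GPU info relevant to a Triton compile failure."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "triton": None,
        "cuda_available": False,
        "device": None,
        "capability": None,
    }

    modules = {}
    for name in ("torch", "triton"):
        try:
            modules[name] = importlib.import_module(name)
            info[name] = getattr(modules[name], "__version__", "unknown")
        except Exception:
            # Package is missing or failed to import; leave the entry as None.
            pass

    torch = modules.get("torch")
    if torch is not None:
        info["cuda_available"] = bool(torch.cuda.is_available())
        if info["cuda_available"]:
            # Device 0 is an assumption; adjust if training runs on another GPU.
            info["device"] = torch.cuda.get_device_name(0)
            info["capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Posting this output together with the traceback makes it easier to tell whether the failure comes from the device or from the installed Triton version.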

hsb1995 · 2024-06-27T08:58:39Z

@dongjicheng @ridgerchu Which Python file should be run first, and what is this parameter set to? Could you please let me know?
When I look at the code, all I see is the built-in module plus a "setup" file and a "generate" file, and neither of those two files works for me. Since you are asking about fine-tuning and pre-training, I thought I would ask you.

hsb1995 · 2024-06-27T09:04:02Z

@ridgerchu This project is highly relevant to my research topic and I would like to reproduce it, so I would appreciate as much guidance as you can give.
