scatter_add #64

Open
ghost opened this issue May 1, 2020 · 10 comments

@ghost

ghost commented May 1, 2020

In the scatter_add function, the "source" parameter is not working; "src" should be used instead of "source".

@kailashkarthik9

It depends on what version of PyTorch you are using. According to the Readme, if you use 0.4.0, the scatter_add method works fine.

But unfortunately, 0.4.0 is not supported on the new GPUs and thus this change is needed.

I personally use GCP and the code was working fine on a K80 GPU. Recently, I shifted to a T4 GPU because of resource availability issues and encountered the same error as you when I shifted to a more recent version of PyTorch.

@segsev

segsev commented May 15, 2020

Hi @kailashkarthik9, did you try decoding summaries? I have trained my own model, but I am having difficulty decoding the summaries.
I am running the command below to evaluate the full model.
python eval_full_model.py --meteor --decode_dir='/home/ajay/Desktop/new/cnn-dailymail/finished_files/decoded_files/test'

Error: Invalid or corrupt jarfile /home/ajay/Desktop/meteor
Traceback (most recent call last):
  File "eval_full_model.py", line 53, in <module>
    main(args)
  File "eval_full_model.py", line 31, in main
    output = eval_meteor(dec_pattern, dec_dir, ref_pattern, ref_dir)
  File "/home/ajay/Desktop/new/fast_abs_rl/evaluate.py", line 70, in eval_meteor
    output = sp.check_output(cmd.split(' '), universal_newlines=True)
  File "/home/ajay/anaconda3/envs/torch1/lib/python3.5/subprocess.py", line 316, in check_output
    **kwargs).stdout
  File "/home/ajay/anaconda3/envs/torch1/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['java', '-Xmx2G', '-jar', '/home/ajay/Desktop/meteor', '/tmp/tmpc_n8dcga/dec.txt', '/tmp/tmpc_n8dcga/ref.txt', '-l', 'en', '-norm']' returned non-zero exit status 1.

I don't know what might have gone wrong; any help would be appreciated.
Although it reports an invalid or corrupt jarfile, I have tried different jar files and it still shows the same error.

@ghost
Author

ghost commented May 15, 2020

@segsev did you give the path to meteor-1.5.jar in the METEOR environment variable? From your error it looks like you gave the whole folder as the path. Try giving the path to just the jar file, named "meteor-1.5.jar".
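
A quick way to check this before running the evaluation is sketched below. It assumes the jar path is read from the METEOR environment variable, as described above; the helper name `check_meteor_jar` is hypothetical and not part of the repository. Because a jar is a zip archive, `zipfile.is_zipfile` can flag a truncated or mis-downloaded file, although it cannot detect a Java version mismatch.

```python
import os
import zipfile


def check_meteor_jar(path):
    """Return a short diagnosis for a candidate METEOR jar path (hypothetical helper)."""
    if not path:
        return "METEOR is not set"
    if os.path.isdir(path):
        return "path is a directory; point METEOR at meteor-1.5.jar itself"
    if not os.path.isfile(path):
        return "no file at this path"
    # A jar is a zip archive, so a file that fails this check is not a usable jar.
    if not zipfile.is_zipfile(path):
        return "file exists but is not a valid zip/jar archive"
    return "ok"


if __name__ == "__main__":
    # Assumption: the evaluation script reads the jar location from $METEOR.
    print(check_meteor_jar(os.environ.get("METEOR", "")))
```

If this prints "ok" and Java still reports a corrupt jarfile, the problem probably lies outside the path setting.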

@segsev

segsev commented May 15, 2020

Hi @know-one-1 Thanks for A2A.
I tried providing the path to meteor-1.5.jar but I am still getting the same error.
python eval_full_model.py --meteor --decode_dir='/home/ajay/Desktop/new/cnn-dailymail/finished_files/decoded_files/val'

Error: Invalid or corrupt jarfile /home/ajay/Desktop/meteor/meteor-1.5.jar
Traceback (most recent call last):
  File "eval_full_model.py", line 53, in <module>
    main(args)
  File "eval_full_model.py", line 31, in main
    output = eval_meteor(dec_pattern, dec_dir, ref_pattern, ref_dir)
  File "/home/ajay/Desktop/new/fast_abs_rl/evaluate.py", line 70, in eval_meteor
    output = sp.check_output(cmd.split(' '), universal_newlines=True)
  File "/home/ajay/anaconda3/envs/torch1/lib/python3.5/subprocess.py", line 316, in check_output
    **kwargs).stdout
  File "/home/ajay/anaconda3/envs/torch1/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['java', '-Xmx2G', '-jar', '/home/ajay/Desktop/meteor/meteor-1.5.jar', '/tmp/tmpxrfhv9_9/dec.txt', '/tmp/tmpxrfhv9_9/ref.txt', '-l', 'en', '-norm']' returned non-zero exit status 1.

Even running the METEOR jar without any arguments gives "Invalid or corrupt jarfile", when it should print the help message. I guess the problem is with the jar file itself.
Do you have a link to the correct jar file?
Have you tried evaluating the decoded summaries?

@ghost
Author

ghost commented May 15, 2020

@segsev http://www.cs.cmu.edu/~alavie/METEOR/ is the one I used, and I think you are using the same link. Yes, I ran my model and evaluated it. I used Python 3.7, but I don't think that matters. A similar issue (non-zero exit status because the command was not being executed) came up while I was evaluating with pyrouge, and it turned out that the pyrouge setup was not correct.

@segsev

segsev commented May 15, 2020

@know-one-1 yeah, in my case the problem was with the JDK version, I guess. I was able to run it on my Mac while the jar was breaking on my Linux system. I updated the JDK and it worked. However, pyrouge is also throwing an error. How was your pyrouge set up?

I am using this pyrouge repo, https://github.com/andersjo/pyrouge, and after running for about 5 minutes it throws this error:
No such file or directory: '/Desktop/pyrouge/tools/ROUGE-1.5.5/ROUGE-1.5.5.pl/ROUGE-1.5.5.pl'

My ROUGE setup in bashrc is:

export ROUGE=/Desktop/pyrouge/tools/ROUGE-1.5.5/ROUGE-1.5.5.pl

Did you use the same pyrouge repo?

@ghost
Author

ghost commented May 15, 2020

No, I tried that one but kept getting the error. You can follow https://stackoverflow.com/questions/45894212/installing-pyrouge-gets-error-in-ubuntu to set up pyrouge, then set the path as required. It worked for me.

@kailashkarthik9

kailashkarthik9 commented May 16, 2020

My export statement is

export ROUGE="/home/ks3740/pyrouge/tools/ROUGE-1.5.5/"

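If you look at your error log, it is searching for '/Desktop/pyrouge/tools/ROUGE-1.5.5/ROUGE-1.5.5.pl/ROUGE-1.5.5.pl'. Note the extra '/ROUGE-1.5.5.pl/' segment: the script name appears twice because your export already ends with it.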

If you fix the export path to '/Desktop/pyrouge/tools/ROUGE-1.5.5/' it should work hopefully!
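
The doubled file name is consistent with the evaluation code appending the script name to whatever ROUGE contains. The sketch below illustrates that assumption only; `rouge_script_path` is a hypothetical function, not the repository's or pyrouge's actual code.

```python
import posixpath

ROUGE_SCRIPT = "ROUGE-1.5.5.pl"


def rouge_script_path(rouge_env):
    """Join the ROUGE setting with the script name (assumed behaviour, for illustration)."""
    return posixpath.join(rouge_env, ROUGE_SCRIPT)


# Exporting the script itself repeats the file name in the result.
print(rouge_script_path("/Desktop/pyrouge/tools/ROUGE-1.5.5/ROUGE-1.5.5.pl"))

# Exporting the directory yields the intended script path.
print(rouge_script_path("/Desktop/pyrouge/tools/ROUGE-1.5.5/"))
```

Under this assumption, ROUGE should name the directory that contains the script, not the script file.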

@segsev

segsev commented May 17, 2020

Managed to fix it. Thanks, @kailashkarthik9 @know-one-1

@yueguo-50

It depends on what version of PyTorch you are using. According to the Readme, if you use 0.4.0, the scatter_add method works fine.

But unfortunately, 0.4.0 is not supported on the new GPUs and thus this change is needed.

I personally use GCP and the code was working fine on a K80 GPU. Recently, I shifted to a T4 GPU because of resource availability issues and encountered the same error as you when I shifted to a more recent version of PyTorch.

Find copy_summ.py in ./models/. Changing source to src in "source=score.contiguous().view(beam*batch, -1) * copy_prob" and "source=score * copy_prob" works for PyTorch 1.5.0.
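
If the same code has to run on both the old and the newer PyTorch, one option is a small wrapper that tries the newer keyword first. This is only a sketch: `scatter_add_compat` is a hypothetical helper, and it assumes, as reported in this thread, that the older version accepts `source` while the newer one accepts `src`.

```python
def scatter_add_compat(tensor, dim, index, values):
    """Call tensor.scatter_add with whichever keyword name it accepts.

    Hypothetical helper: tries `src` first and falls back to `source`
    (the keyword the original code used) if that raises TypeError.
    """
    try:
        return tensor.scatter_add(dim=dim, index=index, src=values)
    except TypeError:
        # An unrecognised keyword raises TypeError; retry with the older name.
        # If the error had another cause, the retry raises again and surfaces it.
        return tensor.scatter_add(dim=dim, index=index, source=values)
```

A call such as `x.scatter_add(dim=1, index=idx, source=score * copy_prob)` would then become `scatter_add_compat(x, 1, idx, score * copy_prob)`, where `x` and `idx` are placeholders for whatever the surrounding code passes.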
