
how to submit to test server #21

Closed
zhegan27 opened this issue Apr 2, 2019 · 26 comments

zhegan27 commented Apr 2, 2019

Thanks for this nice repo. I have run experiments on GQA without any problem. However, after training the model, I could not find instructions on how to create the .json file used to submit to the EvalAI test server. Maybe you mentioned it somewhere, but I did not find it. It would be great if you could let me know how such a .json file can be created in order to submit to the test server. Thank you!

dorarad (Collaborator) commented Apr 3, 2019

Thanks for the interest in the dataset! I have to update data.zip to support that; I will hopefully do it tonight or tomorrow night and will let you know!

(In more detail: I need to convert the test/challenge splits into the data1.2.zip file format. They are currently available only in the questions zip at visualreasoning.net/download.html. Both are JSON files, but the GQA website provides a dictionary of all questions, while the MAC model expects a list, so I have to convert dict -> list.)
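For anyone who wants to do that conversion themselves, here is a minimal sketch, assuming the GQA-style layout where the questions file is a dict keyed by question id; the questionId field and the file names are assumptions for illustration, not necessarily the repo's exact format:

import json

# Convert a GQA-style questions dict (qid -> question record) into a list,
# keeping the id on each record so predictions can be matched back to it.
with open("submission_all_questions.json") as f:
    questions = json.load(f)  # {qid: {...}, ...}

questionList = []
for qid, record in questions.items():
    record["questionId"] = qid  # assumed field name, for illustration
    questionList.append(record)

with open("all_submission_data.json", "w") as f:
    json.dump(questionList, f)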

dorarad (Collaborator) commented Apr 5, 2019

I made an update that I believe should solve it and uploaded the new data needed for submission. I will test it later today to make sure it's all working! :)

Please redownload it at https://nlp.stanford.edu/data/gqa/data1.2s.zip and unzip it.
To evaluate over the submission questions you'll have to add --submission --getPreds: the first flag will use all the questions needed for submission, and the second will produce a file with all predictions. Also, please ignore the accuracy you get when using --submission, since it tries to evaluate over questions whose answers are hidden (they have the filler answer "NA" instead).
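For reference, the predictions file to be uploaded is a JSON list of question-id/answer pairs. The field names below follow the public GQA/EvalAI submission format and should be treated as an assumption rather than the repo's exact output:

import json

# Hypothetical illustration of the submission file layout: one record per
# question, pairing the question id with the predicted answer string.
preds = [
    {"questionId": "20692199", "prediction": "yes"},
    {"questionId": "201640614", "prediction": "table"},  # made-up id
]
with open("submissionPredictions.json", "w") as f:
    json.dump(preds, f)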

zhegan27 (Author) commented Apr 6, 2019

Thank you. I will give it a try this weekend and will let you know if it works.

aleXiehta commented:

Is there any example command for submission? I've tried adding --submission and --getPreds but only get train and val predictions. It would be very helpful if an example were provided.

dorarad (Collaborator) commented Apr 8, 2019

You're right, either --test or --finalTest is also needed. The command should be, e.g.:
python main.py --expName "gqaExperiment" --finalTest --testedNum 10000 --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt
I haven't validated my recent commit yet, but it should work. I will definitely update once I have ensured that, and if you try running it, please let me know how it goes!

yjimmyy commented Apr 10, 2019

I tried running that command, but it seems I do not have all_submission_data.json. Where can I download this file?

dorarad (Collaborator) commented Apr 10, 2019

Sure, you can download it in the new version of the data linked in the readme: https://nlp.stanford.edu/data/gqa/data1.2.zip

zhegan27 (Author) commented:

I tried running the command just now. What I get is trainPredictions-gqaExperiment.json and valPredictions-gqaExperiment.json; no other .json files can be found. So where do I get the .json file to submit to the test server? I am still confused. I even tried submitting trainPredictions-gqaExperiment.json, which is not correct. Thanks for your help!

dorarad (Collaborator) commented Apr 10, 2019

Sorry about that, I should definitely have first checked it myself to make sure things work end-to-end, but I can't find enough time right now :/

If you ran the command with --finalTest, it should have reached this line, https://github.com/stanfordnlp/mac-network/blob/gqa/main.py#L782, where it saves the test predictions and also prints a message about doing so.

Could you please post (or email me, if you prefer) the output that you get for the run?

yjimmyy commented Apr 10, 2019

Sure you can download it in the new version of the data in the readme: https://nlp.stanford.edu/data/gqa/data1.2.zip

I just redownloaded this, but I don't see all_submission_data.json:

Archive:  data1.2.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
 98417943  2019-03-26 04:04   all_testdev_data.json
3103693442  2019-02-02 19:39   all_train_data.json
436404696  2019-02-02 19:31   all_val_data.json
  6995297  2019-03-26 04:05   balanced_testdev_data.json
202205339  2019-02-02 19:41   balanced_train_data.json
 28330806  2019-02-02 19:40   balanced_val_data.json
---------                     -------
3876047523                     6 files

dorarad (Collaborator) commented Apr 10, 2019

I updated the repo to point to the new zip with the right file: https://nlp.stanford.edu/data/gqa/data1.2s.zip

dorarad (Collaborator) commented Apr 12, 2019

Hey @zhegan27, please let me know if your problem has been resolved or if there's anything else I can help you with! If it doesn't work for you, it would be great if you could post the output of the command you tried.

dorarad pinned this issue Apr 12, 2019
zhegan27 (Author) commented:

Sorry for the late response, I got busy with other stuff these past two days. Yes, running the code is no problem: it prints "Writing predictions..." and then "Done!". My question then is: where is the file with all the predictions needed for the test server submission stored, and what is its name? I somehow cannot find it. Thanks. :)

dorarad (Collaborator) commented Apr 13, 2019

Oh, alright :) Actually, I believe it should be in the same directory as the val and train predictions; this line shows the path: https://github.com/stanfordnlp/mac-network/blob/master/config.py#L85
Maybe a better way to check whether things work for you: on this line,
https://github.com/stanfordnlp/mac-network/blob/gqa/preprocess.py#L369
where it saves the predictions, just add e.g. print(sortedPreds) or print(config.predsFile(tier + suffix)) to see both the predictions themselves and the output file they are written to.
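Concretely, that debugging suggestion amounts to something like the sketch below, placed next to the save call in preprocess.py; sortedPreds, tier, and suffix are the names mentioned above, and everything else is assumed:

# In preprocess.py, right where the predictions are saved:
print("number of predictions:", len(sortedPreds))
print("first few predictions:", sortedPreds[:3])
print("writing predictions to:", config.predsFile(tier + suffix))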

zhegan27 (Author) commented:

Thank you for your quick response! I will let you know whether it works when I find time today. :)

zhegan27 (Author) commented:

Thanks for your help! We successfully submitted to the test server last night. One thing to note: on our initial try, after submitting to the test server, I got the following error:

Traceback (most recent call last):
File "/code/scripts/workers/submission_worker.py", line 402, in run_submission
submission_metadata=submission_serializer.data,
File "/tmp/tmpx7k6a68w/compute/challenge_data/challenge_225/main.py", line 96, in evaluate
output["result"].append({tier: getScores(questions, questions, predictions, tier, kwargs['submission_metadata']['method_name'])})
File "/tmp/tmpx7k6a68w/compute/challenge_data/challenge_225/main.py", line 315, in getScores
predicted = predictions[qid]
KeyError: '20692199'

Comparing submission_all_questions.json from version 1.2 with all_submission_data.json from version 1.2s, I found that '20692199' exists only in 1.2, not in 1.2s. So we need to use submission_all_questions.json from version 1.2 for the test server submission.
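A quick way to catch this kind of mismatch before uploading is to verify that every question id the server expects has a prediction. A minimal sketch, assuming the questions file is a dict keyed by question id and the predictions file is a list of records with a questionId field (both file names and field names are assumptions):

import json

with open("submission_all_questions.json") as f:
    questions = json.load(f)    # dict: qid -> question record
with open("submissionPredictions.json") as f:  # hypothetical file name
    predictions = json.load(f)  # list of {"questionId": ..., "prediction": ...}

predicted = {p["questionId"] for p in predictions}
missing = set(questions) - predicted
print(len(missing), "questions lack a prediction; examples:", sorted(missing)[:5])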

dorarad (Collaborator) commented Apr 16, 2019

Alright, glad to hear it worked for you with 1.2! You're right about the version; I will update it accordingly to make it work!

dorarad closed this as completed Apr 16, 2019
kehanlu commented Apr 30, 2019

Hi, I have some confusion about submitting to the test server.

I unzipped data1.2s.zip and used the command below:
python main.py --expName "gqaExperiment" --finalTest --test --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt

It seems to test through 930K, 130K, and 130K questions (three datasets), which matches the sizes of the balanced train set (930K) and val set (130K). I think the third run is over the submission set, but it covers only 130K questions.

However, when I uploaded preds/gqaExperiment/testPredictions-gqaExperiment.json (only 130K answers) to the test server, I got the same problem as the comment above: KeyError: '20692199'.

Following @zhegan27's solution, I took a look at all_submission_data.json in data1.2s.zip and submission_all_questions.json in questions1.2 (downloaded from the website), and found it strange that there are 2M questions in all_submission_data.json but 4M questions in submission_all_questions.json.

Here is my procedure to train and test the model. Did I make a mistake, or are there bugs in the code?

First, I unzipped data1.2s.zip and ran the training command provided in the readme:
python main.py --expName "gqaExperiment" --train --testedNum 10000 --epochs 25 --netLength 4 @configs/gqa/gqa.txt

Then I ran the command below and got the problems described above:
python main.py --expName "gqaExperiment" --finalTest --test --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt

dorarad (Collaborator) commented May 1, 2019

Hi, sorry for the problems you're experiencing with that. I definitely think it would be very useful to update the repo to make submission smoother (although I won't have enough time to do it for the next few weeks :/).

To respond to specific things you mentioned:

  • Your first attempt to submit didn't work because it evaluated only the validation set (130K), whereas the server looks for other questions, e.g. test/challenge. That's why you got KeyError: 20692199.
  • There's no need to worry about the difference between submission_all_questions.json and all_submission_data.json. I will make them match to keep things more consistent, but it is irrelevant to the server, since the smaller one contains all the questions the server will look for.

Could you please try a new run (e.g. with 1 epoch, just to see if it works) where you do:
python main.py --expName "gqaExperiment-NewAttempt" --train --submission --testedNum 10000 --epochs 25 --netLength 4 @configs/gqa/gqa.txt
It will train from scratch but will make sure to create the correct test file to serve as the submission file. Afterwards, if you run
python main.py --expName "gqaExperiment-NewAttempt" --finalTest --test --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt
I believe it should work. Let me know if it solves it!

kehanlu commented May 2, 2019

Thanks for the reply! I solved the problem myself yesterday and have submitted to the server successfully.

My experiment, in case it gives you some ideas:

--finalTest --test --submission tests at most 130K questions of the submission set. (I created a small submission set for testing, e.g. 10 questions, and it worked as expected (tested 10 questions), but the large original set was capped at 130K in one epoch.)

After I ran the command with --testAll, it finally tested all the questions in the submission set:
python main.py --expName "gqaExperiment-NewAttempt" --finalTest --test --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt --testAll


By the way, in my final successful submission I converted submission_all_questions.json to a list. I think it would also work with all_submission_data.json, as you said. 😄

dorarad (Collaborator) commented May 4, 2019

Hi! That's great news! I'm glad it got solved! :)

yestinl commented May 9, 2019

When I run
python main.py --expName "gqaExperiment" --finalTest --test --testAll --getPreds --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt
I get the following error:

File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 113, in restore
self.op.get_shape().is_fully_defined())
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 219, in assign
validate_shape=validate_shape)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 60, in assign
use_locking=use_locking, name=name)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
op_def=op_def)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [1848] rhs shape= [1845]
[[Node: save/Assign_29 = Assign[T=DT_FLOAT, _class=["loc:@macModel/classifier/linearLayerfc_1/biases/bias"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](macModel/classifier/linearLayerfc_1/biases/bias, save/RestoreV2/_59)]]

dorarad (Collaborator) commented May 9, 2019

Sorry about that! That's a bit of a bug with the vocab. What happens in that case is: if you run with testdev without having included it from the beginning (when the list of all possible words is first loaded), you get a mismatch when switching from training to testing, since there are 3 new words in the test vocab that didn't appear during training.

The fix is not complicated, but unfortunately I don't have time to get to it for about 2 weeks. A possible workaround in the meantime, though not ideal, is to retrain, e.g.:
python main.py --expName "gqaExperiment2" --train --netLength 4 -r --submission @configs/gqa/gqa.txt
and then test:
python main.py --expName "gqaExperiment2" --finalTest --test --testAll --getPreds --netLength 4 -r --submission --getPreds @configs/gqa/gqa.txt
The original training will then also load all the words from testdev, so there won't be a mismatch (because of the --submission flag). Hope it helps in the meantime!
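If you want to confirm that a vocabulary mismatch is the cause, you can inspect the shapes stored in the checkpoint and compare them with the error message. A minimal sketch using TensorFlow 1.x's checkpoint utilities; the checkpoint path is hypothetical and depends on your expName and epoch:

import tensorflow as tf

# List each variable saved in the checkpoint together with its shape. The
# answer-vocabulary size appears as the last dimension of the classifier
# layer (e.g. [512, 1845] in the checkpoint vs. [512, 1848] in the graph).
ckpt = "./weights/gqaExperiment/weights25.ckpt"  # hypothetical path
for name, shape in tf.train.list_variables(ckpt):
    if "classifier" in name:
        print(name, shape)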

yestinl commented May 9, 2019 via email

yestinl commented May 9, 2019

Sorry, I still encounter the problem after running the command:
python main.py --expName "gqaExperiment2" --train --netLength 4 -r --submission @configs/gqa/gqa.txt

Preprocess data...
load dictionaries
Loading data...
Reading tier train
Reading tier val
Reading tier submission
took 54.67 seconds
Loading word vectors...
loaded embs from file
took 0.01 seconds
Vectorizing data...
took 12.95 seconds
answerWordsNum
1848
took 74.03 seconds
Building model...
ljlkjl False
took 5.05 seconds
2019-05-09 11:03:09.631387: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-09 11:03:09.736908: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-09 11:03:09.737303: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6325
pciBusID: 0000:01:00.0
totalMemory: 10.92GiB freeMemory: 10.63GiB
2019-05-09 11:03:09.807489: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-09 11:03:09.810482: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6325
pciBusID: 0000:02:00.0
totalMemory: 10.92GiB freeMemory: 9.78GiB
2019-05-09 11:03:09.811239: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0, 1
2019-05-09 11:03:10.268220: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-09 11:03:10.268244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0 1
2019-05-09 11:03:10.268248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N Y
2019-05-09 11:03:10.268251: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 1: Y N
2019-05-09 11:03:10.268504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10281 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-05-09 11:03:10.352123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 9456 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
Restoring epoch 25 and lr 0.0003
Restoring weights
Traceback (most recent call last):
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [512,1848] rhs shape= [512,1845]
[[Node: save/Assign_30 = Assign[T=DT_FLOAT, _class=["loc:@macModel/classifier/linearLayerfc_1/weights/weight"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](macModel/classifier/linearLayerfc_1/weights/weight, save/RestoreV2/_61)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 848, in <module>
main()
File "main.py", line 673, in main
epoch = loadWeights(sess, saver, init)
File "main.py", line 208, in loadWeights
saver.restore(sess, config.weightsFile(config.restoreEpoch))
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1752, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [512,1848] rhs shape= [512,1845]
[[Node: save/Assign_30 = Assign[T=DT_FLOAT, _class=["loc:@macModel/classifier/linearLayerfc_1/weights/weight"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](macModel/classifier/linearLayerfc_1/weights/weight, save/RestoreV2/_61)]]

Caused by op 'save/Assign_30', defined at:
File "main.py", line 848, in <module>
main()
File "main.py", line 661, in main
savers = setSavers(model)
File "main.py", line 179, in setSavers
saver = tf.train.Saver(max_to_keep = config.weightsToKeep)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1284, in __init__
self.build()
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1296, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1333, in _build
build_save=build_save, build_restore=build_restore)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 781, in _build_internal
restore_sequentially, reshape)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 422, in _AddRestoreOps
assign_ops.append(saveable.restore(saveable_tensors, shapes))
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 113, in restore
self.op.get_shape().is_fully_defined())
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 219, in assign
validate_shape=validate_shape)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 60, in assign
use_locking=use_locking, name=name)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
op_def=op_def)
File "/home/ailab/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [512,1848] rhs shape= [512,1845]
[[Node: save/Assign_30 = Assign[T=DT_FLOAT, _class=["loc:@macModel/classifier/linearLayerfc_1/weights/weight"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](macModel/classifier/linearLayerfc_1/weights/weight, save/RestoreV2/_61)]]

dorarad (Collaborator) commented May 9, 2019

Oh, sorry, the first command should be without -r, like:
python main.py --expName "gqaExperiment2" --train --netLength 4 --submission @configs/gqa/gqa.txt
(-r means resuming a previous run, and the workaround is to retrain from scratch)
