IndexError: too many indices for tensor of dimension 1 #37

nefujiangping · 2019-03-31T15:06:45Z

batcher.py#L119: there is an error when the target summary has only one sentence.

ChenRocks · 2019-04-01T22:16:25Z

Even if there's only one sentence, batch_size would be 1 and the function handles this properly. Do you have the correct data format?

nefujiangping · 2019-04-02T09:35:41Z

Thanks for your reply. I followed the instructions to preprocess my own dataset. Sorry, I may not have described the problem clearly.
I am working on headline generation, so most of the summaries (headlines, in my case) consist of a single sentence.
When training the extractor, the decoder input is [start, tar_ids[:-1]], as the code in function batchify_fn_extract_ptr shows. That is, the original target input [[1, 7, 9, 10], [0, 4, 11], ... , [2, 3, 4, 0]] becomes [[1, 7, 9], [0, 4], ... , [2, 3, 4]] after the remove_last operation. Therefore, if the original input is [[1], [0], ... , [2]] with shape (32, 1), the actual input will be [[], [], ... , []] with shape (32). The IndexError then occurs in function pad_batch_tensorize.
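To make the shape problem concrete, here is a minimal pure-Python sketch of the remove_last step described above. The helper name `remove_last` and the three-example batches are illustrative assumptions; this is not the repository's code, only a reproduction of the list manipulation.

```python
def remove_last(target_ids):
    """Drop the final index of every target sequence (hypothetical helper)."""
    return [ids[:-1] for ids in target_ids]


# Multi-sentence targets: every sequence keeps at least one index.
multi = [[1, 7, 9, 10], [0, 4, 11], [2, 3, 4, 0]]
multi_inputs = remove_last(multi)

# Single-sentence targets: every sequence becomes empty.
single = [[1], [0], [2]]
single_inputs = remove_last(single)

# The padded width is the longest remaining sequence in the batch.
multi_max_len = max(len(ids) for ids in multi_inputs)
single_max_len = max(len(ids) for ids in single_inputs)

print(multi_inputs, multi_max_len)
print(single_inputs, single_max_len)
```

When every target in the batch has length 1, the padded width is 0, so there is no second dimension left to index into when the batch is padded into a tensor.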

ChenRocks · 2019-04-02T23:11:31Z

Thanks for the explanation. I understand the issue now.

If you are working on a dataset that only has 1-sentence summaries as output, this repository might not be the best choice for you. The main improvement of our work is on long (multi-sentence) summary generation. My plan for this repo is to support only the CNN/DM dataset, so I won't change the code for customized datasets. You are free to fork the repo and write new code for your customization. Sorry for the inconvenience. (I think an ad-hoc fix is to pad an extra dummy sentence and mask it out before loss computation.)

However, this could be a bug if we get very unlucky. Say some examples in CNN/DM have 1-sentence summaries and we sample a mini-batch in which all of them are 1-sentence; that would result in this error. I will need to check on this. Unfortunately, due to my busy schedule, this is a lower priority. In the meantime I will keep this issue open for discussion.
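The ad-hoc fix mentioned above (pad an extra dummy sentence and mask it out before loss computation) could be sketched roughly as follows. This is only an illustration in pure Python: `DUMMY`, `pad_dummy`, and `masked_mean` are hypothetical names, the per-step loss values are made up, and the real training code would need the equivalent changes on tensors.

```python
DUMMY = 0  # assumption: any valid sentence index works, since it is masked out


def pad_dummy(target_ids):
    """Append one dummy index to each target and build a matching loss mask."""
    padded = [ids + [DUMMY] for ids in target_ids]
    # 1 for real target positions, 0 for the appended dummy position.
    mask = [[1] * len(ids) + [0] for ids in target_ids]
    return padded, mask


def masked_mean(losses, mask):
    """Average per-step losses over unmasked positions only."""
    total = sum(l * m
                for row_l, row_m in zip(losses, mask)
                for l, m in zip(row_l, row_m))
    count = sum(m for row in mask for m in row)
    return total / count


targets = [[1], [0], [2]]
padded, mask = pad_dummy(targets)

# After dropping the last index, the decoder inputs are no longer empty.
decoder_inputs = [ids[:-1] for ids in padded]

# Made-up per-step losses: the second column belongs to the dummy step.
example_loss = masked_mean([[2.0, 5.0], [4.0, 9.0]], [[1, 0], [1, 0]])
print(decoder_inputs, example_loss)
```

The dummy position keeps the padded width at 1 or more for single-sentence targets, while the mask keeps it from contributing to the loss.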

nefujiangping · 2019-04-03T02:28:14Z

Thanks for the explanation and advice! I'll have a try.

nickluijtgaarden · 2019-04-30T08:39:29Z

Hey @nefujiangping, I am currently running into the same issue. Have you thought of any solutions for this?

nefujiangping · 2019-04-30T14:55:07Z

Hello @nickluijtgaarden, sorry, I haven't fixed this yet (and I am not using this code at the moment). You can try the solution mentioned above.

jawdat23 · 2019-05-06T13:53:40Z

Hey guys,
I believe this will help you build a headline generator. I tried it myself and it works just fine.
#20 (comment)
