Key error while running the code #5

Thiyagu1985 · 2020-07-22T12:29:27Z

I am working on CPU+GPU. I got an error while executing this command:

python train.py new -af par/arch.basic.json -tf par/train.basic.json -nb 4 -si 1000
-vqn 1000 $run_dir/model%.ckpt $run_dir/librispeech.dev-clean.dat $run_dir/data_slices.dat

"KeyError: '/root/data_mtbox_d/VQ_VAE/VQ_VAE_Speech/ae_wavenet/model%.ckpt'". Can you help me to resolve it

WouterBesse · 2023-05-24T09:45:43Z

Hey Thiyagu! I ran into the same problem.

The problem seems to happen in the setup_hparams() function, specifically on line 27:
hparam_sets = [HPARAMS_REGISTRY[x.strip()] for x in hparam_set_names if x] + [kwargs]
I tried to fix this line, but unfortunately I don't really understand what it's trying to achieve.
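
For what it's worth, that line appears to look up every comma-separated name in HPARAMS_REGISTRY and then append the keyword overrides, so any name that isn't registered (here, apparently the checkpoint path) would raise a KeyError. A minimal sketch of that behaviour, assuming a made-up registry (the entries and the helper name lookup_hparam_sets are hypothetical, not the repo's own):

```python
# Stand-in registry for illustration; the real HPARAMS_REGISTRY has other entries.
HPARAMS_REGISTRY = {"small": {"lr": 3e-4}, "big": {"lr": 1e-4}}


def lookup_hparam_sets(hparam_set_names, kwargs):
    """Hypothetical helper that mirrors the list comprehension quoted above."""
    if not isinstance(hparam_set_names, tuple):
        hparam_set_names = hparam_set_names.split(",")
    # Every non-empty name must be a key of the registry, otherwise KeyError.
    return [HPARAMS_REGISTRY[x.strip()] for x in hparam_set_names if x] + [kwargs]


# Registered names resolve to their dicts, followed by the overrides.
print(lookup_hparam_sets("small, big", {"bs": 4}))

# A checkpoint path is not a registered name, so the lookup fails.
try:
    lookup_hparam_sets("/path/to/model%.ckpt", {})
except KeyError as err:
    print("KeyError:", err)
```

If that reading is right, the error means a positional argument ended up where the parser expects hparam set names, which would fit with the command-line changes described further down.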

I did, however, manage to bypass the problem by simply commenting out some lines in this function. So far this doesn't seem to cause many problems; I had no problems starting training.

The function now looks as follows:

def setup_hparams(hparam_set_names, kwargs):
    H = Hyperparams()
    if not isinstance(hparam_set_names, tuple):
        hparam_set_names = hparam_set_names.split(",")
    # hparam_sets = [HPARAMS_REGISTRY[x.strip()] for x in hparam_set_names if x] + [kwargs]
    for k, v in DEFAULTS.items():
        H.update(v)
    # for hps in hparam_sets:
    #     for k in hps:
    #         if k not in H:
    #             raise ValueError(f"{k} not in default args")
    #     H.update(**hps)
    H.update(**kwargs)
    return H
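
A possibly gentler alternative to commenting the lines out is to keep the registry lookup and the default-args check, but skip names that aren't registered and warn about them. This is only a sketch under assumptions: Hyperparams, DEFAULTS and HPARAMS_REGISTRY below are simplified stand-ins with made-up contents, not the repo's real definitions.

```python
import warnings


class Hyperparams(dict):
    """Stand-in: a dict with attribute access (assumed, not the repo's class)."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


# Hypothetical contents, for illustration only.
DEFAULTS = {"train": {"lr": 3e-4, "bs": 1}}
HPARAMS_REGISTRY = {"small": {"lr": 1e-4}}


def setup_hparams(hparam_set_names, kwargs):
    H = Hyperparams()
    if not isinstance(hparam_set_names, tuple):
        hparam_set_names = hparam_set_names.split(",")
    names = [x.strip() for x in hparam_set_names if x.strip()]
    # Warn about unregistered names instead of failing with a KeyError.
    unknown = [n for n in names if n not in HPARAMS_REGISTRY]
    if unknown:
        warnings.warn(f"ignoring unregistered hparam sets: {unknown}")
    hparam_sets = [HPARAMS_REGISTRY[n] for n in names if n in HPARAMS_REGISTRY] + [kwargs]
    for v in DEFAULTS.values():
        H.update(v)
    # Keep the original check: every override must already exist in the defaults.
    for hps in hparam_sets:
        for k in hps:
            if k not in H:
                raise ValueError(f"{k} not in default args")
        H.update(**hps)
    return H


H = setup_hparams("small,/path/to/model%.ckpt", {"bs": 4})
print(H)
```

Unlike the hotfix above, this still rejects keyword overrides that aren't in the defaults, so a typo in an argument name fails early rather than being silently added.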

I'd also like to add that I needed to change some of the command-line arguments to make this work.
The command now looks like this:
python3 train.py --new -af par/arch.basic.json -tf par/train.basic.json -nb 4 -si 1000 -vqn 1000 --ckpt_template $run_dir/model%.ckpt $run_dir/model%.ckpt $run_dir/model%.ckpt -hw 'GPU'

Note: I changed the hardware to GPU because I don't use a TPU, I changed new to --new, and I added the --ckpt_template flag before the %.ckpt location.

I'd recommend checking whether the checkpoints still get stored properly during training, though, since this hotfix messes with some of the setup around the checkpoint location. I didn't get that far myself because I ran into CUDA problems, which is my own little problem hahaha.

Let me know if you run into any problems when using this hotfix!
