Final results aren't saved in --directory when --sample_hyper is "yes" with some --random_seed values and datasets #9

Open
rtrad89 opened this issue Aug 1, 2019 · 0 comments

Comments

rtrad89 commented Aug 1, 2019

With some datasets, the final results of the training phase aren't stored under --directory when I use a --random_seed of 13712 while also sampling the concentration parameters (--sample_hyper yes). Only the file state.log is produced, but none of the other output files.

To reproduce the problem:

  1. Download this training corpus from the PAN @ CLEF 2017 competition
  2. Run the regular hdp (not the fast variant) on the LDA-C corpus of the fifth training problem set, for example:
hdp.exe --data ..\pan17_train\problem005\ldac_corpus.dat --algorithm train --directory ..\output --sample_hyper yes --save_lag -1 --random_seed 13712

(I used gensim to generate the LDA-C corpora)
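For readers unfamiliar with the LDA-C layout mentioned above, here is a minimal sketch of what each line of such a corpus file looks like: one document per line, starting with the number of distinct terms, followed by `term_id:count` pairs. This is not the gensim code used for the report; the function name `to_ldac_lines` and the toy documents are hypothetical and only illustrate the format.

```python
from collections import Counter


def to_ldac_lines(tokenized_docs):
    """Convert tokenized documents to LDA-C formatted lines.

    Hypothetical helper for illustration only; it is not part of gensim or hdp.
    Returns the lines and the token -> integer id vocabulary it built.
    """
    vocab = {}
    lines = []
    for tokens in tokenized_docs:
        counts = Counter()
        for tok in tokens:
            # Assign ids in order of first appearance across the corpus.
            term_id = vocab.setdefault(tok, len(vocab))
            counts[term_id] += 1
        # "<number of distinct terms> <id>:<count> <id>:<count> ..."
        fields = [str(len(counts))]
        fields += [f"{tid}:{n}" for tid, n in sorted(counts.items())]
        lines.append(" ".join(fields))
    return lines, vocab


# Toy documents, made up for this example.
docs = [["topic", "model", "topic"], ["model", "seed"]]
ldac_lines, vocab = to_ldac_lines(docs)
for line in ldac_lines:
    print(line)
```

Writing these lines to a file would give something of the same shape as ldac_corpus.dat, though the real corpus was produced with gensim and may differ in term ordering.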

The program runs to completion and raises no error. However, the output directory contains only the state.log file and the interim outputs, whereas mode.bin, mode-topics.dat and mode-word-assignments.dat are also expected. As far as I can tell, the combination of --sample_hyper yes and --random_seed 13712 causes this fault on selected datasets.
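Because the run ends without an error, the missing files are easy to overlook. A small check like the following could be run after training to confirm whether the final outputs were written. It is only a sketch: the helper name `missing_final_outputs` is hypothetical, the file names are the three listed above, and the `../output` path simply mirrors the --directory value from the command.

```python
from pathlib import Path

# Final output files named in this report as expected after training.
EXPECTED_FINAL_FILES = ("mode.bin", "mode-topics.dat", "mode-word-assignments.dat")


def missing_final_outputs(output_dir, expected=EXPECTED_FINAL_FILES):
    """Return the expected file names that are absent from output_dir.

    Hypothetical helper for illustration; it is not part of hdp.
    """
    out = Path(output_dir)
    return [name for name in expected if not (out / name).is_file()]


if __name__ == "__main__":
    # Assumed location, matching --directory ..\output in the command above.
    missing = missing_final_outputs("../output")
    if missing:
        print("Missing final outputs:", ", ".join(missing))
    else:
        print("All final outputs are present.")
```

Looping such a check over several --random_seed values would help narrow down which seeds trigger the problem on a given dataset.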
