can't reproduce the preprocessed data #10

quynhneo · 2020-11-19T00:31:45Z

Hi there,
I ran https://github.com/adjidieng/DETM/blob/master/scripts/data_undebates.py on the Kaggle data for the UN General Debates (as linked in your paper: https://www.kaggle.com/unitednations/un-general-debates), but I am unable to reproduce the preprocessed data you linked here: https://bitbucket.org/franrruiz/data_undebates_largev/src/master/ (the variables in my .mat files are different from yours).
Any idea why? There aren't many settings besides min_df and max_df. I used the defaults; perhaps you used something else?
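
For anyone comparing settings, here is a minimal sketch of what document-frequency thresholds like min_df and max_df usually do to a vocabulary. It assumes the common convention (min_df as an absolute document count, max_df as a fraction of documents); `filter_vocab`, the toy documents, and the threshold values are hypothetical and are not taken from data_undebates.py.

```python
from collections import Counter

def filter_vocab(docs, min_df, max_df):
    """Keep words that appear in at least `min_df` documents (a count)
    and in at most `max_df` of all documents (a fraction)."""
    df = Counter()
    for doc in docs:
        df.update(set(doc.split()))  # count each word once per document
    n = len(docs)
    return sorted(w for w, c in df.items() if c >= min_df and c / n <= max_df)

# Toy corpus, for illustration only.
docs = [
    "peace and security",
    "peace and development",
    "security council and peace",
]
print(filter_vocab(docs, min_df=2, max_df=0.7))
```

Even small changes to either threshold change the vocabulary size, which would change the shape of every saved matrix, so this is worth ruling out before looking at anything else.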

mona-timmermann · 2020-11-24T12:44:19Z

Might be too obvious, but could it just be because the random permutation is run without a seed? Apart from that, I found a lot of things I had to change in the code to get it to run and to implement the model as described in the paper. I was never able to reproduce the results using the original code.
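
To illustrate the seed point, here is a minimal sketch of a reproducible shuffle-and-split. It uses Python's standard `random` module rather than whatever the preprocessing script actually calls; `split_indices`, the 0.8 fraction, and the seed value are hypothetical. With NumPy the analogous step would be seeding the generator before permuting.

```python
import random

def split_indices(num_docs, train_frac, seed):
    """Shuffle document indices with a fixed seed, then split them."""
    rng = random.Random(seed)       # private generator, seeded explicitly
    perm = list(range(num_docs))
    rng.shuffle(perm)               # same seed -> same permutation
    n_train = int(train_frac * num_docs)
    return perm[:n_train], perm[n_train:]

first = split_indices(10, 0.8, seed=0)
second = split_indices(10, 0.8, seed=0)
print(first == second)  # True
```

Without a fixed seed, each run assigns different documents to each split, so the saved files would differ between runs even if every other setting matched.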

quynhneo · 2020-11-24T16:49:31Z

Hm, possibly. Same here on having to change a lot. Perhaps we should submit some PRs.

Emekaborisama · 2021-01-05T08:42:50Z

Let's work on converting it to a Python library, @quynhneo @mona-timmermann.

What do you think?

I have also noticed a new error that occurs on a large dataset.

quynhneo · 2021-01-05T09:14:30Z

Not a bad idea. Ideally, @adjidieng would support the idea.

Emekaborisama · 2021-01-05T22:25:33Z

I can talk to @adjidieng tomorrow, and I will keep you posted on her response.

What do you think, @mona-timmermann?

Emekaborisama · 2021-01-06T10:56:31Z

Adji said we can proceed, but we will upload the package as a branch on this repo.
@quynhneo @mona-timmermann let's get this done.

yangyijane · 2021-02-03T21:21:56Z

@Emekaborisama Hi, any updates on the Python script to reproduce this study? Thank you very much.

yangyijane · 2021-02-04T01:59:12Z

that's cool. thx.

On Wed, Feb 3, 2021 at 4:47 PM Quynh M. Nguyen wrote:
> I have made it to work, see my fork https://github.com/quynhneo/DETM

yangyijane · 2021-02-04T02:07:41Z

Hi Mr Nguyen, I have a follow-up question about the script that runs DETM after the data has been preprocessed. I checked your script and saw that you split the data into training and test sets. Why did you do that? I thought this was supposed to be unsupervised learning. Thank you very much.

quynhneo · 2021-02-10T05:22:25Z

According to the paper, they calculate perplexity on test documents, so the split is for evaluation rather than for supervised training.
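
For reference, a minimal sketch of generic held-out perplexity: the exponential of the negative average log-probability per token. This is the textbook definition, not necessarily the exact evaluation procedure in the paper; `perplexity` and the toy probabilities are hypothetical.

```python
import math

def perplexity(docs, word_probs):
    """docs: list of token lists (held-out documents).
    word_probs: one dict per document, mapping each word to the
    probability the fitted model assigns to it in that document."""
    log_lik = 0.0
    n_tokens = 0
    for tokens, probs in zip(docs, word_probs):
        for w in tokens:
            log_lik += math.log(probs[w])
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)

# Toy check: a model that spreads probability evenly over 4 words.
vocab = ["peace", "security", "council", "trade"]
uniform = {w: 1.0 / len(vocab) for w in vocab}
test_docs = [["peace", "security"], ["trade", "council", "peace"]]
print(perplexity(test_docs, [uniform, uniform]))
```

A model that is uniform over V words has perplexity V, which makes a handy sanity check. Lower values on unseen documents mean the model predicts them better, and that is why a test split is useful even though no labels are involved.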
