can't reproduce the preprocessed data #10
Comments
Might be too obvious, but could it just be because of the random permutation with no seed? Apart from that, I've observed a lot of things I had to change in the code to get it to run and to implement the model as described in the paper. I was never able to reproduce the results using the original code. |
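For anyone checking the seed hypothesis, here is a minimal sketch of a seeded shuffle-and-split. The function name `split_indices`, the 85% training fraction, and the seed value are illustrative assumptions, not taken from the repository's script.

```python
import numpy as np

def split_indices(num_docs, train_frac=0.85, seed=None):
    """Shuffle document indices and split them into train and test parts.

    With seed=None the permutation differs on every run; with a fixed seed
    the same split is produced every time.
    """
    rng = np.random.RandomState(seed)       # private generator, global state untouched
    order = rng.permutation(num_docs)       # shuffled indices 0..num_docs-1
    n_train = int(round(train_frac * num_docs))
    return order[:n_train], order[n_train:]

# Two runs with the same (hypothetical) seed give identical splits.
train_a, test_a = split_indices(1000, seed=0)
train_b, test_b = split_indices(1000, seed=0)
```

If the published .mat files were generated with an unseeded permutation, fixing a seed now would make future runs repeatable, but it would not recover the original document order.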
hm...possibly. Same here on having to change a lot. Perhaps we should submit some PRs. |
Let's work on converting it to a Python library, @quynhneo @mona-timmermann. What do you think? Although I did notice a new error that occurs on a large dataset. |
Not a bad idea ... Ideally @adjidieng supports the idea. |
I can talk to @adjidieng tomorrow and I will keep you posted on her response. What do you think, @mona-timmermann? |
Adji said we can proceed but we will upload the package as a branch on this repo. |
@Emekaborisama Hi, any updates on the Python script to reproduce this study? Thank you very much. |
that's cool. thx.
…On Wed, Feb 3, 2021 at 4:47 PM Quynh M. Nguyen ***@***.***> wrote:
I have made it to work, see my fork https://github.com/quynhneo/DETM
|
Hi Mr Nguyen,
I have a follow-up question regarding the script that runs DETM after you
preprocessed all your data. I checked your script, and you split the data
into training and testing sets.
Why did you do that? I thought this is supposed to be unsupervised learning.
Thank you very much.
|
According to the paper, they calculate perplexity on test documents, so a held-out set is needed even though the model is unsupervised. |
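As a rough illustration of why held-out documents are needed, the sketch below computes a generic per-token perplexity from bag-of-words counts and predicted word distributions. It is an assumption-laden simplification: the function name `perplexity` and its inputs are hypothetical, and the evaluation procedure in the paper may differ in detail.

```python
import numpy as np

def perplexity(counts, word_probs, eps=1e-12):
    """Per-token perplexity of held-out documents.

    counts:     (D, V) array of word counts for D test documents
    word_probs: (D, V) array, each row a predicted distribution over the vocabulary
    """
    counts = np.asarray(counts, dtype=float)
    word_probs = np.asarray(word_probs, dtype=float)
    # Total log-likelihood of the observed tokens under the predictions.
    log_lik = np.sum(counts * np.log(word_probs + eps))
    # Exponentiated negative average log-likelihood per token.
    return float(np.exp(-log_lik / counts.sum()))

# Toy check: uniform predictions over 4 words give a perplexity of about 4.
toy_counts = np.array([[1, 2, 0, 1], [0, 3, 1, 0]])
toy_probs = np.full((2, 4), 0.25)
toy_ppl = perplexity(toy_counts, toy_probs)
```

Lower values mean the model assigns higher probability to documents it did not see during training.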
Hi there,
I ran https://github.com/adjidieng/DETM/blob/master/scripts/data_undebates.py on the Kaggle data for the UN debates (as linked in your paper: https://www.kaggle.com/unitednations/un-general-debates), but I am unable to reproduce the preprocessed data you linked here: https://bitbucket.org/franrruiz/data_undebates_largev/src/master/ (the variables in the .mat files are different from yours).
Any idea? There are not many settings besides min_df and max_df. I used the defaults; perhaps you used something else?
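To show how much these two thresholds can change the vocabulary, here is a small pure-Python sketch of document-frequency filtering. The helper `filter_vocab`, the toy documents, and the threshold values are hypothetical and are not the script's code; the convention assumed here is that min_df is an absolute document count and max_df is a fraction of documents.

```python
from collections import Counter

def filter_vocab(docs, min_df=1, max_df=1.0):
    """Keep words that appear in at least min_df documents (absolute count)
    and in at most a max_df fraction of all documents."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc.split()))   # count each word once per document
    max_count = max_df * len(docs)
    return sorted(w for w, c in doc_freq.items() if min_df <= c <= max_count)

docs = [
    "peace and security",
    "peace and development",
    "climate and development",
]
vocab_default = filter_vocab(docs)                      # keeps every word
vocab_strict = filter_vocab(docs, min_df=2, max_df=0.7) # drops rare and ubiquitous words
```

A different vocabulary changes the word indices and the sizes of the saved arrays, so a mismatch in either threshold alone could explain .mat files that do not match.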