
I have some questions. Can GFlowNet be used for topic modeling? What are the challenges involved? Discussion is welcome. #1

Open
sample-guo opened this issue Jul 7, 2023 · 5 comments


@sample-guo

No description provided.

@malkin1729
Collaborator

Good question.

Yes, a GFlowNet -- and GFlowNet-EM in particular -- could be used for topic modeling, since a topic model is a particular kind of latent variable model, albeit one with a continuous latent variable. Consider the example of LDA, using the notation $\theta$ for a topic vector and $x$ for a document (with subscript $i$ indexing documents).

  • The generative model ($p$) is specified by $p(\theta)$, which is fixed to $\text{Dirichlet}(\alpha)$, and by $p(x\mid\theta)$.
    • In LDA, the latter is fully specified by the topic-word matrix $A$, and $\log p(x_i\mid\theta_i)$ can be computed using matrix products and logsoftmax operations (if $x_i$ is represented as a vector of word counts).
    • This could take some other parametric form in topic models that do not assume exchangeability (i.e., words in $x_i$ not conditionally independent given $\theta_i$).
  • The posterior model $q(\theta_i\mid x_i)$ is typically estimated with a Dirichlet in LDA algorithms. However, we can instead train a continuous GFlowNet of some form to sample $\theta_i$, a point in the probability simplex, conditioned on $x_i$.
    • This GFlowNet would be trained in the E-step, and the reward for sampling $\theta_i$ given document $x_i$ is $p(\theta_i)p(x_i\mid\theta_i)$ (see the sketch after this list).
    • How should a GFlowNet generate a vector in the probability simplex over topics? One way is using a stick-breaking process with a mixture of Betas policy (like in the code below), but there are probably other ways.
    • In the M-step, one trains the model $p(x_i\mid\theta_i)$ by sampling from the posterior model, $\theta_i\sim q(\theta_i\mid x_i)$, and taking gradient steps on $\log p(x_i\mid\theta_i) + \log p(A)$, where $p(A)$ is the $\text{Dirichlet}(\beta)$ prior on the topic-word matrix.
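
For concreteness, here is a minimal sketch in PyTorch of the reward this E-step GFlowNet would be trained on (all names and shapes are hypothetical, and this is not the code from the gist linked below), assuming $x_i$ is represented as a vector of word counts and the topic-word matrix $A$ is stored as unnormalized logits:

```python
import torch
import torch.nn.functional as F
from torch.distributions import Dirichlet

K, V = 10, 5000                              # hypothetical: K topics, V vocabulary words
A = torch.randn(K, V, requires_grad=True)    # unnormalized topic-word logits
alpha = torch.full((K,), 0.5)                # Dirichlet(alpha) prior on theta

def log_reward(theta, x_counts):
    """log R(theta) = log p(theta) + log p(x | theta) for one document.

    theta:    (K,) point strictly inside the probability simplex
    x_counts: (V,) bag-of-words counts of the document
    """
    log_prior = Dirichlet(alpha).log_prob(theta)
    # p(word = w | theta) = sum_k theta_k * softmax(A)[k, w];
    # take the log with a logsumexp over topics for numerical stability.
    log_word_probs = torch.logsumexp(
        theta.clamp_min(1e-30).log().unsqueeze(1) + F.log_softmax(A, dim=1),
        dim=0,
    )                                        # shape (V,)
    log_lik = (x_counts * log_word_probs).sum()
    return log_prior + log_lik
```

The same quantity drives the M-step: with $\theta_i\sim q(\theta_i\mid x_i)$ drawn from the trained GFlowNet, one takes gradient steps on this log-reward plus the $\text{Dirichlet}(\beta)$ log-prior on the rows of $\text{softmax}(A)$, with respect to $A$.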

I have tried this for very small topic models on synthetic data -- you can find the code for a proof of concept at https://gist.github.com/malkin1729/88227a1e451596e1ea1fc7d4e0a7ae09 -- but never pursued it further. I am curious what you can do with it, and particularly whether topic models with more interesting structure in the latent space can benefit from the GFlowNet approach.

@sample-guo
Author

As far as I know, some tree-structured neural topic models and nonparametric forest-structured topic models do use a stick-breaking process with a mixture of Betas. I am considering whether a GFlowNet could serve as an alternative for modeling in these cases.

Additionally, many popular neural topic models based on VAEs assume that the latent variables follow a Gaussian or logistic-normal distribution. I am wondering whether continuous GFlowNet theory could be employed as a replacement in these cases.

@malkin1729
Collaborator

malkin1729 commented Jul 13, 2023

A GFlowNet could indeed be used to sample the posterior over latent topic vectors in nonparametric topic models. However, I have not seen stick-breaking with a mixture of Betas in that literature. Do you have a reference?

In my code, I simply used a mixture of Betas to parametrize the sampling of a point in the probability simplex, sequentially “breaking off” probability mass to assign to each topic in turn, as in the sketch below.
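
To make that concrete, here is a rough sketch of the construction in PyTorch (this is not the code from the gist; the `BetaMixturePolicy` network, its interface, and the fixed topic order are hypothetical simplifications):

```python
import torch
import torch.nn.functional as F
from torch.distributions import Beta, Categorical

class BetaMixturePolicy(torch.nn.Module):
    """Hypothetical policy network: maps (document embedding, state) to an
    M-component mixture of Betas over the next stick-breaking fraction."""
    def __init__(self, d_x, K, M=4):
        super().__init__()
        self.M = M
        self.net = torch.nn.Sequential(
            torch.nn.Linear(d_x + K + 1, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, 3 * M),
        )

    def forward(self, x_emb, state):
        out = self.net(torch.cat([x_emb, state]))
        mix_logits = out[: self.M]                             # mixture weights (logits)
        ab = F.softplus(out[self.M :]).view(self.M, 2) + 1e-3  # positive Beta parameters
        return mix_logits, ab

def sample_simplex_point(policy, x_emb, K):
    """Break off a fraction of the remaining mass for topics 1..K-1 in a
    fixed order; the leftover mass goes to the last topic. Returns the
    sampled theta and the log-probability log P_F of the trajectory
    (as needed, e.g., by a trajectory balance loss)."""
    theta = torch.zeros(K)
    remaining = torch.tensor(1.0)
    log_pf = torch.tensor(0.0)
    for k in range(K - 1):
        state = torch.cat([theta, remaining.view(1)])
        mix_logits, ab = policy(x_emb, state)
        comp = Categorical(logits=mix_logits).sample()    # pick a mixture component
        frac = Beta(ab[comp, 0], ab[comp, 1]).sample()    # fraction of mass to break off
        # log-density of the full mixture at the sampled fraction
        log_pf = log_pf + torch.logsumexp(
            F.log_softmax(mix_logits, dim=0)
            + Beta(ab[:, 0], ab[:, 1]).log_prob(frac),
            dim=0,
        )
        theta[k] = remaining * frac
        remaining = remaining * (1 - frac)
    theta[K - 1] = remaining                              # leftover mass to the last topic
    return theta, log_pf

# Example usage with hypothetical sizes:
policy = BetaMixturePolicy(d_x=32, K=10)
theta, log_pf = sample_simplex_point(policy, torch.randn(32), K=10)
```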

@sample-guo
Author

  • Masaru Isonuma, Junichiro Mori, Danushka Bollegala, and Ichiro Sakata. 2020. Tree-structured neural topic model. In ACL, pages 800–806.

  • Ziye Chen, Cheng Ding, Zusheng Zhang, Yanghui Rao, and Haoran Xie. 2021. Tree-structured topic modeling with nonparametric neural variational inference. In ACL/IJCNLP, pages 2343–2353.

  • Z. Zhang, X. Zhang, and Y. Rao. 2022. Nonparametric forest-structured neural topic modeling. In COLING, pages 2585–2597.

I apologize if these are not exactly what you meant, but I believe these papers are related. Do you have any thoughts or suggestions on combining them with GFlowNets?

@malkin1729
Collaborator

There is nothing to apologize for. Thank you for the references. I have worked a little on graph-structured topic models and had seen the first paper before, and I quickly looked at the other two now.

They are relevant to structured topic models, of course, but I do not see in them the use of learned Beta mixtures as posterior estimators, which is what I asked about in my comment above.

A GFlowNet could be used as an amortized variational posterior in any of these models.
