Compare against Janner's approach #51

Open
hongkai-dai opened this issue Apr 28, 2023 · 0 comments

Comments

@hongkai-dai
Collaborator

I think when we use the "direct collocation" formulation

min c(x₁, ..., xₙ, u₁, ..., uₙ) + β log p(xᵢ, uᵢ, xᵢ₊₁)

the objective resembles that of Janner's approach, but in practice our approach is simpler, for the following reason:

In Janner's approach, they need to train a classifier to guide the diffusion process. Note that they cannot use the cost function exp(c(x₁ᵗ, ..., xₙᵗ, u₁ᵗ, ..., uₙᵗ)) directly as this guided classifier, but have to train a separate classifier. (Here the superscript t in x₁ᵗ denotes the t-th denoising step, not the t-th step of the planning horizon.) The reason is that during the denoising stage, the trajectory x₁ᵗ, ..., xₙᵗ, u₁ᵗ, ..., uₙᵗ contains a lot of noise, and what the classifier needs to predict is the probability that the denoised trajectory x₁⁰, ..., xₙ⁰, u₁⁰, ..., uₙ⁰ is optimal, not the optimality of the noisy trajectory itself.
So to train this guided classifier, they start with a noise-free trajectory x₁⁰, ..., xₙ⁰, u₁⁰, ..., uₙ⁰, inject noise into it for multiple steps, pair the resulting noisy trajectory x₁ᵗ, ..., xₙᵗ, u₁ᵗ, ..., uₙᵗ with the target probability exp(c(x₁⁰, ..., xₙ⁰, u₁⁰, ..., uₙ⁰)), and train a classifier model through regression. Training the classifier is extra effort, whereas we just use the cost function c(x₁, ..., xₙ, u₁, ..., uₙ) directly.
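
To make the extra work concrete, here is a rough sketch of how such regression pairs could be assembled. It is not taken from Janner's code: the function names, the quadratic cost, and the linear noise schedule are all made up for illustration, and I assume the standard DDPM closed-form forward process (noisy = sqrt(ᾱₜ)·clean + sqrt(1 − ᾱₜ)·ε) as a stand-in for injecting noise over t steps. The target uses exp(c(·)) on the clean trajectory, as written above.

```python
import numpy as np


def make_guide_training_pairs(clean_traj, cost_fn, alpha_bar, rng, pairs_per_traj=8):
    """Build (noisy trajectory, denoising step, target) regression samples.

    clean_traj: array of shape (n, d), the noise-free trajectory (states and inputs stacked).
    cost_fn: hypothetical callable returning the scalar cost c(...) of a trajectory.
    alpha_bar: array of shape (T,), cumulative products of an assumed DDPM noise schedule.
    """
    # The target is computed from the CLEAN trajectory, never from the noisy one.
    target = float(np.exp(cost_fn(clean_traj)))
    samples = []
    for _ in range(pairs_per_traj):
        # Pick a random denoising step and noise the clean trajectory up to that step.
        t = int(rng.integers(0, len(alpha_bar)))
        eps = rng.standard_normal(clean_traj.shape)
        noisy = np.sqrt(alpha_bar[t]) * clean_traj + np.sqrt(1.0 - alpha_bar[t]) * eps
        samples.append((noisy, t, target))
    return samples


def quadratic_cost(traj):
    # Made-up cost, used only so the sketch runs end to end.
    return float(np.sum(traj ** 2))


rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 2e-2, 100)      # assumed linear schedule
alpha_bar = np.cumprod(1.0 - betas)
clean = 0.1 * rng.standard_normal((5, 3))  # 5 knot points, 3 stacked state/input dims
pairs = make_guide_training_pairs(clean, quadratic_cost, alpha_bar, rng)
```

A regression model would then be fit on these `(noisy, t) -> target` samples. In the direct collocation formulation none of this is needed, since `cost_fn` is evaluated on the decision variables themselves.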

This also raises the question of whether we should consider a classifier-free planning approach, such as https://arxiv.org/pdf/2211.15657.pdf.
