Add SphereBijector and AngleBijector #58
We already have a similar non-bijector to the ones you are proposing in Bijectors.jl for transforming an n-dimensional point to an (n-1)-dimensional simplex. But the way we define it is as a bijector + a projection on the simplex (or a truncation). If the ones you propose are just pure projection operators, perhaps another package might be more appropriate. This might be relevant https://github.com/JuliaNLSolvers/ManifoldProjections.jl. I also wonder why we need to add these functions to Bijectors.jl to use them with the bijectors defined in Bijectors.jl. Can't we just mix them using anonymous functions? If it's |
Fair point. My ultimate goal is to use directional/angular distributions in Turing, at least the two in Distributions, but ideally arbitrary ones. If there's a better way to do that than to add them to Bijectors, I'm happy to do it. |
Isn't this just
Hmm, so I kind of like this idea, but again it's unclear to me how to do it. Seems like just dropping all zeros can do a lot of unintended stuff, i.e. in the

Now, to what you're talking about @sethaxen, this is something I've been thinking about whether or not we should commit to doing, but haven't gotten around to writing the issue (so thank you!). For the

The big issue is what to do when you compose these many-to-one and one-to-one mappings. I think we'd have to implement the inverse of the many-to-one as returning a

I'm definitely not opposed to this, but it seems like there's quite a bit of work that needs to be done to do this properly 😕
Well, it all depends on what we want the transformation for. My assumption due to my need is that we are seeking a transformation If If the transformation is surjective-only, then we need some other way of finding This is exactly the motivation for the spherical case above. Normalization is surjective. We know we can draw uniform samples on a sphere by sampling from a multivariate normal distribution and then normalizing. We can actually draw from any multivariate normal, but we choose as a matter of convenience the standard case. Consequently, the normal kernel on Euclidean space provides the necessary term to build Because However, if the goal is to generate truly invertible transformations for some other purpose, then none of the above applies. However, the above describes the requirements needed for the kinds of distributions used by PPLs like Stan and (I think) would be useful in Turing. Addendum: Some other things we can do with these transformations: We can sample within a bounded volume in Apologies if I've made any notational/terminology errors. |
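The sampling argument above (draw from a standard multivariate normal, then normalize) can be sketched in a few lines of Julia. This is only an illustration using the standard library; `rand_unit_vector` is a hypothetical helper name, not something defined in Bijectors.jl or Distributions.jl.

```julia
using LinearAlgebra, Random

# Hypothetical helper: draw one point uniformly on the unit sphere S^(n-1)
# by normalizing a standard normal draw in R^n.
function rand_unit_vector(rng::AbstractRNG, n::Integer)
    y = randn(rng, n)      # y ~ Normal(0, I) in R^n
    return y / norm(y)     # many-to-one map onto the sphere
end

rng = MersenneTwister(1)
xs = [rand_unit_vector(rng, 3) for _ in 1:10_000]

# For a uniform distribution on the sphere the sample mean should be near zero.
mean_x = sum(xs) / length(xs)
```

Every draw has unit norm by construction, and the near-zero sample mean is a quick sanity check of the rotational symmetry that makes the normalized draw uniform.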
I don't think it's terribly relevant, but on the off-chance it's useful, here's some early work on adding maps to Manifolds.jl, ultimately with the goal of pushing forward/pulling back objects such as differential forms and distributions: JuliaManifolds/Manifolds.jl#28 |
TL;DR: All in all, I agree with you: I don't think we should enforce transformations to be invertible. Now the question is whether or not we should just add a
I can see this use-case, but it doesn't seem to me like this is a push-forward of a distribution by f? I.e. it doesn't put the same amount of mass on the preimage of f as it does on its image. Because of this, it seems to me like this wouldn't be a case where the change of variables should be used?
👍 for bringing differential geometry into the discussion :)
Sorry, what are you referring to when you say "log measure"? In particular, I don't quite understand what "we can augment the log density of π with the log measure to obtain ϕ" means. I understand the overall idea though, and this shouldn't be a problem in the current Bijectors.jl. This would just be a matter of overloading the
So this is kind of what I'm referring to though, no? To find the probability of a particular point on the sphere, you can compute the probability of the corresponding line in R^d wrt. the Gaussian distribution. This is sort of what I'm referring to above, though in this case we're "summing" over an infinite set of points, i.e. integrating.
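The "integrating along a line" picture can be made explicit with polar coordinates. The following is a sketch of the standard calculation, where the notation ($r$, $u$, $\sigma$) is introduced here for illustration and is not taken from the thread. Writing $x = r u$ with $r > 0$ and $u \in S^{d-1}$, the volume element is $dx = r^{d-1}\,dr\,d\sigma(u)$, so for a standard normal $x$ and a measurable set $A \subseteq S^{d-1}$:

```latex
\begin{aligned}
P\!\left(\tfrac{x}{\lVert x\rVert} \in A\right)
  &= \int_A \int_0^\infty (2\pi)^{-d/2}\, e^{-r^2/2}\, r^{d-1}\, dr\, d\sigma(u) \\
  &= (2\pi)^{-d/2}\, 2^{d/2-1}\, \Gamma\!\left(\tfrac{d}{2}\right) \sigma(A)
   = \frac{\Gamma(d/2)}{2\pi^{d/2}}\, \sigma(A)
   = \frac{\sigma(A)}{\sigma(S^{d-1})}.
\end{aligned}
```

The inner integral over the ray does not depend on the direction $u$, which is why the induced distribution on the sphere is uniform.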
That is great! Earlier I was considering whether or not we should try to accommodate potential cases where you are mapping between manifolds rather than just real spaces. Didn't end up doing it, as I was afraid it would be a bit too general for what the purpose of Bijectors.jl is. Also, we don't need to worry about pushing-forward forms and tangents :) I absolutely love the idea though; definitely going to follow your work on this! :)
Response to comments in #168 .
This is something along the lines of what I had in mind to ease the transition.
Wait, what does TransformVariables.jl do? Are you referring to the
This can be done, yes. There's a couple of downsides though:
An added benefit to it though, is that we could potentially remove the dimensionality from the definitions of the bijectors, e.g.:

# Wrapper marking `val` as a batch of inputs.
struct Batched{T}
    val::T
end

value(x) = x
value(x::Batched) = x.val

# TODO: implement iterator interface

# General impl (relies on the iterator interface above).
# DOWNSIDE: Means that definitions such as `transform(b::CustomBijector, x)` will be ambiguous.
# (Also, `transform` doesn't exist as of right now, but imagine this to be `(b::Bijector)(x)` for the moment.)
function transform(b::Bijector, xs::Batched)
    return Batched(map(b, xs))
end

# Specify further on a per-bijector basis, e.g. `Exp` without dimensionality:
struct Exp <: Bijector end

transform(::Exp, x::AbstractArray) = exp.(x)
transform(::Exp, x::Batched{<:AbstractArray{<:Real}}) = Batched(exp.(value(x)))

# Scalar case: log|d/dx exp(x)| = x.
logabsdetjac(::Exp, x::Real) = x
logabsdetjac(b::Exp, xs::AbstractArray) = sum(logabsdetjac.(Ref(b), xs))
function logabsdetjac(b::Exp, xs::Batched{<:AbstractArray{<:Real, N}}) where {N}
    # Assume last dimension is batch-dimension, so reduce over all the others
    return sum(logabsdetjac.(Ref(b), value(xs)); dims=1:(N - 1))
end
Are you referring to the current approach or potential "new" approach? At the moment, we make no restrictions on the input and output types beyond the dimensionality; this is why we went with only putting the dimensionality in the bijector, not specific types. This is also why, if we decide to add actual representations of input- and output-spaces beyond just dimensionality, then we need to be really careful as it can easily just make life way more difficult by introducing a bunch of type-instability and whatnot.
100%. But just for the record, I don't think anyone in this discussion thinks that:) This was done because it works really well in most standard cases. The idea was always to move away from it at some point; it was just a matter of time.
We might be able to avoid this with a redo of how we handle batching, but the reason why we did this is because we ran into a bunch of bugs which didn't throw errors, e.g. if you mistakenly construct a bijector with the wrong dimensionality and call
Yes but I don't like the whole |
Yeah, that's why I asked. IMO TransformVariables.jl is more of a "neat thing" rather than something that's integral to the design of the package, e.g. doesn't have anything to do with compositions of transformations and whatnot.
What's the current status of this? I've recently been reading about manifold-valued continuous normalizing flows and it appears that they work well without the need to have non-bijective bijectors, see for example here: https://arxiv.org/abs/2006.10254 . They use dynamic trivializations instead. The problem with it is that now instead of a single bijector we have a whole family of them that we need to pick from -- but maybe that's fine? |
I've started working on making the scope of Bijectors.jl a bit wider in #183 by:
So there's progress but it's not quite there yet. This issue is one of the motivations for that PR.
First off, personally I would be very happy to accommodate whatever is required to implement bijectors for manifolds :) I welcome excuses to look at differential geometry, so if there is cool stuff we could do with Bijectors.jl + Manifolds.jl, I'm very keen! I just briefly skimmed parts of the paper and looked at the referenced paper that introduces dynamic trivializations, and AFAIK it seems like this could be defined by

Though I'll admit I'm still a bit confused about how one can use retractions in place of the exponential map, I guess one "issue" here is that retractions aren't generally bijective, e.g. the retraction from R^n to S^(n - 1) as mentioned in this issue, hence we'd need support for non-bijective transformations to support the use of retractions in place of exp.

EDIT: Nvm, just read the definition of retraction. Knew about the standard topological definition, but couldn't quite see how to define this for
Oh, nice! I'm going to try soon something along the lines of MODE. Manifolds.jl already has parametrizations so I think the main missing piece is calculation of
Yes, the idea is to just compute in charts and then switch charts when appropriate. I'm not sure what's the right design though -- each part separately is simple enough but they can be put together in many different ways. I'll very likely have some questions when I start implementing this 😉 . I'm wondering a bit how useful these chart bijectors would be in general. We can have them as
Yes, retractions in (a large part of) Riemannian optimization are different from topological retractions, and it often causes confusion. Retractions can be bijective between R^n (or an open subset of R^n) and an open neighborhood of a point. |
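As a concrete instance of a retraction that is bijective onto an open neighborhood, here is a minimal sketch for the unit sphere using the projection retraction. The function names `retract` and `inverse_retract` are hypothetical and chosen for this example; they are not claimed to match the Manifolds.jl API.

```julia
using LinearAlgebra

# Projection retraction at a base point p on the unit sphere:
# maps a tangent vector v (with dot(p, v) == 0) to a point on the sphere.
retract(p, v) = (p + v) / norm(p + v)

# Its inverse, defined on the open hemisphere {y : dot(p, y) > 0}.
inverse_retract(p, y) = y / dot(p, y) - p

p = [0.0, 0.0, 1.0]       # base point (north pole)
v = [0.5, -2.0, 0.0]      # tangent vector at p
y = retract(p, v)         # lands in the open upper hemisphere
v_back = inverse_retract(p, y)
```

The whole tangent space at `p` is mapped one-to-one onto the open hemisphere around `p`, so a chart-style bijector built this way is invertible on that neighborhood but says nothing about points on or beyond the equator.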
Directional statistics deals with unit vectors and periodic variables. Distributions.jl includes two such distributions: VonMises (angular) and VonMisesFisher (spherical). I'm planning to implement more (see https://discourse.julialang.org/t/rfc-taking-directional-orientational-statistics-seriously/31951). For these, we need a SphereBijector and an AngleBijector.

SphereBijector transforms an n-dimensional vector into an n-dimensional unit vector under the Euclidean norm. It's not really a bijector, since it only has a right inverse (the inclusion function), so its Jacobian has a determinant of 0. However, we can still give a logabsdetjac term that produces a uniform measure (using a standard multivariate normal kernel). See the Stan manual for details. I also have implementations of this transformation at https://github.com/salilab/HMCUtilities.jl/blob/c4602ac/src/constraint.jl#L469-L527 and tpapp/TransformVariables.jl#67.

AngleBijector simply converts Cartesian coordinates of a 1-sphere (circle) to an angle using atan. When composed with SphereBijector and shift, it provides the necessary transformation for VonMises. Also composing it with scale lets one transform any periodic quantity.

I'm happy to implement these. But will you take non-bijective functions in Bijectors.jl?
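To make the proposal concrete, here is a rough sketch of the two maps as plain Julia functions. All names (`to_sphere`, `from_sphere`, `sphere_logterm`, `to_angle`, `from_angle`) are hypothetical and only illustrate the idea; this is not the Bijectors.jl interface, and the log term simply follows the "standard multivariate normal kernel" description above.

```julia
using LinearAlgebra

# Hypothetical stand-ins for the proposed SphereBijector.
# Many-to-one map from R^n \ {0} onto the unit sphere.
to_sphere(y::AbstractVector{<:Real}) = y / norm(y)

# Right inverse: the inclusion of the sphere into R^n.
from_sphere(x::AbstractVector{<:Real}) = x

# Term used in place of a log-abs-det-Jacobian: the log kernel of a
# standard multivariate normal on the unconstrained vector y.
sphere_logterm(y::AbstractVector{<:Real}) = -dot(y, y) / 2

# Hypothetical stand-ins for the proposed AngleBijector.
# Point on the unit circle -> angle in (-π, π], and back.
to_angle(x::AbstractVector{<:Real}) = atan(x[2], x[1])
from_angle(θ::Real) = [cos(θ), sin(θ)]

y = [3.0, 4.0]
x = to_sphere(y)     # unit vector in the direction of y
θ = to_angle(x)      # the corresponding angle
```

Composing `to_angle` with `to_sphere` takes an unconstrained 2-vector to an angle, which is the chain that the VonMises case would use; shifting and scaling that angle are ordinary bijections and can be applied afterwards.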