Partial Inverses #39

Closed
wants to merge 3 commits into from

Conversation

ParadaCarleton

Adds l_inv and r_inv for left/right inverses.

@@ -2,9 +2,6 @@ name = "InverseFunctions"
uuid = "3587e190-3f89-42d0-90ee-14403ec27112"
version = "0.1.12"

[deps]
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
Member

This should not be removed. Can you revert this?

export r_inv, l_inv, retraction, coretraction

"""
r_inv(function)
Member

I think we should use a more descriptive name and also be consistent with inverse. I suggest

Suggested change
r_inv(function)
right_inverse(function)


"""
r_inv(function)
retraction(function)
Member

Do we really need an alias? I prefer a simple API and would suggest defining only right_inverse.

Suggested change
retraction(function)

Collaborator

I agree, right_inverse and left_inverse would be best, I think.

r_inv(args...; kwargs...) = inverse(args...; kwargs...)

"""
l_inv(function)
Member

Same here, I think we should use only the name

Suggested change
l_inv(function)
left_inverse(function)


"""
l_inv(function)
coretraction(function)
Member

Suggested change
coretraction(function)

let trigfuns = ("sin", "cos", "tan", "sec", "csc", "cot")
# regular, degrees, hyperbolic
funcs = (trigfuns..., (trigfuns .* "d")..., (trigfuns .* "h")...)
invfuncs = "a" .* funcs
Member

Suggested change
invfuncs = "a" .* funcs
invfunc = "a" * func

# regular, degrees, hyperbolic
funcs = (trigfuns..., (trigfuns .* "d")..., (trigfuns .* "h")...)
invfuncs = "a" .* funcs
funcs, invfuncs = Symbol.(funcs), Symbol.(invfuncs)
Member

Suggested change
funcs, invfuncs = Symbol.(funcs), Symbol.(invfuncs)

funcs = (trigfuns..., (trigfuns .* "d")..., (trigfuns .* "h")...)
invfuncs = "a" .* funcs
funcs, invfuncs = Symbol.(funcs), Symbol.(invfuncs)
for (func, invfunc) in zip(funcs, invfuncs)
Member

Suggested change
for (func, invfunc) in zip(funcs, invfuncs)

invfuncs = "a" .* funcs
funcs, invfuncs = Symbol.(funcs), Symbol.(invfuncs)
for (func, invfunc) in zip(funcs, invfuncs)
@eval l_inv(::typeof($func)) = $invfunc
Member

Suggested change
@eval l_inv(::typeof($func)) = $invfunc
@eval left_inverse(::typeof($func)) = $invfunc

funcs, invfuncs = Symbol.(funcs), Symbol.(invfuncs)
for (func, invfunc) in zip(funcs, invfuncs)
@eval l_inv(::typeof($func)) = $invfunc
@eval r_inv(::typeof($invfunc)) = $func
Member

Suggested change
@eval r_inv(::typeof($invfunc)) = $func
@eval right_inverse(::typeof($invfunc)) = $func
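
Taken together, the review suggestions above might amount to something like the following. This is only a sketch: `left_inverse` and `right_inverse` are the names proposed by the reviewers, not functions that exist in InverseFunctions, and the assignment of each trig function to a side simply mirrors the diff (which side is mathematically appropriate is part of what the conversation goes on to discuss).

```julia
# Hypothetical generic functions, named as suggested in the review.
function left_inverse end
function right_inverse end

let trigfuns = ("sin", "cos", "tan", "sec", "csc", "cot")
    # regular, degrees, hyperbolic
    funcs = (trigfuns..., (trigfuns .* "d")..., (trigfuns .* "h")...)
    for func in funcs
        f = Symbol(func)          # e.g. :sind
        finv = Symbol("a" * func) # e.g. :asind
        @eval left_inverse(::typeof($f)) = $finv
        @eval right_inverse(::typeof($finv)) = $f
    end
end
```

With these definitions, `left_inverse(sin)` returns `asin` and `right_inverse(acosd)` returns `cosd`.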

@ParadaCarleton
Author

ParadaCarleton commented Nov 6, 2023

@oschulz thinking about this more, I don't know if this is really a useful interface.

If we consistently apply the rule that functions must be invertible on the whole real number line to have an inverse, we eliminate almost every common invertible function, including most inverses defined in this package. For example, log and exp are a left/right inverse pair, rather than true inverses on the reals (notice how exp(log(-1)) fails). square/sqrt fails for the same reason, multiplication/division fails for x=0, and trig functions fail outside the half-circle where they're defined. The only common functions I can think of that are genuinely invertible are addition (which is so trivial I doubt it's used often) and odd powers.
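
For concreteness, the failures mentioned above can be reproduced in plain Julia without any packages. The helper `throws_domain_error` is made up for this illustration.

```julia
# Illustrative helper: true when calling f(x) throws a DomainError.
function throws_domain_error(f, x)
    try
        f(x)
        return false
    catch err
        return err isa DomainError
    end
end

@assert log(exp(-1.0)) ≈ -1.0           # log undoes exp for any real input
@assert exp(log(2.0)) ≈ 2.0             # exp undoes log for positive input
@assert throws_domain_error(log, -1.0)  # so exp(log(-1.0)) cannot be evaluated
@assert sqrt((-2.0)^2) == 2.0           # squaring loses the sign
@assert !(asin(sin(3.0)) ≈ 3.0)         # 3.0 lies outside [-π/2, π/2]
```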

TL;DR: If we want to be strict with math terminology, we'd have to break any code relying on this package and make inverse almost useless.

So, here are three proposals for what a better interface could look like:

  1. Usually, when people say they have an invertible transformation, what they really have is a one-to-one function (injection); a function that can be reverted without losing any information. A user wants to take x and apply f, but only if they know they can apply inverse(f) to get back to x. This is formally called a left inverse or a retraction. So, to minimize code breakage, we could take inverse and redefine it to mean the left inverse. Then, we could define a separate, explicit right_inverse function.
    Downsides: Inconsistent interface between inverse and left/right inverses. This breaks the inverse(inverse(f)) == f invariant (e.g. inverse(log) no longer exists).

  2. Work with branches/preimages/relations instead of inverse functions. Return a function that explicitly forces users to handle all this complexity.
    Downsides: hard to implement, would require reimplementing many functions, computationally expensive, and a pain in practice.

  3. Define inverse functions in left, right pairs by convention (e.g. inverse(sin) = asin). Then, inverse(left) = right and inverse(right) = left. A separate function returns whether x is a left or right inverse, and another function returns the domain. In the future, we can look at extending this to let users choose the domain they want.
    Pros: Does what I mean. If someone writes inverse(sin), they probably wanted asin, not an error. All code keeps working (only option that breaks 0 dependencies).
    Cons: Inelegant, math nerds complain

@ParadaCarleton
Author

Personally I lean towards 3, and providing some way to let users opt-in to stricter behavior.

@oschulz
Collaborator

oschulz commented Nov 7, 2023

I have to admit I'm not that happy with 1, 2 or 3, since they break the current approach, which is user-friendly and has worked very well in practice so far.

I would argue that the situation is not quite as problematic as it might seem - the question is just on which domains inverses should be applicable, I think.

The domain of log is the codomain of exp and the domain of exp is the codomain of log, so I would say that it's valid to call them two-sided inverses of each other. Inverses don't have to be defined over the whole possible value space allowed by the Julia types they operate on - it should be sufficient if they are defined over the whole codomain of the functions that they invert, "almost everywhere". People will certainly consider log and exp to be bijective - which implies that the inverse is unique and so can be understood to be a two-sided inverse (at least for all practical purposes).

The situation is different with x^2 and sqrt, since x^2 is not bijective. This is why we currently define the inverse of sqrt to be InverseFunctions.square, which is restricted to positive/odd values.

@ParadaCarleton
Author

ParadaCarleton commented Nov 8, 2023

I would argue that the situation is not quite as problematic as it might seem

From what I can tell, this is what I'm suggesting as option 1. This is the least-breaking option that leaves inverse meaning roughly what it means in math. If inverse(f) doesn't need to be defined on the whole domain, then left_inverse == inverse.

The reason we haven't hit any problems in practice is because implicitly, we've mostly decided that what inverse really means is left_inverse. We're working

the question is just on which domains inverses should be applicable, I think.

If that's the case, we can say the same for asin/acos/etc., which have an inverse on the appropriately-defined domains. sin(asin(x)) == x as long as you limit yourself to abs(x) <= 1.

People will certainly consider log and exp to be bijective

They aren't, since exp isn't a surjection; it maps the reals to the positive reals. This is why exp only has a left inverse, rather than a true inverse.

which implies that the inverse is unique and so can be understood to be a two-sided inverse

That's also why the inverse isn't unique--the behavior of inverse(exp) is unconstrained for x <= 0. (And if we say this is fine because the domain of x is limited, that's also true of asin!)

I think what I've learned from all of this is that mathematicians have a bad habit of saying "invertible" when what they really mean is injective or one-to-one (i.e. left invertible) 😅

@devmotion
Member

Your comment is based on the assumption that domain and co-domain should be R. But as @oschulz said, these are not necessarily the most natural (co-)domains and the notion of inverse functions is not restricted to these choices. Just as one example, Wikipedia also mentions that invertibility depends on the choice of the domain and co-domain (the article even discusses the squaring function and the fact that it is invertible if one restricts the domain to non-negative real numbers: https://en.wikipedia.org/wiki/Inverse_function#Squaring_and_square_root_functions). For some choices, only left or right inverses exist but for other choices both of them do exist.

@ParadaCarleton
Author

ParadaCarleton commented Nov 8, 2023

Your comment is based on the assumption that domain and co-domain should be R.

My comment is based on the assumption that the domain/codomain rule should behave consistently. Proposal 1 does not assume the domain and codomain should be R; it allows inverse(exp) = log.

I think it makes sense to say "inverse(f) is allowed to be an inverse only on a subset of R," or we can say "inverse(f) has to be an inverse on all of R." Either of these is reasonable behavior, but I think we need to make a decision on which we want to go with and then use it consistently.

@oschulz
Collaborator

oschulz commented Nov 8, 2023

I agree with @devmotion. I also strongly disagree with the statement

@ParadaCarleton: They aren't, since exp isn't a surjection; it maps the reals to the positive reals. This is why exp only has a left inverse, rather than a true inverse.

and so would the developers of Bijectors and similar packages, I think.

Functions map between sets - exp maps from the set of reals to the set of positive reals. The "whole real line" isn't a special thing, mathematically. So exp is absolutely a bijection, the codomain of exp is just not the whole set of real numbers.
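
Written out, the claim in the paragraph above is the following (standard notation, added here only for illustration):

```latex
\[
\exp : \mathbb{R} \to \mathbb{R}_{>0},
\qquad
\log : \mathbb{R}_{>0} \to \mathbb{R},
\]
\[
(\log \circ \exp)(x) = x \quad \text{for all } x \in \mathbb{R},
\qquad
(\exp \circ \log)(y) = y \quad \text{for all } y \in \mathbb{R}_{>0}.
\]
```

Both compositions are identities, each on its own set, which is what makes log a two-sided inverse of exp under this choice of domain and codomain.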

If that's the case, we can say the same for asin/acos/etc., which have an inverse on the appropriately-defined domains. sin(asin(x)) == x as long as you limit yourself to abs(x) <= 1.

The situation is very different for asin and friends. Yes, if their domains are restricted appropriately they can be bijective. But sin in Julia is not restricted, so it's not a bijection and so asin is not a two-sided inverse of sin. If the user knows that the values will be restricted, they can use setinverse(sin, asin) to get an invertible sin, but then they also must ensure it's only used on the restricted domain.
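
A minimal sketch of the setinverse approach mentioned above, assuming InverseFunctions is installed and that the caller keeps inputs within [-π/2, π/2]. The variable names are illustrative.

```julia
using InverseFunctions

# The caller, not the package, is responsible for staying on [-π/2, π/2].
restricted_sin = setinverse(sin, asin)
restricted_asin = inverse(restricted_sin)

x = 0.3
@assert restricted_asin(restricted_sin(x)) ≈ x
@assert restricted_sin(restricted_asin(0.5)) ≈ 0.5

# Outside the restricted interval the round trip silently stops holding.
@assert !(restricted_asin(restricted_sin(3.0)) ≈ 3.0)
```

The last assertion shows why the restriction matters: nothing throws for 3.0, the result is simply not the original input.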

Let's take a more complex example:

using BAT, ValueShapes, Distributions, InverseFunctions

mu = BAT.StandardMvUniform(4)

nu = HierarchicalDistribution(
    NamedTupleDist(
        a = Dirichlet([1,2,3]),
        b = Exponential()
    )
) do v
    NamedTupleDist(
        c = Uniform(v.a[1], v.a[1] + 5)
    )
end

x = rand(mu)
f = BAT.DistributionTransform(nu, mu)
y = f(x)

Let's say x is [0.6421642646973026, 0.7743092214585315, 0.7447750771806945, 0.37492143790819443], then y is (a = [0.084772150447044, 0.20895171857196881, 0.7062761309809872], b = 1.365610072329668, c = 1.9593793399880162).

Here, nothing lives on the whole real line. The domain of f is the 4-dimensional unit hypercube, and its codomain (the domain of inverse(f)) is quite complex: the domain of y.a is the 3-dimensional simplex, the domain of y.b is the positive reals, and the domain of y.c depends on y.a[1].

We currently have no way of expressing such domains explicitly. Even if we did have a (probably very daunting) system for that, I don't see how it could ever be automatically propagated through a composition g ∘ f ∘ h - so if we want function composition to be invertible (and we certainly do), then we can't require that domains have to be specified explicitly as an argument to inverse. We can't even require people to specify the input type (inverse(f, T)), since even type inference through complex chains of functions would be tricky in practice.

What we can and do require is that domains are specified implicitly: inverse(f) must do the right thing on any given input or error. This ensures that users will not get incorrect results.

In the example above, we have inverse(f)(f(x)) ≈ x and f(inverse(f)(y)) == y. So inverse(f) is both a left-inverse and a right-inverse of f. Both f ∘ inverse(f) and inverse(f) ∘ f are identities, they just operate on very different spaces. And I think that's perfectly OK - otherwise we could never have two-sided inverses of functions that map between different spaces.

The above is not an esoteric example, we do use such constructions in practice regularly. And in such applications we do use inverse(f) both to the left and the right of f.

Now let's take a very simple example - tuple and only:

only is certainly a left-inverse of tuple: (only ∘ tuple)(x) == x for all x. But while (tuple ∘ only)((5,)) == (5,) we have (tuple ∘ only)(7) != 7. Now, 7 is not in the codomain of tuple - but only doesn't know this, it will happily accept 7 as an input. So if we'd define only to be a right-inverse of tuple, then the correct domain would not be defined implicitly. So I would not define inverse(tuple) = only.
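
The tuple/only behavior described above can be checked directly in plain Julia (`only` requires Julia 1.4 or later):

```julia
# only undoes tuple for any single value ...
@assert (only ∘ tuple)(5) == 5
@assert (only ∘ tuple)("abc") == "abc"

# ... and tuple undoes only on one-element tuples ...
@assert (tuple ∘ only)((5,)) == (5,)

# ... but only also accepts a bare number, which tuple never produces,
# so the round trip returns (7,) instead of 7 without throwing.
@assert (tuple ∘ only)(7) == (7,)
@assert (tuple ∘ only)(7) != 7
```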

@ParadaCarleton
Author

ParadaCarleton commented Nov 8, 2023

What we can and do require is that domains are specified implicitly: inverse(f) must do the right thing on any given input or error. This ensures that users will not get incorrect results.

Thanks, that's roughly what I needed to know when writing the tests. (I'm trying to do this with property-based testing, which requires defining these kinds of properties rigorously.)

One potential problem with this: what if inverse(f)(f(x)) only satisfies this for some methods of f? As an example, what if someone defines an always-positive type, and wants to take the inverse of Base.Fix2(^, 2)?

@oschulz
Collaborator

oschulz commented Nov 9, 2023

One potential problem with this: what if inverse(f)(f(x)) only satisfies this for some methods of f?

Yes, that's a tricky area. I think so far we've required that inverse(f) yields the correct result for all possible inputs that f and inverse(f) accept without throwing an exception. Maybe we should state this more explicitly in the docs. This also ties in well with the requirement that inverse(inverse(f)) must be (numerically approximately) equivalent to f.
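
One way to phrase that requirement as a checkable property is sketched below. `roundtrip_ok` is a hypothetical helper written for this illustration, not part of InverseFunctions; it treats a thrown exception as acceptable and only flags a round trip that completes with the wrong value.

```julia
# Hypothetical property check: a candidate inverse must either throw
# or recover the original input (up to floating-point tolerance).
function roundtrip_ok(f, finv, x)
    y = try
        f(x)
    catch
        return true   # x is outside the domain of f: nothing to check
    end
    x2 = try
        finv(y)
    catch
        return true   # finv rejects y explicitly: no wrong result produced
    end
    return isapprox(x2, x)
end

@assert roundtrip_ok(exp, log, -1.0)   # log(exp(-1.0)) ≈ -1.0
@assert roundtrip_ok(log, exp, -1.0)   # log(-1.0) throws, which is allowed
@assert roundtrip_ok(sin, asin, 0.3)
@assert !roundtrip_ok(sin, asin, 3.0)  # completes, but returns π - 3
```

Under this property, log/exp pass in both directions while sin/asin fail, which matches the distinction drawn earlier in the thread.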

@aplavin
Contributor

aplavin commented Nov 25, 2023

I also totally support keeping the strict inverse() meaning.
For example, sometimes I use the generic preimage function:

julia> using IntervalSets, InverseFunctions, Accessors, DataPipes
julia> function preimage(f, image::Interval)
           f⁻¹ = inverse(f)
           eps = f⁻¹.(endpoints(image))
           preimg = @set endpoints(image) = eps
           if eps[2] >= eps[1]
               preimg
           else
               @p begin
                   preimg
                   @modify(reverse, endpoints(__))
                   @modify(reverse, closedendpoints(__))
               end
           end
       end

(not packaged anywhere, just copypaste where needed). It automatically works for all invertible functions, and throws for non-invertible ones. Getting an exception makes it clear that a specific preimage(::typeof(f)) has to be defined.

If we took the stance that "inverse(f) is allowed to be an inverse only on a subset of f domain/range", preimage would silently return an incorrect/incomplete result.

Of course, it doesn't mean that left/right inverse aren't useful, they just need to be explicitly requested.

@oschulz
Collaborator

oschulz commented Nov 25, 2023

@aplavin for your preimage example, don't we also need to require f to be continuous (since then it's guaranteed to be monotonic due to the intermediate value theorem)?

Maybe we should think about a function traits package that provides iscontinuous, ismonotonic and so on? Such traits could be automatically propagated through composed functions, broadcasts, etc., like we do in InverseFunctions now.
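
As a rough idea of what such trait propagation could look like, here is a sketch with a single hypothetical trait. The name `ismonotonic`, its Bool return value, and the conservative `false` default are all assumptions made for illustration; no such package API is implied. It relies only on `ComposedFunction` from Base (Julia 1.6 or later).

```julia
# Hypothetical trait: default to false unless a function is known to qualify.
ismonotonic(::Any) = false
ismonotonic(::typeof(identity)) = true
ismonotonic(::typeof(exp)) = true
ismonotonic(::typeof(log)) = true

# A composition of monotonic functions is monotonic, so the trait
# can be propagated through ∘ automatically.
ismonotonic(f::ComposedFunction) = ismonotonic(f.outer) && ismonotonic(f.inner)

@assert ismonotonic(exp ∘ log)
@assert !ismonotonic(sin ∘ exp)
```

Whether traits should return Bool or singleton types is left open here, as it is in the discussion.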

@aplavin
Contributor

aplavin commented Nov 26, 2023

Indeed, you are totally right, it requires continuity. It's just that I always used it with functions continuous on the interval of interest, and there's no way to check for that anyway now.

Maybe we should think about a function traits package that provides iscontinuous, ismonotonic and so on?

I'm all for such a package in principle, not sure how eager others will be with defining these traits. Also islinear btw.

@oschulz
Collaborator

oschulz commented Nov 26, 2023

I'm all for such a package in principle, not sure how eager others will be with defining these traits. Also islinear btw.

Yes, islinear is currently only in FlexiMaps, I think? Well, we could just start a package in JuliaMath and ask for community feedback regarding design (boolean or singleton return values, etc.). @devmotion would you also be interested in a "FunctionTraits" (or similar name) package?

@oschulz
Collaborator

oschulz commented Jan 2, 2024

I'm closing this for now, until the discussion in #10 has converged.

@oschulz oschulz closed this Jan 2, 2024
4 participants