Projecting Cotangents #286
Comments
In general I am for something like this, for all the reasons noted elsewhere. I do have some concerns though. Suppose I have a primal function ...

This is a good point -- perhaps we need a more general piece of functionality that rule-implementers can also hook into that translates between any valid representation of a differential? So if they receive a ...

I've sketched an implementation of this proposal here: #306

closed by #385
ChainRules embraces multiple possible representations of a cotangent. For example, `AbstractZero`, `Composite`, and `AbstractArray` are all valid representations for the cotangent of a `Diagonal`. However, this flexibility places an increased burden on rule implementers: there is in principle no real upper bound on the number of types that one might have to accept as the cotangent w.r.t. the output of some function `foo` that returns a `Diagonal`.

I wonder whether some design-orthogonalisation might help to deal with this -- could we separate out the standardisation of the representation of cotangents from the rule implementation?
Consider a function `canonicalise(primal, cotangent)` whose job it is to map a cotangent onto a well-defined, predictable, finite set of types for any given `primal` type. For example, you might implement this as follows for `Diagonal`:

Note that I've chosen to make the canonical cotangent type for a `Diagonal` a `Composite` rather than an `AbstractMatrix` for the usual performance-related reasons discussed extensively in JuliaDiff/ChainRules.jl#232. An `AbstractMatrix` doesn't count as a "canonical" type in my definition here since it's abstract, so doesn't meet the finiteness criterion.

If you did this, then we would certainly be able to avoid defining `+` on so many things -- you just assume that things have been `canonicalise`d before hitting `+`. Similarly, `Zygote`'s automatic constructor pullback generation ought to have an easier time because, if you ensure that everything is appropriately canonicalised, constructors should always receive appropriate `NamedTuple`s.

@sethaxen pointed out that this is something that we might want to concern ourselves with in #160, but I wanted to raise it separately, as I think it's an interesting thing to consider on its own.
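To make the idea of `canonicalise(primal, cotangent)` for a `Diagonal` concrete, here is a minimal, self-contained sketch. It is not the snippet from this issue and not ChainRules' actual API: `ZeroCotangent`, `CompositeCotangent`, and `add_cotangents` are hypothetical stand-ins defined locally (playing the roles of `AbstractZero`, `Composite`, and `+`), so the example runs with only the `LinearAlgebra` standard library.

```julia
using LinearAlgebra

# Hypothetical stand-ins for ChainRules' `AbstractZero` and `Composite`,
# defined here so the sketch does not depend on any package version.
struct ZeroCotangent end

struct CompositeCotangent{P, T<:NamedTuple}
    fields::T
end
CompositeCotangent{P}(fields::T) where {P, T<:NamedTuple} =
    CompositeCotangent{P, T}(fields)

# Cotangents that are already in canonical form pass through unchanged.
canonicalise(::Diagonal, c::ZeroCotangent) = c
canonicalise(::Diagonal, c::CompositeCotangent{<:Diagonal}) = c

# Any matrix-valued cotangent is reduced to its diagonal, because the
# `diag` field is the only part of a `Diagonal` primal that can vary.
canonicalise(x::Diagonal, c::AbstractMatrix) =
    CompositeCotangent{typeof(x)}((diag = diag(c),))

# Once inputs are canonical, accumulation needs only a small, closed
# set of methods rather than one per possible representation.
add_cotangents(::ZeroCotangent, ::ZeroCotangent) = ZeroCotangent()
add_cotangents(::ZeroCotangent, b::CompositeCotangent) = b
add_cotangents(a::CompositeCotangent, ::ZeroCotangent) = a
add_cotangents(a::CompositeCotangent{P}, b::CompositeCotangent{P}) where {P} =
    CompositeCotangent{P}(map(+, a.fields, b.fields))

D = Diagonal([1.0, 2.0])
a = canonicalise(D, [1.0 2.0; 3.0 4.0])        # dense matrix cotangent
b = canonicalise(D, Diagonal([10.0, 20.0]))    # structured matrix cotangent
total = add_cotangents(a, b)
total.fields.diag                               # [11.0, 24.0]
```

Under these assumptions, a rule for a function returning a `Diagonal` would only ever need to handle two cotangent types (the zero and the composite), whichever representation was produced upstream.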
edit: not sure whether we want to choose a different name from `canonicalise`, given that we already have a function with that name. Possibly we could extend it to handle the more general class of things described here.