I've become familiar with PyTorch recently because of writing https://github.com/hsf-training/deep-learning-intro-for-hep/
I've also been looking at the Vector documentation because I think it needs an overhaul to be more physicist-friendly. Along the way, I noticed that there's no PyTorch backend yet, but it would be really useful to have one. Vector's approach to NumPy arrays is to expect them to be structured arrays, but feature vectors in an ML model are always unstructured. (Note: there's a conversion function, `np.lib.recfunctions.structured_to_unstructured`.)
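For reference, that conversion looks like this (a small illustrative example; the field names and values are arbitrary):

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# a structured array, as Vector's NumPy backend expects it...
structured = np.array(
    [(1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0)],
    dtype=[("pt", "f4"), ("eta", "f4"), ("phi", "f4"), ("M", "f4")],
)

# ...and the unstructured (plain 2D) layout that ML feature arrays use
unstructured = structured_to_unstructured(structured)  # shape (2, 4), dtype float32
```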
Generally, feature vectors in an ML model have a few indexes corresponding to vector coordinates and many others that don't. If the first 4 features are $p_T$, $\eta$, $\phi$, and mass, we might want to denote that with

`pt_index=0, phi_index=2, eta_index=1, mass_index=3`

in such a way that they can be picked out of a tensor named `features`. It would be nice if the `features` vector were a subclass of `torch.Tensor` that produces the above as properties, so that if someone asks for `pz`,
it would compute $p_z$ using the appropriate compute function. With `torch` as the `lib` argument of the `vector._compute` functions, they would all be autodiffed and could be used in an optimization procedure with backpropagation. The library functions that `vector._compute` needs (see `vector/tests/test_compute_features.py`, lines 357 to 380 at 7cd311d) are all defined in the `torch` module, so they probably don't even need a shim (which SymPy needed).
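As a quick check of that claim, here is a hand-written stand-in for one compute function (not the actual `vector._compute.spatial.z` code), using the same calling convention of passing the array library in as the first argument:

```python
import torch

# stand-in with the vector._compute calling convention: "lib" comes first
def z_rhophi_eta(lib, rho, phi, eta):
    return rho * lib.sinh(eta)          # p_z = p_T * sinh(eta)

pt = torch.tensor([10.0, 20.0], requires_grad=True)
phi = torch.tensor([0.1, 2.5])
eta = torch.tensor([0.5, -1.2], requires_grad=True)

pz = z_rhophi_eta(torch, pt, phi, eta)  # lib=torch, so every op is a torch op
pz.sum().backward()                     # backpropagation just works
print(eta.grad)                         # = pt * cosh(eta)
```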
Below is the start of an implementation, using https://pytorch.org/docs/stable/notes/extending.html#extending-torch-python-api as a guide. PyTorch defines a `__torch_function__` method (see this investigation), making it possible to overload without even creating real subclasses of `torch.Tensor`, but I think it's a good idea to make subclasses of `torch.Tensor` because these are mostly-normal feature vectors: they just have a few extra properties and methods.

But then I got to the point where I'd have to wrap all of the functions and remembered that that's where all of the complexity is. Some functions (possibly methods or properties) take 1 input vector and return a non-vector, others return a vector, and some take 2 input vectors with both kinds of output (I don't think there are any functions that take more than 2). There are also functions that don't do anything to the vector properties at all, like a PyTorch function to move data to and from the GPU or change its dtype. (Possible simplification: maybe all vector components can be forced to be float32?)
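Setting the wrapping problem aside, the subclass-with-extra-properties part on its own might look something like the following. This is a minimal sketch with hypothetical names (`FeatureTensor`, `from_tensor`, the `*_index` attributes), and the $p_z$ formula is written out by hand where a real backend would dispatch to `vector._compute` with `lib=torch`:

```python
import torch

class FeatureTensor(torch.Tensor):
    """Sketch of a torch.Tensor whose last axis carries vector components
    at known positions. Everything here is illustrative, not Vector API."""

    @classmethod
    def from_tensor(cls, data, *, pt_index=0, eta_index=1, phi_index=2, mass_index=3):
        obj = torch.as_tensor(data).as_subclass(cls)
        obj.pt_index, obj.eta_index = pt_index, eta_index
        obj.phi_index, obj.mass_index = phi_index, mass_index
        return obj

    def _component(self, index):
        # return a plain torch.Tensor so downstream ops don't expect indexes
        return self.as_subclass(torch.Tensor)[..., index]

    @property
    def pt(self):
        return self._component(self.pt_index)

    @property
    def eta(self):
        return self._component(self.eta_index)

    @property
    def pz(self):
        # a real backend would call the appropriate compute function with lib=torch
        return self.pt * torch.sinh(self.eta)

features = FeatureTensor.from_tensor(torch.randn(128, 10))
print(features.pz.shape)   # torch.Size([128])
```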
Some of the functions will have to shuffle the indexes to make them line up. Say, for instance, that you have `featuresA` with `x_index=0, y_index=1` and `featuresB` with `x_index=4, y_index=2`. When you add `featuresA + featuresB`, you'll need to pass the components, picked out by their respective indexes, into the `vector._compute.planar.add.dispatch` function.
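For example (hypothetical shapes, and with the planar add written out by hand instead of calling the real dispatch):

```python
import torch

featuresA = torch.randn(128, 8)   # x at column 0, y at column 1
featuresB = torch.randn(128, 8)   # x at column 4, y at column 2

# gather the components into a common order before computing
xA, yA = featuresA[..., 0], featuresA[..., 1]
xB, yB = featuresB[..., 4], featuresB[..., 2]

# stand-in for what vector._compute.planar.add would do with lib=torch
x_sum, y_sum = xA + xB, yA + yB
```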
So that's where I left the implementation: a sketch of the idea of interpreting the `axis=-1` dimension of feature arrays as vector components and passing `torch` as the compute functions' `lib`. Considering that each of the different types of functions has to be handled differently before calling the compute functions, this is not as easy as I thought (a one-day project), but it's still not a huge project. I'd also like to find out if there's a "market" for this backend: I had assumed that spatial and momentum vector calculations would be useful as (the first) part of an ML model, but I wonder if anyone has any known use-cases.

Also, I have to say that the ML "vector" and "tensor" terminology is incredibly confusing in this context. When we say that a feature-set has 2D, 3D, or 4D spatial or momentum vector components, we have to be careful not to call that feature-set a "feature vector," since that's a different thing.