
reflectorch.data_generation.reflectivity.abeles.abeles doesn't allow you to change the SLD of the ambient medium #13

Open
andyfaff opened this issue Aug 1, 2024 · 5 comments

Comments

@andyfaff

andyfaff commented Aug 1, 2024

I've been experimenting with the reflectorch implementation of the Abeles calculation, using reflectorch.data_generation.reflectivity.abeles.abeles.

After some trial and error (#12 does not outline what the shapes of the input tensors should be), I eventually figured out how to use the abeles function from the examples given in the unit tests.

In general, for an N-layer system I would expect (give or take extra dimensions for batching):

  • an array of shape (N,) specifying the thicknesses
  • an array of shape (N+1,) specifying the roughnesses
  • an array of shape (N+2,) specifying the SLDs.

When I look at the examples, the array expected for sld has shape (N+1,). This must mean that there is no way of specifying the ambient SLD.
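For concreteness, the shape convention described above for a hypothetical 2-layer system could look like this (the numerical values are placeholders, not taken from reflectorch):

```python
import numpy as np

# Hypothetical 2-layer system (N = 2); batching dimensions omitted.
N = 2
thickness = np.array([100.0, 50.0])       # (N,)   layer thicknesses, Å
roughness = np.array([3.0, 3.0, 3.0])     # (N+1,) interfacial roughnesses, Å
sld = np.array([0.0, 3.47, 2.07, 6.36])   # (N+2,) ambient, layers, substrate (1e-6 Å⁻²)

assert thickness.shape == (N,)
assert roughness.shape == (N + 1,)
assert sld.shape == (N + 2,)
```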

Whilst this is fine for most air-solid measurements (where SLD_{ambient} == 0), it means the code is not applicable to most solid-liquid and liquid-liquid systems, where the SLD of the ambient/fronting medium is non-zero.

This is straightforward to fix; see the refnx implementation for a guide. All it involves is subtracting the ambient SLD from all the other SLDs (i.e. from SLD[1:], where SLD_{ambient} = SLD[0]).
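A minimal sketch of that trick (the function name is hypothetical, not reflectorch or refnx API): subtracting the ambient SLD from every medium reduces the problem to an equivalent one with a zero-SLD fronting medium, which a kernel that only accepts (N+1,) SLDs can then handle.

```python
import numpy as np

def shift_ambient(sld):
    """Reduce an (N+2,) SLD profile [ambient, layer_1, ..., layer_N, substrate]
    to the (N+1,) profile of the equivalent system with zero ambient SLD."""
    return sld[1:] - sld[0]

# e.g. a D2O / polymer / Si solid-liquid system (placeholder values):
shift_ambient(np.array([6.36, 3.47, 2.07]))  # array([-2.89, -4.29])
```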

@StarostinV
Member

Thank you for your comment, Andrew! Indeed, it is very straightforward. I have updated the abeles function in the dev branch (442229e), and Valentin will soon push it to main with the corresponding changes to the docstrings, etc.

@andyfaff
Author

andyfaff commented Aug 6, 2024

I was experimenting with the torch implementation. I found that quite large batch sizes are needed before the GPU implementation becomes faster than the CPU calculation in refnx. What kind of batch sizes do you use during training?

@StarostinV
Member

I typically use batch sizes ranging from 4096 to 16384. For tasks like importance sampling or MCMC with PyTorch, it can be increased even further. In general, the size is a power of 2:

$$N = 2^n, n \in [10, 16]$$
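In concrete numbers, that range of powers of two is:

```python
# Batch sizes N = 2**n for n in [10, 16]:
batch_sizes = [2 ** n for n in range(10, 17)]
print(batch_sizes)  # [1024, 2048, 4096, 8192, 16384, 32768, 65536]
```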

Of course, the degree of acceleration depends on the GPU. In my usual settings, with two layers on top of a substrate and 128 q points, the GPU-accelerated code produces around 1 million curves per second on an NVIDIA RTX 2080 Ti.

@andyfaff
Author

andyfaff commented Aug 7, 2024

That's pretty cool. I think the fastest batch setting refnx can offer at the moment is about a third of that speed (i.e. ~3 µs per curve calculation). That would be on CPU in double precision.

@StarostinV
Member

That's pretty good; I don't remember being able to achieve that speed in refnx. Is that with the standard API? For instance, the acceleration I get with MCMC is orders of magnitude over refnx, and the main cost there is the reflectivity calculation.

Of course, apart from the simulation time, there are additional benefits to using the PyTorch implementation, such as autograd (e.g. fast score-function calculation) and integration with the ML pipeline (sending data to the GPU takes quite some time).
