Experimental/corbett/jit #225

Open
wants to merge 24 commits into
base: develop

corbett5
Collaborator

No description provided.

@corbett5 corbett5 marked this pull request as draft February 23, 2021 01:30
@corbett5
Collaborator Author

corbett5 commented Apr 7, 2021

This would be awesome but I think it would require some changes to RAJA
https://docs.nvidia.com/cuda/nvrtc/index.html

For the future.

@wrtobin wrtobin marked this pull request as ready for review September 2, 2021 16:32
Contributor

@klevzoff klevzoff left a comment

Pretty neat! Only thing is I feel like it could be a completely standalone library (it's leaning on some LvArray facilities but not too heavily).

@corbett5
Collaborator Author

So there seem to be a few things that are missing. At least the first issue needs to be solved before we make this a requirement for GEOSX.

Pre-compiling

We need to be able to pre-compile everything that would otherwise be JIT-compiled (the physics kernels), both to facilitate testing and to support systems with jank file systems. This should be done through the JITTI framework so we don't end up with two different compilation paths.
Obviously this will require us to list out all the possible combinations at build time in CMake, but this shouldn't be too hard with some CMake for-loops.
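The enumeration itself is a Cartesian product over the allowed choices for each template parameter. A minimal sketch of that idea follows, written in C++ rather than CMake purely for illustration. The helper `enumerateInstantiations` and the names `kernel`, `A`, `X` are hypothetical and not part of JITTI or GEOSX.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a JITTI API): build every "name< a, b, ... >"
// string from one list of allowed choices per template parameter.
std::vector< std::string > enumerateInstantiations(
  std::string const & templateName,
  std::vector< std::vector< std::string > > const & options )
{
  // Start with a single empty argument list and extend it one parameter at a time.
  std::vector< std::string > partial{ "" };
  for( auto const & choices : options )
  {
    assert( !choices.empty() ); // every parameter needs at least one choice
    std::vector< std::string > next;
    for( auto const & prefix : partial )
    {
      for( auto const & choice : choices )
      {
        next.push_back( prefix.empty() ? choice : prefix + ", " + choice );
      }
    }
    partial = std::move( next );
  }

  std::vector< std::string > result;
  for( auto const & args : partial )
  {
    result.push_back( templateName + "< " + args + " >" );
  }
  return result;
}
```

Called as `enumerateInstantiations( "kernel", { { "A", "B" }, { "X", "Y", "Z" } } )`, this yields six strings, one per combination, each of which could be handed to the build as a separate target.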

My thinking on this is that the JITTI library would provide an executable that is given a function and arguments to jit, along with whatever other compilation info it needs. It would then check whether the function already exists and, if not, build it. It would take some work to get the overlap with the main JITTI library right, since the executable doesn't have compile-time knowledge of the types involved (unlike jitti::Cache, which knows the function signature).

Then back in GEOSX CMake land we would just call this executable with the appropriate arguments for each instantiation, which we can do with the same command we're using in JITTI to launch the Python script that scrapes the compilation info. If we do this correctly, the overhead from the executable should be minimal and things will build in parallel.
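The "check if it exists, otherwise build it" step of such an executable could look roughly like the sketch below. Everything here is an assumption made for illustration: `libraryPathFor`, `ensureCompiled`, and `stubBuild` are hypothetical names, the cache layout is invented, and the real compiler invocation is replaced by a stub that writes a placeholder file.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <functional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical: map an instantiation string to a library path inside cacheDir.
// std::hash is not guaranteed to be stable across toolchains, so a real tool
// would likely want a portable hash or an escaped file name instead.
fs::path libraryPathFor( fs::path const & cacheDir, std::string const & instantiation )
{
  auto const h = std::hash< std::string >{}( instantiation );
  return cacheDir / ( "lib" + std::to_string( h ) + ".so" );
}

// Stand-in for the real compiler invocation: just writes a placeholder file.
void stubBuild( fs::path const & output )
{
  std::ofstream{ output } << "placeholder";
}

// Returns true if `build` was invoked, false if an existing library was reused.
bool ensureCompiled( fs::path const & cacheDir,
                     std::string const & instantiation,
                     std::function< void( fs::path const & ) > const & build )
{
  fs::path const lib = libraryPathFor( cacheDir, instantiation );
  if( fs::exists( lib ) )
  {
    return false; // already built, nothing to do
  }

  fs::create_directories( cacheDir );
  build( lib );
  assert( fs::exists( lib ) ); // the build step is expected to produce the file
  return true;
}
```

Because the check is keyed only on a string, a tool like this would not need compile-time knowledge of the types involved, which is the constraint mentioned above. It does not address two processes racing to build the same library.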

Simultaneous compilation

There are two problems here: simultaneous compilation from multiple ranks within a single MPI program, and simultaneous compilation from different programs. I think at least the first should be solved in the JITTI library, with some optional collective that creates a list of unique instantiations. It would involve at least 2 calls to MPI_Reduce, but I don't think the MPI overhead is going to be an issue at all. If we're smart we could even do it in such a way that ranks that already have their function jitted don't wait while the others sort out who does what. At the moment this isn't a problem in GEOSX because we assume every rank is doing the same thing and just let rank 0 do the work. I don't think we should rely on this.
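To make the "list of unique instantiations" concrete, here is a serial sketch of the bookkeeping such a collective would need to produce. It contains no MPI calls: the per-rank request lists are passed in directly, whereas in a real program they would have to be exchanged between ranks first. The function `assignCompileOwners` and the rule "lowest requesting rank compiles" are assumptions chosen for illustration, not something the PR specifies.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// requests[ r ] holds the instantiations that rank r needs but has not jitted yet.
// Returns a map from each unique instantiation to the rank that should compile it
// (here: the lowest rank that asked for it).
std::map< std::string, int > assignCompileOwners(
  std::vector< std::vector< std::string > > const & requests )
{
  std::map< std::string, int > owner;
  for( int rank = 0; rank < static_cast< int >( requests.size() ); ++rank )
  {
    for( auto const & key : requests[ rank ] )
    {
      assert( !key.empty() ); // an instantiation must have a name
      // emplace does nothing if the key is already present,
      // so the first (lowest) requesting rank stays the owner.
      owner.emplace( key, rank );
    }
  }
  return owner;
}
```

A rank with an empty request list owns nothing under this scheme, which matches the idea that ranks with everything already jitted should not have to wait on the others.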

Then there's the issue with multiple programs trying to write to the same library. I think the only way to do this would be with some sort of file system lock, but I don't know much about those. Eventually it might not be a bad idea to have a database of jitted functions and their library paths, instead of just organizing them in directories. Then each program interacts with the database to decide what it needs to jit and we let the database handle the concurrency.
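One possible form of a file system lock is an advisory lock taken with POSIX `flock` on a dedicated lock file next to the library. The sketch below is a minimal RAII wrapper under that assumption. The class name `FileLock` is hypothetical, the code is POSIX-only, it does not retry on `EINTR`, and whether advisory locks behave reliably on a given network or parallel file system would need to be verified separately.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical RAII wrapper around an advisory flock() on a lock file.
class FileLock
{
public:
  // Opens lockPath (creating it if needed) and tries to take an exclusive lock.
  // With block == false the call returns immediately; held() reports the outcome.
  explicit FileLock( std::string const & lockPath, bool block = true )
    : m_fd( -1 )
  {
    assert( !lockPath.empty() );
    m_fd = ::open( lockPath.c_str(), O_CREAT | O_RDWR, 0644 );
    if( m_fd >= 0 )
    {
      int const op = block ? LOCK_EX : ( LOCK_EX | LOCK_NB );
      m_held = ( ::flock( m_fd, op ) == 0 );
    }
  }

  ~FileLock()
  {
    if( m_fd >= 0 )
    {
      if( m_held ) { ::flock( m_fd, LOCK_UN ); }
      ::close( m_fd );
    }
  }

  FileLock( FileLock const & ) = delete;
  FileLock & operator=( FileLock const & ) = delete;

  bool held() const { return m_held; }

private:
  int m_fd;
  bool m_held = false;
};
```

A program would construct a `FileLock` before writing the library and let it go out of scope afterwards. A second program attempting the same would either block or, with `block == false`, see `held()` return false and re-check whether the library now exists.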

@@ -92,6 +96,8 @@ if(ENABLE_CALIPER)

message(STATUS "Using caliper from ${CALIPER_DIR}")

set(FIND_LIBRARY_USE_LIB64_PATHS TRUE)
Collaborator Author

Why do we need this? Also why do we need to blt_register_library for Umpire? I prefer not to do this unless it's necessary (usually only for non-CMake targets).

Comment on lines +35 to +36
#define JITTI_DECL( VARNAME, TYPE, HEADER ) \
constexpr const char VARNAME##Name[] = STRINGIZE( TYPE ); \
Collaborator Author

I'm sure this is used in GEOSX, but what exactly is it used for?

@corbett5
Collaborator Author

> Pretty neat! Only thing is I feel like it could be a completely standalone library (it's leaning on some LvArray facilities but not too heavily).

I agree. I originally only put it here because of the build system and the easy access to the TPLs I wanted to test with. I was thinking that it should be a new component of GEOSX. I'd love for this to be useful enough to pull out into its own library/sub-module, but I'm not sure that's worth doing until it's met the test (fully functioning in GEOSX). Anyways, especially if we add some of the MPI functionality it shouldn't be in LvArray.

3 participants