
AlltoAll for large problem #1095

Open · mloubout opened this issue Jan 25, 2020 · 10 comments

@mloubout (Contributor)

The MPI AlltoAll calls make Devito crash for large problems.

@FabioLuporini (Contributor)

Where does it crash (what Python line), and can you share a reproducer?

@FabioLuporini (Contributor) commented Jan 25, 2020

An error trace would be nice too.

@mloubout (Contributor, Author)

  File "/usr/local/lib/python3.6/dist-packages/devito/operator/operator.py", line 520, in arguments
    args = self._prepare_arguments(**kwargs)
  File "/usr/local/lib/python3.6/dist-packages/devito/operator/operator.py", line 419, in _prepare_arguments
    args.update(p._arg_values(**kwargs))
  File "/usr/local/lib/python3.6/dist-packages/devito/types/sparse.py", line 287, in _arg_values
    values = new._arg_defaults(alias=self).reduce_all()
  File "/usr/local/lib/python3.6/dist-packages/devito/tools/memoization.py", line 91, in __call__
    res = cache[key] = self.func(*args, **kw)
  File "/usr/local/lib/python3.6/dist-packages/devito/types/sparse.py", line 267, in _arg_defaults
    for k, v in self._dist_scatter().items():
  File "/usr/local/lib/python3.6/dist-packages/devito/types/sparse.py", line 821, in _dist_scatter
    [scattered, rcount, rdisp, mpitype])
  File "mpi4py/MPI/Comm.pyx", line 676, in mpi4py.MPI.Comm.Alltoallv
  File "mpi4py/MPI/msgbuffer.pxi", line 592, in mpi4py.MPI._p_msg_cco.for_alltoall
  File "mpi4py/MPI/msgbuffer.pxi", line 456, in mpi4py.MPI._p_msg_cco.for_cco_recv
  File "mpi4py/MPI/msgbuffer.pxi", line 300, in mpi4py.MPI.message_vector
  File "mpi4py/MPI/asarray.pxi", line 22, in mpi4py.MPI.chkarray
  File "mpi4py/MPI/asarray.pxi", line 15, in mpi4py.MPI.getarray
OverflowError: value too large to convert to int
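
For context: mpi4py converts the counts and displacements passed to Alltoallv into C ints, so any entry at or above 2**31 raises this OverflowError while the arguments are parsed, before any data is sent. A minimal sketch of the failure mode (not Devito-specific; the buffer is just a placeholder):

    # Minimal sketch: mpi4py converts Alltoallv counts/displacements to
    # C ints, so an entry >= 2**31 overflows during argument parsing.
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    too_big = 2**31                       # one past the 32-bit int range
    counts = [too_big] * comm.size
    displs = [0] * comm.size
    buf = np.zeros(1, dtype=np.float32)   # placeholder; never actually read
    # Raises "OverflowError: value too large to convert to int"
    comm.Alltoallv([buf, counts, displs, MPI.FLOAT],
                   [buf, counts, displs, MPI.FLOAT])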

@FabioLuporini (Contributor)

Command line to reproduce? Can you write an MFE? This seems to be due to the data distribution of SparseFunctions.

@mloubout (Contributor, Author)

> Command line to reproduce? Can you write an MFE?

Not really; just add a massive number of receivers to any example and at some point it will crash like that. All the examples are set up with a tiny number of receivers, so it wouldn't pop up.

@FabioLuporini (Contributor)

> Not really; just add a massive number of receivers to any example

So we should be able to write a 5-6 line MFE. I'll try to reproduce it.

@FabioLuporini (Contributor)

Can we close this, @mloubout?

@mloubout (Contributor, Author) commented Feb 6, 2020

No. The PRs improved the set-up time for larger numbers of receivers (there are still issues with full-size 3D that I am trying to track down), but this error is not related: it is due to the message size, so it will still happen. I am trying to find a fix for that too.

@ggorman (Contributor) commented Feb 7, 2020

@mloubout - you should not expect to see an integer OverflowError unless you are running on the order of a couple of billion DOFs. How large is your problem? If it really is that big, then we have to ensure our indexing supports int64.
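
For reference, MPI counts and displacements are 32-bit C ints regardless of how the Python side indexes, so int64 indexing alone won't lift the Alltoallv limit. One common mitigation (a hypothetical sketch, not what Devito currently does) is to express the exchange in larger units via a contiguous derived datatype, which shrinks the counts by a fixed block factor:

    # Hypothetical sketch: send BLOCK-element units instead of single
    # floats so the counts passed to Alltoallv fit in a 32-bit C int.
    # Assumes every per-rank element count is a multiple of BLOCK.
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    BLOCK = 1024
    blocktype = MPI.FLOAT.Create_contiguous(BLOCK)
    blocktype.Commit()

    # Small counts here for illustration; the trick matters once the
    # per-rank element counts approach 2**31.
    elem_counts = np.full(comm.size, 4 * BLOCK, dtype=np.int64)
    elem_displs = np.concatenate(([0], np.cumsum(elem_counts)[:-1]))

    sendbuf = np.zeros(int(elem_counts.sum()), dtype=np.float32)
    recvbuf = np.empty_like(sendbuf)

    counts = (elem_counts // BLOCK).astype('i')   # now safely < 2**31
    displs = (elem_displs // BLOCK).astype('i')
    comm.Alltoallv([sendbuf, counts, displs, blocktype],
                   [recvbuf, counts, displs, blocktype])
    blocktype.Free()

Splitting the exchange into several smaller Alltoallv calls is the other standard route.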

@mloubout (Contributor, Author) commented Feb 7, 2020

> unless you are running on the order of a couple of billion DOFs

You don't need to go that big to be way over that. With 3D receivers in an OBN setup with reciprocity:

  • ~2M receiver positions
  • ~10-20k time steps

and you have a couple of tens of billions of values.
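
A quick back-of-envelope check with those numbers (assuming 4-byte floats, which I believe is Devito's default dtype):

    # Back-of-envelope with the numbers above (4-byte floats assumed)
    nrec, nt = 2_000_000, 20_000
    nvalues = nrec * nt                 # 40 billion samples
    print(nvalues > 2**31 - 1)          # True: far beyond a 32-bit C int
    print(nvalues * 4 / 1e9, "GB")      # ~160 GB of receiver data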
