Panzer: MPI decomposition error with multi-block #13717

Open
lebuller opened this issue Jan 9, 2025 · 4 comments
Assignees: rppawlo
Labels: pkg: Panzer, type: bug (The primary issue is a bug in Trilinos code or tests)

Comments


lebuller commented Jan 9, 2025

Bug Report

@rppawlo

Description

When setting up a simulation with a Panzer-based tool using two separate element blocks with separate initial conditions, an error occurred in the initialization along the boundary between the blocks. One rank held mainly elements from the first element block, plus a single-element-thick layer of elements from the second block, and those elements were set to an incorrect initial condition. While reproducing this error on progressively simpler meshes, the behavior described in the reproduction steps below was found. In short, when the MPI decomposition falls along, or within one element of, an internal boundary between two element blocks, the initial conditions (and the application of boundary conditions) are not applied correctly. The issue has been reproduced with both RCB and k-way decomposition.

Steps to Reproduce

  1. Run the following Cubit journal script to generate a minimal mesh that reproduces the error:

#{ctype = "tri3"}
#{scheme = "triadvance"}

create surface rectangle width 2 height 1 zplane
split surface 1 across location position -0.05 -2 0 location position -0.05 2 0
merge surface 3 2

set duplicate block elements off
block 1 add surface 3
block 1 name "left"
block 1 element type {ctype}

set duplicate block elements off
block 2 add surface 2
block 2 name "right"
block 2 element type {ctype}

sideset 1 add curve 2 wrt surface 3
sideset 1 name "xmin"
sideset 2 add curve 4 wrt surface 2
sideset 2 name "xmax"
sideset 3 add curve 9 wrt surface 3
sideset 3 add curve 6 wrt surface 2
sideset 3 name "ymin"
sideset 4 add curve 8 wrt surface 3
sideset 4 add curve 7 wrt surface 2
sideset 4 name "ymax"
sideset 5 add curve 5 wrt surface all
sideset 5 name "center"

surface all interval 2
surface all scheme {scheme}
mesh surface all

#{tolower(ctype)}

set exodus netcdf4 on

export mesh "{ctype}-combined-centers.exo" dimension 2 overwrite

  2. Run a Panzer case with a single MPI rank, initializing the "left" and "right" element blocks to different initial conditions, and print the initial state to Exodus.
  3. Rerun the case with 2 and 3 MPI ranks and RCB decomposition, and print the output for these cases.
  4. The results for all three cases should be noticeably different on the internal boundary: the serial case sets the boundary to the "right" block value, the 2-rank case sets the boundary to the average of the two blocks, and the 3-rank case has differing values along the boundary.
  5. The different behaviors appear to depend on whether the internal sideset is along the MPI boundary, near the MPI boundary (within one element length), or away from the MPI boundary.
@lebuller lebuller added the type: bug The primary issue is a bug in Trilinos code or tests label Jan 9, 2025
@rppawlo rppawlo self-assigned this Jan 9, 2025
rppawlo (Contributor) commented Jan 9, 2025

If a variable has the same name in both element blocks, then since the DOF Manager uses CGFEM, the DOFs on the internal interface are the same DOF for both blocks. In that case you can't have different values on the interface nodes/faces/edges for each element block: the blocks share the same DOF in the Tpetra solution vector. There is definitely an issue, though - the entire interface should get one value or the other, not a mix. I will try to reproduce next week and get a fix in.
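
A toy sketch of the mechanism (not Panzer code; the node indices and values are invented for illustration): both blocks scatter their constant IC into the same shared nodal DOF, so the interface value depends on scatter order or on the combine mode used across ranks.

// Illustration only: why per-block constant ICs clash on shared interface DOFs.
// Toy continuous-Galerkin nodal vector with 5 nodes; node 2 sits on the block
// interface and is touched by both blocks.
#include <array>
#include <iostream>

int main() {
  std::array<double, 5> u{};                // global solution vector, one DOF per node
  const std::array<int, 3> left{0, 1, 2};   // nodes touched by block "left"
  const std::array<int, 3> right{2, 3, 4};  // nodes touched by block "right"; node 2 shared

  for (int n : left)  u[n] = 1.0;           // constant IC for block "left"
  for (int n : right) u[n] = 2.0;           // block "right" overwrites shared node 2

  // Node 2 holds whichever block scattered last (here 2.0). With an averaging
  // combine mode across ranks it can instead end up as 1.5, which matches the
  // mix of behaviors reported above.
  for (double v : u) std::cout << v << ' ';
  std::cout << '\n';                        // prints: 1 1 2 2 2
}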

lebuller (Author) commented

Hey Roger @rppawlo, I wanted to ping you to see whether you have been able to look into reproducing this issue since we talked about it a couple of weeks ago?

rppawlo (Contributor) commented Feb 10, 2025

Ah yes - I did. I meant to set up a Teams meeting, but was traveling last week. Sorry. I was able to reproduce the issue. The current communication is based on ownership of the topological entities of the linear system (Tpetra), so it would not be easy to fix. I'd have to change quite a bit of code for the constant-block ICs to communicate/stage correctly, and it would also require more MPI communication. I think the easiest thing to do is to not use the constant block IC/BC, but instead supply an evaluator that gives the values as a function of the coordinates. I believe this is what the other codes built on Panzer do when they use multiblock. I will test that path out soon to make sure there are no issues on the mesh you supplied.
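
For the mesh above, a coordinate-based IC might look like the following sketch (not Panzer's evaluator interface; the values 1.0 and 2.0 are placeholders, and the split location comes from the journal script):

// Sketch: pick the IC from coordinates instead of block membership.
// Every rank evaluates the same function, so interface nodes get a
// well-defined value regardless of the MPI decomposition.
double initialCondition(double x, double y) {
  (void)y;                          // unused; the blocks split only in x
  const double xSplit = -0.05;      // block interface from the journal script
  return (x < xSplit) ? 1.0 : 2.0;  // placeholder "left" / "right" values
}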

lebuller (Author) commented

Can we set up a Teams meeting to discuss this? It sounds like setting up the evaluator that way instead of using a constant block value should fix the inconsistent initial conditions on the boundary, but I don't think it will fix the issue of boundary conditions being applied to the wrong node when the block boundary and the MPI boundary fall on the same cell.
