Potential parallel bug with solute transport in v1.5.1? #285
Comments
One of the sources of this bug is subcycling. With simple weak coupling there are no overshoots (at least on 1 core). I'm looking into this issue. |
I looked a little more into this issue and have not been able to link it clearly to anything. However, I do see that the initial velocities show the same striped pattern from some sort of parallel issue. This pattern disappears right away from the Darcy velocities but then appears little by little in the concentrations. Could either of you, @levuvietphong or @dasvyat, comment on whether there could be an issue that connects the initial velocities with the concentration hotspots showing up in the early transport? This issue is not a big deal for transport, as these differences are small and are soon overwhelmed by larger concentration differences (like when a front arrives), but when coupled to geochemistry it leads to problems. For example, it may change mineral or sorbed concentrations, which then stay changed for the rest of the simulation. |
@smolins, I looked into this issue and, to my surprise, it is not a transport issue, in my opinion. The discrepancies are coming directly from flow. I don't know the reason at the moment. I've increased the nonlinear tolerance in the hope that it would help, but it didn't. I'm looking into this issue. |
Is it in velocities or in fluxes? I would be very surprised if it was in fluxes, but less surprised if it was in velocities/reconstruction, since we rarely look at those. I know that most of transport uses fluxes, but are you using a velocity-dependent dispersion? |
These simulations do not include dispersion (i.e., dispersion is not in the input file) and diffusion uses default values (0.0, I assume). |
Ok, then I'm not sure if transport uses velocity anywhere else. I'm not sure of the magnitudes here -- is it possible that this is due to block preconditioners? Does this go away if you use e.g. Boomer AMG? |
I withdraw my previous comment. My comparison was wrong |
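Danil's PR (#290) has fixed the stripe pattern in the velocity.2 field at t=0. However, I ran the hillslope_calcite_crunch_sigmoid.xml example again and I got the problem of no convergence at a specific cell (see log below). Do you have the same issue?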
|
Did you try to run the transport-only version of the problem?
https://github.com/user-attachments/files/18143452/hillslope_transport_sigmoid_100s.txt
…On Wed, Jan 22, 2025 at 2:30 PM Phong Le ***@***.***> wrote:
Danil's PR (#290) has fixed the
stripe pattern in the velocity.2 field at t=0. However, I ran the
hillslope_calcite_crunch_sigmoid.xml example again and I got the problem
of no convergence at a specific cell (see log below). Do you have the same
issue?
Screenshot.2025-01-22.at.5.21.07.PM.png (view on web)
<https://github.com/user-attachments/assets/eb84c7b0-557b-4951-bb15-2d6ede70c900>
Alquimia_PK:domain | no convergence in cell: 1427
reactive transport | Alquimia_PK:domain failed.
surface transport | ----------------------------------------------------------------
surface transport | Advancing: t0 = 252751 t1 = 252841 h = 89.5266
surface transport | ----------------------------------------------------------------
surface transport | 1 sub-cycles, dt_stable=1.49211 min [sec] dt_MPC=1.49211 min [sec]
subsurface transpo | ----------------------------------------------------------------
subsurface transpo | Advancing: t0 = 252751 t1 = 252841 h = 89.5266
subsurface transpo | ----------------------------------------------------------------
inverse::PCG | Converged (relative RHS), itr=1 ||r||=5.6449e-19 ||f||=90.9331
inverse::PCG | Converged (relative RHS), itr=1 ||r||=2.09579e-14 ||f||=27949.6
inverse::PCG | Converged (relative RHS), itr=1 ||r||=1.16336e-14 ||f||=829934
inverse::PCG | Converged (relative RHS), itr=1 ||r||=9.95036e-15 ||f||=1.05614e+07
inverse::PCG | Converged (relative RHS), itr=1 ||r||=9.80417e-22 ||f||=1.05614
subsurface transpo | dispersion solver ||r||=8.50849e-15 itrs=1
subsurface transpo | 1 sub-cycles, dt_stable=7.29735 min [sec] dt_MPC=1.49211 min [sec]
Alquimia_PK:surfac | min/avg/max Newton: 0/0/1, the maximum is in cell 98
No convergence at: 1 1 1
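For readers of the log above, here is a minimal sketch of how a sub-cycle count relates to the two time steps it reports. The function `count_subcycles` and its ceiling rule are assumptions made for illustration, not code taken from ATS or Amanzi. The sketch also checks that the step `h = 89.5266` s matches the reported `dt_MPC=1.49211` min.

```python
import math


def count_subcycles(dt_mpc: float, dt_stable: float) -> int:
    """Hypothetical rule: number of equal sub-steps so that each is <= dt_stable.

    dt_mpc is the step requested by the coupler, dt_stable is the transport
    stability limit. Both must use the same unit.
    """
    if dt_mpc <= 0.0 or dt_stable <= 0.0:
        raise ValueError("time steps must be positive")
    return max(1, math.ceil(dt_mpc / dt_stable))


# Values copied from the log, in minutes.
surface = count_subcycles(dt_mpc=1.49211, dt_stable=1.49211)
subsurface = count_subcycles(dt_mpc=1.49211, dt_stable=7.29735)

# The step size h from the log, converted from seconds to minutes.
h_minutes = 89.5266 / 60.0

print(surface, subsurface, round(h_minutes, 5))
```

Under this assumed rule both the surface and subsurface transport steps need a single sub-cycle, which agrees with the `1 sub-cycles` lines in the log.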
|
You would need to modify any of these files so that the time step size is
capped at 70 seconds, not 100 seconds.
|
I ran the hillslope_transport_sigmoid_100s.txt example again:
|
To investigate the parallel issue, I suggest switching to weak coupling instead of subcycling to simplify the model. Moreover, something happened between 1.5.1 and the current master. When I added parallel communication to 1.5.1 there were no stripes. After I merged it with master, they appeared. @levuvietphong, can you confirm that using amanzi-1.5.1 and dsv/test_ats-1.5.1 you don't see stripes either? |
@dasvyat: No, using amanzi-1.5.1 and dsv/test_ats-1.5.1 I still see the stripes. This run used 4 cores. |
What are you plotting? And what is the range? |
@dasvyat: I updated the plot. It is the |
@levuvietphong What about the tracer concentration? Darcy velocity is not the best indicator. It is a post-processed quantity that is computed from the face unknowns for visualization. After how many days do you plot it? |
Currently, the Darcy velocity is computed at the initialization stage without a parallel update. That's why stripes are observed at t=0. It can be (and should be) corrected, but it is not a big deal, since this parallel update is performed during the actual AdvanceStep and the concentration (and all other fields) is updated correctly. The pattern should not be observed in the concentration. |
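To illustrate why a missing parallel update produces artifacts only next to partition boundaries, here is a toy 1D sketch. It is not ATS code: the function names, the centered-difference "velocity", and the stale ghost value of 0.0 are all assumptions chosen for illustration. Each partition computes a gradient that needs one ghost cell on each side; when the ghosts are not refreshed, only the cells adjacent to an interface differ from the serial result.

```python
def gradient(values):
    """Centered difference on the owned cells of a list padded with one ghost cell per side."""
    return [(values[i + 1] - values[i - 1]) / 2.0 for i in range(1, len(values) - 1)]


def run(n_cells=8, n_parts=2, update_ghosts=True):
    """Compute the gradient partition by partition (hypothetical toy setup)."""
    # Global field with one physical boundary value at each end.
    field = [float(i * i) for i in range(n_cells + 2)]
    size = n_cells // n_parts
    result = []
    for part in range(n_parts):
        start = 1 + part * size  # global index of the first owned cell
        owned = field[start:start + size]
        left_ghost = field[start - 1]
        right_ghost = field[start + size]
        if not update_ghosts:
            # Ghosts at interior interfaces keep a stale initial value (0.0 here).
            if part > 0:
                left_ghost = 0.0
            if part < n_parts - 1:
                right_ghost = 0.0
        result.extend(gradient([left_ghost] + owned + [right_ghost]))
    return result


serial = run(n_parts=1)
updated = run(n_parts=2, update_ghosts=True)
stale = run(n_parts=2, update_ghosts=False)

# Cells where the stale-ghost run disagrees with the serial run.
bad_cells = [i for i, (a, b) in enumerate(zip(serial, stale)) if a != b]
print(bad_cells)
```

With two partitions of four cells each, the mismatches fall on the two cells that touch the interface, which is the 1D analogue of a stripe along a partition boundary.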
I identified two issues with solute transport in ATS v1.5.1. These issues have been affecting reactive transport, but I was able to narrow them down to potential issues with transport. I cannot reproduce these issues in a 1D test simulation, and it is still unclear to me why they appear only in higher-dimensional simulations. It is also unclear whether they are related.
The attached file is a transport-only version of the demo under ats-demos/13_integrated_hydro_reactive_transport/hillslope_calcite_crunch_sigmoid.xml, which is described in Molins et al. (2022, WRR). Here it is modified to include only 1 tracer, with initial concentration = 1 in the domain and = 0 in the rain water (hillslope_transport_sigmoid_100s.xml).
hillslope_transport_sigmoid_100s.txt
The 2 issues are
There is another issue with concentrations that appears at time = 1 day near the left boundary. This issue is buried by issue 2 at time = 3 days. The position of the cell with an off concentration is suspiciously close to the position of the "hot" cells in the parallel runs.
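One way to check whether the off cells coincide between runs is to compare the concentration fields cell by cell. The sketch below is a hypothetical helper, not part of ATS. It assumes both runs have already been loaded as plain lists of cell concentrations in the same cell ordering, which is usually not the case for raw parallel output without a cell-ID mapping.

```python
def hot_cells(serial, parallel, rel_tol=1e-6, abs_tol=1e-12):
    """Return (cell index, absolute difference) for cells where two runs disagree.

    Both inputs are assumed to use the same cell ordering (an assumption;
    parallel output may need to be reordered first).
    """
    if len(serial) != len(parallel):
        raise ValueError("fields must have the same number of cells")
    flagged = []
    for cell, (s, p) in enumerate(zip(serial, parallel)):
        diff = abs(s - p)
        if diff > abs_tol + rel_tol * max(abs(s), abs(p)):
            flagged.append((cell, diff))
    return flagged


# Made-up values for illustration only: cell 1 overshoots in the "parallel" run.
example = hot_cells([1.0, 1.0, 1.0], [1.0, 1.001, 1.0])
print([cell for cell, _ in example])
```

The flagged indices from a serial-versus-parallel comparison can then be set against the location of the off cell near the left boundary.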