Skip to content

Commit

Permalink
Add more compatible examples and update docs (#35)
Browse files Browse the repository at this point in the history
* Start

* Add new docs

* Add typo config file

* Try typos ignore

* Try typos ignore

* Try typos ignore

* Add euler ec

* Add shallow water ec

* Add shallow water source terms

* Add shallow water dirichlet

* Add euler shock capturing
  • Loading branch information
huiyuxie authored Sep 9, 2024
1 parent ba3bf7f commit c9185f8
Show file tree
Hide file tree
Showing 15 changed files with 1,057 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/SpellCheck.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,3 +11,5 @@ jobs:
uses: actions/checkout@v4
- name: Check spelling
uses: crate-ci/[email protected]
with:
config: typos.toml # specify the custom config file
262 changes: 262 additions & 0 deletions docs/dev_env_info.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,262 @@
# Development Environment Information

## Recent Update

The hardware has been updated for this project recently. The new NVIDIA GeForce Series has been launched to continue implementation and testing of this project. Here is the detailed environment information for this project now.

The following shows the specific CPU information.

```Bash
huiyu@huiyuxps15:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 20
On-line CPU(s) list: 0-19
Vendor ID: GenuineIntel
Model name: 13th Gen Intel(R) Core(TM) i9-13900H
CPU family: 6
Model: 186
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 1
Stepping: 2
BogoMIPS: 5990.39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse ss
e2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop
_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_time
r aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ssbd ibrs ibpb stibp ibrs_enha
nced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdsee
d adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves avx_vnni umip waitpkg gfni vae
s vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
Virtualization features:
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
Caches (sum of all):
L1d: 480 KiB (10 instances)
L1i: 320 KiB (10 instances)
L2: 12.5 MiB (10 instances)
L3: 24 MiB (1 instance)
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Mitigation; Enhanced IBRS
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Srbds: Not affected
Tsx async abort: Not affected
```

The following shows the specific GPU information.

```Bash
huiyu@huiyuxps15:~$ nvidia-smi
Thu May 16 17:26:16 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.07 Driver Version: 546.12 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4060 ... On | 00000000:01:00.0 Off | N/A |
| N/A 43C P3 11W / 45W | 0MiB / 8188MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
```

## Legacy (AWS)

This project were implemented and tested on AWS EC2 instances. So far, types of EC2 instances used in this project include: `p3.2xlarge` (single-GPU) and `g4dn.12xlarge` (multi-GPU). For more information about the EC2 instances, please refer to [AWS EC2 Instance Types](https://aws.amazon.com/ec2/instance-types/).

The following information is about the environment of the EC2 instances used in this project.

### AWS `p3.2xlarge`
The following shows the specific CPU information of the `p3.2xlarge` instance used in this project.

```Bash
ubuntu@ip-172-31-7-163:~/trixi_cuda$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
CPU family: 6
Model: 79
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4600.04
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscal
l nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni
pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdra
nd hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erm
s invpcid rtm rdseed adx xsaveopt
Virtualization features:
Hypervisor vendor: Xen
Virtualization type: full
Caches (sum of all):
L1d: 128 KiB (4 instances)
L1i: 128 KiB (4 instances)
L2: 1 MiB (4 instances)
L3: 45 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerabilities:
Itlb multihit: KVM: Mitigation: VMX unsupported
L1tf: Mitigation; PTE Inversion
Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Meltdown: Mitigation; PTI
Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Retbleed: Not affected
Spec store bypass: Vulnerable
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
```

The following shows the specific GPU information of the `p3.2xlarge` instance used in this project.

```Bash
ubuntu@ip-172-31-7-163:~/trixi_cuda$ nvidia-smi
Sat Aug 26 00:38:06 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla V100-SXM2-16GB On | 00000000:00:1E.0 Off | 0 |
| N/A 47C P0 25W / 300W| 0MiB / 16384MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
```

### AWS `g4dn.12xlarge`
The following shows the specific CPU information of the `g4dn.12xlarge` instance used in this project.

```Bash
ubuntu@ip-172-31-4-230:~/trixi_cuda$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
CPU family: 6
Model: 85
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 1
Stepping: 7
BogoMIPS: 4999.99
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clf
lush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch
_perfmon rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_fre
q pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popc
nt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dno
wprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms i
nvpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512
bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 768 KiB (24 instances)
L1i: 768 KiB (24 instances)
L2: 24 MiB (24 instances)
L3: 35.8 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-47
Vulnerabilities:
Gather data sampling: Unknown: Dependent on hypervisor status
Itlb multihit: KVM: Mitigation: VMX unsupported
L1tf: Mitigation; PTE Inversion
Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unkno
wn
Meltdown: Mitigation; PTI
Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unkno
wn
Retbleed: Vulnerable
Spec rstack overflow: Not affected
Spec store bypass: Vulnerable
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affect
ed
Srbds: Not affected
Tsx async abort: Not affected
```

The following shows the specific GPU information of the `g4dn.12xlarge` instance used in this project.

```Bash
ubuntu@ip-172-31-4-230:~/trixi_cuda$ nvidia-smi
Sat Dec 30 20:19:54 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla T4 On | 00000000:00:1B.0 Off | 0 |
| N/A 20C P8 8W / 70W | 2MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 Tesla T4 On | 00000000:00:1C.0 Off | 0 |
| N/A 20C P8 10W / 70W | 2MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 2 Tesla T4 On | 00000000:00:1D.0 Off | 0 |
| N/A 21C P8 9W / 70W | 2MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 3 Tesla T4 On | 00000000:00:1E.0 Off | 0 |
| N/A 20C P8 9W / 70W | 2MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
```
65 changes: 65 additions & 0 deletions examples/euler_shockcapturing_1d.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
using Trixi, TrixiGPU
using OrdinaryDiffEq

# The example is taken from the Trixi.jl

###############################################################################
# semidiscretization of the compressible Euler equations

equations = CompressibleEulerEquations1D(1.4)

initial_condition = initial_condition_weak_blast_wave

surface_flux = flux_lax_friedrichs
volume_flux = flux_shima_etal
basis = LobattoLegendreBasis(3)
indicator_sc = IndicatorHennemannGassner(equations, basis,
alpha_max = 0.5,
alpha_min = 0.001,
alpha_smooth = true,
variable = density_pressure)
volume_integral = VolumeIntegralShockCapturingHG(indicator_sc;
volume_flux_dg = volume_flux,
volume_flux_fv = surface_flux)
solver = DGSEM(basis, surface_flux, volume_integral)

coordinates_min = -2.0
coordinates_max = 2.0
mesh = TreeMesh(coordinates_min, coordinates_max,
initial_refinement_level = 5,
n_cells_max = 10_000)

semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition, solver)

###############################################################################
# ODE solvers, callbacks etc.

tspan = (0.0, 1.0)
ode = semidiscretize_gpu(semi, tspan) # from TrixiGPU.jl

summary_callback = SummaryCallback()

analysis_interval = 100
analysis_callback = AnalysisCallback(semi, interval = analysis_interval)

alive_callback = AliveCallback(analysis_interval = analysis_interval)

save_solution = SaveSolutionCallback(interval = 100,
save_initial_solution = true,
save_final_solution = true,
solution_variables = cons2prim)

stepsize_callback = StepsizeCallback(cfl = 0.8)

callbacks = CallbackSet(summary_callback,
analysis_callback, alive_callback,
save_solution,
stepsize_callback)

###############################################################################
# run the simulation

sol = solve(ode, CarpenterKennedy2N54(williamson_condition = false),
dt = 1.0, # solve needs some value here but it will be overwritten by the stepsize_callback
save_everystep = false, callback = callbacks);
summary_callback() # print the timer summary
Loading

0 comments on commit c9185f8

Please sign in to comment.