CS-184 Final Project: Fluid and Smoke Animation on GPU
1. Abstract
In this final project, we implemented a smoke simulator on the GPU and, leveraging the parallel nature of GPUs, designed and implemented two different parallel solvers. Our carefully optimized MGPCG solver achieved excellent convergence results. Additionally, we also implemented wavefront path tracing, adapted to the GPU architecture, and extended it to volume rendering. Although we attempted to port the free surface solver to the CPU, we were unable to complete it due to the framework's lack of support for atomic operations. Therefore, we focused on optimizing the existing CPU solver. Ultimately, we created a scene containing both smoke simulation and free surface fluid simulation, rendered it into a video using our GPU renderer.
2. Technical approach
2.0 Preliminary: Volumetric Path Tracing
Path tracing is a physically based rendering technique that simulates the propagation of light through a scene to produce realistic images. In the course before, we have learned about path tracing, where light rays may collide with objects in the scene and be reflected, refracted, or absorbed. The tracking process ends when it intersects with a light source, when it does not intersect with the scene, or when it reaches Russian Roulette (RR).
Later we will review our class knowledge and introduce some new important concepts.
2.0.1 Path Tracing(Lessons learned)
In our class, we learn the preliminary ray tracing model shown as below figure. We can summarize it as: we start with primary rays, which are rays that go from our eye point through a pixel on the image plane and intersect some geometry in our scene. We then trace secondary rays until we hit a non-specular surface, or reach the maximum desired recursion level. At each hit point, we trace shadow rays to our light source(s) to test light visibility. The final pixel color is the weighted sum of the contributions along the rays. This allows us to have more sophisticated effects (e.g. specular reflection, refraction, and shadows) — but there's much more we can do to derive a physically-based illumination model.
2.0.2 Volumetric Path Tracing
Volumetric Path Tracing extends the basic ray tracing model to account for participating media such as fog, smoke, or clouds within the scene. In addition to tracing rays through geometric surfaces, Volumetric Path Tracing also simulates the interaction of light with these participating media.
The core concept in it is the Radiative Transfer Equation:
In the radiative transfer equation, indicates how the radiance, which is the light intensity emitted or reflected by a point in a given direction, changes at point in the direction as time progresses.
The coefficients and represent the medium's propensity to absorb and scatter light, respectively. The radiance at the point is denoted by, and is any radiance that might be emitted by the point itself. The phase function gives the likelihood of light scattering from one direction to another. The integral accumulates the scattering contributions from all possible directions across the sphere, summing to determine the total effect on radiance at the point due to scattering.
This equation plays a crucial role in simulating how light travels through and interacts with various media in computer graphics, providing a more realistic depiction of scenes with participating media such as fog or smoke.
2.1 Wavefront Path Tracing on GPU
In our project, we use wavefront path tracing, which is an advanced technology for path tracing on GPU, designed to overcome the performance bottleneck of traditional megakernel methods. In traditional methods, the divergence in processing complex materials and control flow often leads to underutilization of GPU resources and low execution efficiency. The Wavefront approach significantly improves rendering efficiency by segmenting processing paths and optimizing GPU memory usage.
The core of the Wavefront Path Tracing method is to divide the path tracing task into three stages: logic, material, and ray casting. Each stage consists of one or multiple individual kernels. Figure 2 illustrates the design.
Below we introduce these three stages in detail(Click the arrow to see)
- Logic Stage
The Logic Stage in path tracing governs the fundamental operations required for ray progression, maintaining fidelity to physical and optical principles. It orchestrates the calculation of multiple importance sampling weights, updates path flux, accumulates direct light source contributions, evaluates path termination conditions, manages terminated path contributions to the final image, generates new light source samples, identifies material properties at collision points, and initiates requests for material evaluation. This comprehensive approach ensures systematic adherence to rendering standards and facilitates the accurate portrayal of scenes.
- Material stage
This part is mainly about the interaction with different materials.
In Wavefront Path Tracing, paths undergo classification into ended paths, which initiate new camera rays for subsequent cycles, and ongoing paths, which undergo material property analysis at the impact point. To enhance rendering efficiency, different materials are allocated to specific processing cores according to their complexity, minimizing control flow divergence. Based on material evaluation outcomes, new rays, such as extended or shadow rays, may be generated and forwarded for further processing in the rendering pipeline. In case of ray generation issues, like incorrect directionality, termination of the path occurs, ensuring accurate rendering outcomes and optimal resource utilization.
- Ray cast stage
The ray casting stage is the final step in Wavefront Path Tracing. It calculates intersections between rays and scene objects, gathering information like position and material type. This data is then processed to determine physical properties. Results are cached for future use in advancing the path and evaluating materials.
- Logic Stage
In addition to these three stages, some memory tricks are used to speed up tracking.
The Wavefront approach in path tracing uses a memory-based system rather than local registers for storing path status data, which is mitigated by adopting a Structure of Arrays (SOA) layout to reduce memory access frequency. Additionally, the use of pre-allocated queue systems in Wavefront Path Tracing enhances rendering optimization. These queues store path indices in global memory buffers and utilize atomic operations for data accuracy and synchronization. Efficiency is further improved by merging atomic operations within a warp through warp-wide voting operations, which optimizes memory access and reduces performance issues caused by individual atomic operations.
2.1.1 Our connection with GPU
In our rendering framework, the phase function, analogous to the BRDF for surfaces, accounts for the scattering events that alter light directions and is evaluated during the material stage alongside other BRDFs. To unbiasedly estimate the transmittance through non-uniform media, we employ a method known as null scattering[5], which randomly decides whether the light will 'continue forward' or 'scatter and change direction'. As scenes may contain multiple media types, we iterate until a ray necessitates volume scattering or hits a surface, both scenarios being part of the material stage unless it strikes a light source or finds no intersection. The third stage of our process involves advancing the ray until scattering occurs, seamlessly integrating volume rendering within the Wave Path Tracing (WPT) framework.
While a perfect integration of GPU and Volume Path Tracing (VPT) remains elusive, our approach minimizes memory access and divergence during this phase. We optimize using CUDA's shared memory to cache frequently used data and employ Morton coding to sort rays for spatial coherence within warps, significantly improving performance in our WPT setup.
As for “the problems we encountered and how we tackled them”, to facilitate understanding, we have described different technical parts separately and not unified them together.
2.2 Smoke simulation
Smoke simulations are usually based on the Navier-Stokes equations describing fluid motion.
The Navier-Stokes equations are fundamental in fluid mechanics, describing the motion of fluids such as gases and liquids. These equations consist of two parts: the momentum equation and the continuity equation.
- Momentum Equation
The momentum equation describes changes in the fluid velocity, incorporating effects of fluid inertia, internal viscosity, external forces applied, and pressure gradients. For an incompressible, viscous fluid, the momentum equation typically takes the form:
Where: is the fluid velocity field, is time, is the fluid density (constant for incompressible fluids), is the fluid pressure, is the kinematic viscosity coefficient, and represents external forces per unit mass (such as gravity).
- Continuity Equation
The continuity equation ensures the conservation of fluid mass, which for incompressible fluids means that the density of the fluid does not change during its motion. This is expressed as the divergence of the velocity field being zero:
In smoke simulation, to maintain the incompressibility of the fluid, the pressure Poisson equation needs to be solved. This equation is derived from combining the continuity equation and momentum equation of the Navier-Stokes equations to find the pressure field that results in a divergence-free velocity field.
After discretizing the time component of the momentum equation and substituting it into the continuity equation, we obtain the general form of the pressure Poisson equation:
This equation indicates that the Laplacian of the pressure equals the divergence of the nonlinear terms, viscous terms, and external forces of the velocity field. By solving this equation, the pressure field is updated, thereby adjusting the velocity field to meet the incompressibility condition .
Through solving these equations, the dynamics of smoke and other fluids can be realistically reproduced in computer graphics, making the visual effects both beautiful and scientifically accurate.
2.3 MGPCG Algorithm(Used for smoke)
Above we have a basic introduction to smoke. A major challenge in high-performance smoke simulation is how to efficiently solve the system of fluid dynamics equations, especially when simulating large-scale dynamic scenarios. Although the traditional direct solution method is simple, it is time-consuming and resource-consuming when processing large-scale data.
MGPCG is an efficient solver for Poisson equations over irregular voxelized domains, accommodating Dirichlet and Neumann boundary conditions. It employs a multi-grid loop as a preconditioner within the conjugate gradient method, utilizing a geometric multi-grid scheme to enhance convergence speed and robustness. Designed for parallel execution, MGPCG is implemented on shared memory platforms, optimizing for low bandwidth and memory usage.
2.3.1 Method Overview of MGPCG
- Multigrid Preconditioning:
Multigrid preconditioning is a technique based on solving problems at different scales to accelerate the solution process. This method starts with the finest grid and progressively builds coarser grid levels, each with different resolution accuracy.
- Refinement and Coarsening: The process of grid refinement and coarsening is carried out by restricting the solution from finer grids to coarser ones and interpolating the solution from coarse grids back to fine grids. This method is implemented through the V-Cycle algorithm described in Algorithm 1, which includes smoothing operations, restriction, and interpolation steps.
- Smoothers: Simple iterative methods such as weighted Jacobi or Gauss-Seidel are used at each grid level to reduce errors. These smoothing operations help to eliminate high-frequency error components, thus improving the quality of the numerical solution.
- Restriction and Interpolation: The restriction operation transfers residuals from fine to coarse grids, and interpolation methods transfer the corrected solutions back to fine grids. These operations are performed using specific operators (e.g., full-weighting restriction operator and bilinear interpolation operator), which transfer information between grid levels.
- Multigrid as a Preconditioner for Conjugate Gradient Method:
Combining multigrid preconditioning with the Conjugate Gradient method can significantly accelerate the convergence rate of solving linear systems, especially when dealing with large-scale problems.
Specifically, we adopted and implemented the algorithm in the paper, as shown below:
2.3.2 Our implement
Since the equation systems solved by the bottom solver are quite small, using multiple kernel calls for iterative solving is not efficient and results in significant waste. We have designed a specialized solving kernel that stores all the small equation systems on the coarsest grid into shared memory. This allows repeated iterations within the kernel function itself, greatly reducing communication with the host machine.
2.4 Fluid simulation
We have introduced the Navier-Stokes equation in smoke simulation. Fluid simulation is also based on this equation. The difference is that in liquid simulation, specific methods are employed to handle the unique challenges of modeling incompressible and highly interactive fluids. Unlike smoke simulation, which primarily deals with the dispersion and diffusion within a more compressible medium, liquid simulation requires careful handling of free surface flows, viscosity control, and the interaction dynamics between particles.
Initially, we employed the PIC (Particle-In-Cell[9]) method coupled with a naive spatial hashing approach to locate neighboring particles for reconstructing the Signed Distance Function (SDF[10]). This approach faced two major issues: excessive numerical viscosity and poor memory access locality during SDF reconstruction, which became a performance bottleneck.
To mitigate the issue of numerical viscosity, we later adopted the FLIP (Fluid Implicit Particle[11]) algorithm. While FLIP inherently suffers from high noise levels, we combined it with PIC interpolation to strike a balance, reducing numerical viscosity and achieving smoother fluid animations.
Moreover, we enhanced our spatial hashing by implementing Morton coding[13] for mapping, which effectively utilized cache and improved spatial locality. This allowed us to simulate with a higher number of particles, significantly improving the quality of the fluid simulation.
Additionally, we began calculating the CFL (Courant-Friedrichs-Lewy[12]) number for the velocity field to adaptively adjust the time step, thereby enhancing accuracy. Although this adjustment impacted performance, it notably improved the quality of the fluid simulation.
3. Results
3.1 Smoke simulation
For the simulation of smoke, we used two methods. The first method is damped jacobi iteration. It takes 5000 iterations to reach the convergence state. In the table below, we record the changes in indicators(contain residual) with the number of iterations.
iter | residual |
4996 | 0.091385 |
4997 | 0.091331 |
4998 | 0.091308 |
The second method is parallel MGPCG iteration, which can reach convergence in about twenty iterations. The specific residual indicators are as shown in the table below.
iter | residual |
23 | 0.051727 |
24 | 0.036713 |
25 | 0.026437 |
Below we render a schematic diagram except for the smoke simulation.
3.2 Fluid simulation
Based on the method we introduced before, we simulated the flow of liquid, and the preliminary result diagram obtained is as follows:
Some problems can be seen from the effect, such as water droplets appearing and disappearing, and the fluid surface is very unstable (noisy)
We made some adjustments to slightly increase the radius of the particles, making it easier for the particle details to be captured by the mesh. At the same time, each frame is divided into multiple sub-steps, and the size of the sub-steps is adaptively adjusted by calculating the CFL number of the flow field, making the solution more accurate and the fluid more stable. The effect obtained is as follows
3.3 Smoke plus Fluid with cute bunny
We integrated the bunny in the class and the fluid and smoke in this project to get the following rendering video.
4. Reference
- Laine, Samuli, Tero Karras, and Timo Aila. "Megakernels considered harmful: Wavefront path tracing on GPUs." Proceedings of the 5th High-Performance Graphics Conference. 2013.
- "Ray Tracing and Acceleration Structures." cs184.eecs.berkeley.edu, https://cs184.eecs.berkeley.edu/sp24/lecture/9/ray-tracing-and-acceleration-str.
- McAdams, Aleka, Eftychios Sifakis, and Joseph Teran. "A Parallel Multigrid Poisson Solver for Fluids Simulation on Large Grids." Symposium on Computer Animation. Vol. 65. 2010.
- Clv, I. (2024, April 30). VPT on GPU. Lucidic Computing. https://clv-iclucia.github.io/2024/04/24/VPT-on-GPU/
- Miller, Bailey, Iliyan Georgiev, and Wojciech Jarosz. "A null-scattering path integral formulation of light transport." ACM Transactions on Graphics (TOG) 38.4 (2019): 1-13.
- Bridson, R. (n.d.). Fluid Simulation for Computer Graphics. Retrieved from https://www.cs.ubc.ca/~rbridson/fluidsimulation/fluids_notes.pdf
- ICLUCIA. (2022, September 13). CSE 272 HW. Retrieved from https://clv-iclucia.github.io/2022/09/13/CSE-272-Hw/
- Pharr, M., Jakob, W., & Humphreys, G. (2018). Physically Based Rendering: From Theory to Implementation (3rd ed.). Retrieved from https://www.pbr-book.org/3ed-2018/contents
- Tskhakaya, D. (2008). The particle-in-cell method. In Computational many-particle physics (pp. 161-189). Berlin, Heidelberg: Springer Berlin Heidelberg.
- Bridson, Robert. Fluid simulation for computer graphics. AK Peters/CRC Press, 2015.
- Brackbill, Jeremiah U., Douglas B. Kothe, and Hans M. Ruppel. "FLIP: a low-dissipation, particle-in-cell method for fluid flow." Computer Physics Communications 48.1 (1988): 25-38.
- Courant, Richard, Kurt Friedrichs, and Hans Lewy. "Über die partiellen Differenzengleichungen der mathematischen Physik." Mathematische annalen 100.1 (1928): 32-74.
- Morton, Guy M. "A computer oriented geodetic data base and a new technique in file sequencing." (1966).
5. Contributions from each team member
Xuanye Chen led this project and was responsible for wavefront path tracing on GPU and subsequent fluid and smoke simulations.
Zhenzhe Li was responsible for the final deliverable materials of the report.
Ruijie Jian built the project on Windows, deployed tests for some GPU utils, helped with building the renderer, and made the final project video.
Huasen Xi mainly focuses on the MGPCG algorithm and correct some bugs.
They participated in every discussion together and contributed to every deliverable.
+
+