Hybrid Monte Carlo algorithm for Two Color QCD with Wilson-Gor'kov fermions, based on the algorithm of Duane et al., Phys. Lett. B195 (1987) 216.
There is "up/down partitioning": each update requires
one operation of congradq on complex vectors to determine
$$
\left(M^\dagger M\right)^{-1}\Phi
$$
where
Matrix multiplies are done using the routines `hdslash` and `hdslashd`.
Hence, the number of lattice flavors Nf is related to the
number of continuum flavors N_f by
Fermion expectation values are measured using a noisy estimator on the Wilson-Gor'kov matrix, which has dimension 8 * kvol * nc * Nf. Inversions are done using `congradp`, and matrix multiplies with `dslash` and `dslashd`.
The trajectory length is random with mean dt * stepl. The code runs for a fixed number ntraj of trajectories.
Variable | Description |
---|---|
Phi | pseudofermion field |
bmass | bare fermion mass |
fmu | chemical potential |
actiona | running average of total action |
The code produces the following outputs:
File Name | Data type |
---|---|
config.bβββkκκκmuμμμμjJJJsNXtNT.XXXXXX | Lattice configuration for given parameters. Last digits are the configuration number |
Output.bβββkκκκmuμμμμjJJJsNXtNT | Number of conjugate gradient steps for each trajectory. Also contains general simulation details upon completion |
bose.bβββkκκκmuμμμμjJJJsNXtNT | spatial plaquette, temporal plaquette, Polyakov line |
fermi.bβββkκκκmuμμμμjJJJsNXtNT | psibarpsi, energy density, baryon density |
diq.bβββkκκκmuμμμμjJJJsNXtNT | real |
SJH March 2005
Hybrid code, P.Giudice, May 2013
Converted from Fortran to C by D. Lawlor March 2021
This two colour implementation was originally written in FORTRAN for: S. Hands, S. Kim and J.-I. Skullerud, Deconfinement in dense 2-color QCD, Eur. Phys. J. C48, 193 (2006), hep-lat/0604004.
It has since been rewritten in C and is in the process of being adapted for CUDA. We have successfully run on 7000+ Zen 2 cores, as well as A100 GPUs.
Some adaptations from the original are:
- Mixed precision conjugate gradient
- Implementation of BLAS routines for vector operations
- Removal of excess halo exchanges
- `#pragma omp simd` instructions
- Makefiles for Intel, GCC and AMD compilers with flags set for latest machines
- GSL ranlux support
- CUDA implementation.
Other works in progress include:
- Improved action
- SYCL implementation.
- Multi-GPU support
- CMake build system
- yaml input file
- Set lattice volume and CPU grid at runtime
- Higher order integrators. An 11 stage 4th order non-gradient integrator is implemented, but gives no speedup yet
This code is written for MPI on Linux, so there are a few steps needed to get it up and running:
1. In `sizes.h`, set the lattice size. By default we assume the spatial components to be equal.
2. Also in `sizes.h`, set the processor grid size by setting the values of `npx npy npz npt`. These MUST be divisors of `nx ny nz nt` set in step one.
3. Compile the code using the desired Makefile. Please note that the paths given in the Makefiles for BLAS libraries etc. are based on my own system. You may need to adjust these manually.
4. Run the code. This may differ from system to system, especially if a task scheduler like SLURM is being used. On my desktop it can be run locally using the command `mpirun -n <nproc> ./su2hmc <input_file>`, where `nproc` is the number of processors, given by the product of `npx npy npz npt`.
   - If no input file is given, the programme defaults to midout. The default name is a historical one which goes back generations to the early days of Lattice QCD.
A sample input file looks like
0.00200 1.7 0.1780 0.00 0.000 0.0 0.0 500 20 1 1 100
dt beta akappa jqq thetaq fmu aNf stepl ntraj istart icheck iread
where
- `dt` is the step size for the update
- `beta` is β, given up to three significant figures
- `akappa` is the hopping parameter, given up to four significant figures
- `jqq` is the diquark source, given up to three significant figures
- `thetaq` is the diquark mixing angle
- `fmu` is the chemical potential
- `aNf` is ignored. It is a remnant of an idea, originating in the Cornell group when Ken Wilson was still there, that molecular dynamics time-discretisation artifacts can be absorbed into a renormalisation of the bare parameters of the lattice action
- `stepl` is the average number of steps per trajectory. Multiplied by `dt`, it should equal 1 for a single trajectory
- `ntraj` is the number of trajectories
- `istart` signals a hot start (>=1) or cold start (<=0)
- `icheck` is how often to print out a configuration. We typically use 5 and tune for an 80% acceptance rate
- `iread` is the starting configuration for continuation runs. If zero, start without reading
The bottom line of the input is ignored by the programme and is just there to make your life easier. Blank space does not matter, so long as there is some gap between the input parameters in the file and they are all on a single line.