Commit
Finally, a working implementation of particle near surface in FFT mode.
- additional required memory was described in comments in calculator.c
- definition of ZsumShift is now different for sparse and FFT modes in parallel (relative to the local or global bottom dipoles respectively). Related difference is in the definition of local_Nz_Rm (used for Sommerfeld table and, in FFT mode, for filling Rmatrix).
- Sommerfeld table is no longer calculated during prognosis, but the corresponding memory is properly accounted for.
- Current implementation requires additional forward fftY and fftZ at each iteration (but see issue 177).
- BlockTranspose_Dm in comm.c was renamed to BlockTranspose_DRm, since it works for both D- and Rmatrix.
- existing functions to index Dmatrix have been slightly changed to not use DsizeYZ and to use y>=DsizeY instead of y>smallY.
- functions and plans not specific to Dmatrix were renamed (Y and Z forward transforms); they now have 'slice' in their names.
- transposeYZ_Dm was changed to transpose (a general function), and transposeYZ is now just a wrapper around the latter.
- version incremented to 1.3b1.

Main computational bottleneck is computation of the table of Sommerfeld integrals. Currently it is approximately equivalent to 200 iterations (issue 176), so it should not be a problem for applications with a larger number of iterations.

New implementation was extensively tested against the previous sparse version. Still, quantitative comparison with published literature data is required.

Remaining limitation is that it doesn't work in OpenCL mode (see issue 101 for details; explicit exception was added to param.c).

Changes to timing:
- added time for 'init interaction' (significant when Sommerfeld table is calculated).
- Time for 'init Dmatrix' may now include the time for initialization of Rmatrix (doesn't include time for Sommerfeld table).
- Precise timing for Dmatrix now includes a separate line for initialization of Rmatrix (without details).

Other:
- scattering at exactly 90 degrees (along the surface) for non-trivial surfaces is now handled by a special case to produce exact 0. This makes the output consistent, avoiding large integration errors for -phi_integr or alldir.
- added explicit exception to forbid combination of -no_reduced_fft and -iter cgnr in MPI FFT mode (issue 174).

Tests in tests/2exec/ have been significantly improved:
- added SURF_EXT flag, which allows running a large suite of tests, adding '-surf ...' option to each of them.
- added a number of ignores, which are always active (decreases false positives).
- added a couple of macros in suite files (NOMPI and NOMPISEQ), which indicate that the line should be skipped for specific comparison modes.
- added a number of tests to the default suite: '-surf ...', '-int_surf', '-scat_plane', '-shape read ... -grid ...', '-no_reduced_fft -iter cgnr' (see above).
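
The change of transposeYZ_Dm into a general transpose, with transposeYZ kept as a wrapper, can be pictured with the minimal sketch below. It is not the code from this commit: the names transpose_sketch and transposeYZ_sketch, their signatures, the elem_t type (plain double instead of complex values) and the row-major out-of-place layout are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical element type for illustration only;
 * the real code operates on complex data. */
typedef double elem_t;

/* General out-of-place transpose (assumed signature).
 * 'in' holds a rows x cols matrix in row-major order,
 * 'out' receives the cols x rows transpose. */
void transpose_sketch(const elem_t *in, elem_t *out, size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            out[c * rows + r] = in[r * cols + c];
}

/* Thin wrapper that only fixes the meaning of the two dimensions
 * (Y and Z); all work is delegated to the general routine. */
void transposeYZ_sketch(const elem_t *in, elem_t *out, size_t sizeY, size_t sizeZ)
{
    transpose_sketch(in, out, sizeY, sizeZ);
}
```

With this split, a routine that works on a different matrix (for example Rmatrix rather than Dmatrix) can call the general function with its own dimensions, while existing callers of the wrapper stay unchanged.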