Castro
Introduction
Castro is an astrophysical radiation hydrodynamics simulation code based on the AMReX framework.
We study Castro version 5e0a1b9cbc259f4dd17f5453ba59808b4da5c3ab, and profile Castro's Exec/hydro_tests/Sedov example using its inputs.2d.cyl_in_cartcoords input.
To compile Castro, we set the following variables in GNUmakefile:
USE_CUDA=TRUE
CUDA_ARCH=TRUE
DIM=2
USE_MPI=FALSE
Profiling
For a small-scale run, we set max_step=20 in inputs.2d.cyl_in_cartcoords.
To generate the data flow graph for Castro, along with redundancy metrics, we can use the gvprof script directly.
For other fine-grained metrics, we can use gvprof if GPU control flow graphs are not required. Otherwise, we recommend using HPCToolkit to profile step by step.
Optimization
data_flow - redundant values
AMReX_Interp_2D_C.H: 344. When Castro invokes cellconslin_slopes_mmlim, an internal function provided by AMReX, it performs slope(i, j, n) *= a for each output.
With the inputs.2d.cyl_in_cartcoords input, a is 1.0 for most outputs.
We can therefore save one load and one store for each such output by performing slope(i, j, n) *= a only when a is not 1.0.
Though this optimization does not achieve a significant speedup, it is worth mentioning because it may also benefit other applications that use AMReX.
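The conditional update described above can be sketched as follows. This is a minimal illustration, not the AMReX code itself: the function name scale_slopes is hypothetical, and flat std::vector buffers stand in for the slope(i, j, n) array and the per-output factor a. The function returns how many elements were actually written, so the saved memory traffic is easy to see.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-output slope update (not AMReX's code).
// slope and a are assumed to be flat buffers of equal length.
// Returns the number of elements that were actually read, scaled, and stored.
std::size_t scale_slopes(std::vector<double>& slope,
                         const std::vector<double>& a)
{
    assert(slope.size() == a.size());
    std::size_t writes = 0;
    for (std::size_t i = 0; i < slope.size(); ++i) {
        // Multiplying by exactly 1.0 leaves the value unchanged, so the
        // load of slope[i] and the store back to it can both be skipped.
        if (a[i] != 1.0) {
            slope[i] *= a[i];
            ++writes;
        }
    }
    return writes;
}
```

For example, with slope = {2.0, 4.0, 6.0, 8.0} and a = {1.0, 0.5, 1.0, 1.0}, only the second element is touched. On a GPU the added branch may cause some divergence within a warp, so the benefit depends on how often a equals 1.0 for a given input.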