Castro

Introduction

Castro is an astrophysical radiation hydrodynamics simulation code based on the AMReX framework.

We study Castro at commit 5e0a1b9cbc259f4dd17f5453ba59808b4da5c3ab, and profile Castro’s Exec/hydro_tests/Sedov example using its inputs.2d.cyl_in_cartcoords input.

To compile Castro, we set the following variables in GNUmakefile:

USE_CUDA=TRUE
CUDA_ARCH=TRUE
DIM=2
USE_MPI=FALSE

Profiling

For a small-scale run, we set max_step=20 in inputs.2d.cyl_in_cartcoords. To generate the data flow graph for Castro, along with redundancy metrics, we can use the gvprof script directly. For other fine-grained metrics, gvprof is also sufficient as long as GPU control flow graphs are not required. Otherwise, we recommend using hpctoolkit to perform step-by-step profiling.

Optimization

  • data_flow - redundant values

AMReX_Interp_2D_C.H: 344. When Castro invokes cellconslin_slopes_mmlim, an internal function provided by AMReX, it performs slope(i, j, n) *= a for each output. With the inputs.2d.cyl_in_cartcoords input, a is 1.0 in most cases, so the multiplication leaves slope(i, j, n) unchanged. We can therefore save one load and one store for each such output by performing slope(i, j, n) *= a only when a differs from 1.0. Though this optimization does not achieve a significant speedup, it is worth mentioning because it may also benefit other applications that use AMReX.
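
The conditional update described above can be sketched in plain host-side C++. This is only an illustration under stated assumptions, not the AMReX source: the flat slope and a arrays, the function names scale_always and scale_if_needed, and the returned store counter are all hypothetical stand-ins for the per-cell slope(i, j, n) *= a update inside the GPU kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Baseline (hypothetical stand-in for the original update): every element
// of slope is loaded, multiplied, and stored, even when the factor is 1.0.
// Returns the number of stores issued to slope, for illustration only.
std::size_t scale_always(std::vector<double>& slope,
                         const std::vector<double>& a) {
    assert(slope.size() == a.size());
    std::size_t stores = 0;
    for (std::size_t k = 0; k < slope.size(); ++k) {
        slope[k] *= a[k];
        ++stores;
    }
    return stores;
}

// Conditional variant: skip the load and store of slope[k] when the factor
// is exactly 1.0, since multiplying by 1.0 would write back the same value.
std::size_t scale_if_needed(std::vector<double>& slope,
                            const std::vector<double>& a) {
    assert(slope.size() == a.size());
    std::size_t stores = 0;
    for (std::size_t k = 0; k < slope.size(); ++k) {
        if (a[k] != 1.0) {
            slope[k] *= a[k];
            ++stores;
        }
    }
    return stores;
}
```

Both functions leave slope with the same contents; the conditional variant only touches the elements whose factor differs from 1.0. On a GPU the added branch may cause divergence within a warp, so whether the saved memory traffic pays off depends on how often a equals 1.0 for a given input.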