Advanced Stencil-Code Engineering
Future exascale computing systems with 10^7 processing units and up to 10^18 FLOPS peak performance will require a tight co-design of application-, algorithm-, and architecture-aware program development to sustain this performance for many applications of interest, mainly for two reasons. First, the node structure inside an exascale cluster will become increasingly heterogeneous, always exploiting the most recent on-chip manycore/GPU/HW-assist technology. Second, the clusters themselves will be composed of heterogeneous subsystems and interconnects. As a result, new software techniques and tools supporting joint algorithm- and architecture-aware program development will become indispensable, not only (a) to ease application and program development, but also (b) for performance analysis and tuning, (c) to ensure short turn-around times, and (d) for reasons of portability.
Project ExaStencils will investigate and provide a unique, tool-assisted, domain-specific co-design approach for the important class of stencil codes, which play a central role in high-performance simulation on structured or block-structured grids. Stencils are regular access patterns on (usually multidimensional) data grids. Multigrid methods involve a hierarchy of grids, from very fine to successively coarser. The challenge at exascale is that the coarser grids require less processing power, so communication dominates there. From the computational algorithm perspective, domain-specific investigations include the extraction and development of suitable stencils, the analysis of performance-relevant algorithmic tradeoffs (e.g., the number of grid levels), and the analysis and reduction of synchronization requirements, guided by a template model of the targeted cluster architecture. Based on this analysis, sophisticated programming and software tool support shall be developed by capturing the relevant data structures and program segments for stencil computations in a domain-specific language and applying a generator-based product-line technology to automatically generate and optimize stencil codes tailored to each application–platform pair. A central distinguishing mark of ExaStencils is that domain knowledge is pursued in a coordinated manner across all abstraction levels, from the formulation of the application scenario down to the generation of highly optimized stencil code.
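To make the terms "stencil" and "grid hierarchy" concrete, the following is a minimal sketch in plain Python. It is an illustration only, not ExaStencils output or its domain-specific language. The function names `jacobi_sweep` and `level_sizes` are hypothetical, and the example assumes a 2D Poisson problem on a square grid with fixed boundary values and coarsening by a factor of two per level.

```python
def jacobi_sweep(u, f, h):
    """One Jacobi sweep for -Laplace(u) = f with the 5-point stencil.

    u and f are square lists of lists; boundary values of u are kept fixed.
    h is the grid spacing. Returns a new grid and leaves u unchanged.
    """
    n = len(u)
    new = [row[:] for row in u]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # The stencil: each interior point reads its four direct neighbours.
            new[i][j] = 0.25 * (
                u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                + h * h * f[i][j]
            )
    return new


def level_sizes(finest_intervals, coarsest_intervals=2):
    """Grid points per dimension on each level, halving the intervals per level."""
    sizes = []
    n = finest_intervals
    while n >= coarsest_intervals:
        sizes.append(n + 1)
        n //= 2
    return sizes


if __name__ == "__main__":
    # Assumed toy hierarchy: 16 intervals per dimension on the finest level.
    for points in level_sizes(16):
        interior = (points - 2) ** 2
        print(f"{points} x {points} grid: {interior} interior updates per sweep")
```

With 16 intervals on the finest level, `level_sizes(16)` returns `[17, 9, 5, 3]`, so the number of interior updates per sweep drops from 225 to 1. This is the effect described above: on coarse levels there is very little arithmetic left per processing unit, and the cost of exchanging boundary data becomes the dominant term.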
For the resulting seamless cross-level design flow, the first of its kind, three objectives shall be demonstrated in a detailed final evaluation phase: (1) a substantial gain in productivity, (2) high flexibility in the choice of algorithm and execution platform, and (3) the provision of ExaFLOPS performance for stencil code.
- Schmitt C., Schmid M., Kuckuk S., Köstler H., Teich J., Hannig F.:
Reconfigurable Hardware Generation of Multigrid Solvers with Conjugate Gradient Coarse-Grid Solution
In: Parallel Processing Letters 28 (2018), Article No.: 1850016
- Lengauer C., Apel S., Größlinger A., Grebhahn A., Kronawitter S., Bolten M., Rittich H., Hannig F., Köstler H., Rüde U., Teich J., Kuckuk S., Schmitt C.:
ExaStencils: Advanced Stencil-Code Engineering
Euro-Par: Parallel Processing Workshops (Porto, 25-26 August 2014)
In: Proceedings of Euro-Par 2014: Parallel Processing Workshops, Berlin; Heidelberg: 2014
- Köstler H., Schmitt C., Kuckuk S., Kronawitter S., Hannig F., Teich J., Rüde U., Lengauer C.:
A Scala Prototype to Generate Multigrid Solver Implementations for Different Problems and Target Multi-Core Platforms
In: International Journal of Computational Science and Engineering 14 (2017), p. 150-163
- Grebhahn A., Siegmund N., Apel S., Kuckuk S., Schmitt C., Köstler H.:
Optimizing Performance of Stencil Code with SPL Conqueror
1st International Workshop on High-Performance Stencil Computations (HiStencils) (Vienna, 20 January 2014)
In: Proceedings of the 1st International Workshop on High-Performance Stencil Computations (HiStencils) 2014
- Membarth R., Reiche O., Schmitt C., Hannig F., Teich J., Stürmer M., Köstler H.:
Towards a Performance-portable Description of Geometric Multigrid Algorithms using a Domain-specific Language
In: Journal of Parallel and Distributed Computing 74 (2014), p. 3191-3201