EuroEXA: a co-design project for exascale computing

Graham Riley, University of Manchester, UK

A brief introduction to the EuroEXA project will be presented. EuroEXA is based on ARM processors with FPGA support for application (and communication) acceleration, with the ultimate aim of operating at exascale within a reasonable power budget. Some examples of the programming issues faced on such a system will be discussed, along with the strategies currently being investigated to support the porting of application codes from the ESM community to the pre-exascale testbeds that will emerge from EuroEXA over the next few years.

XIOS integration for OpenIFS: Computational Aspects and Performance Evaluation

Mario Acosta, BSC, ES

Weather and climate models are becoming more complex in order to resolve new features, such as eddies and clouds, properly, for example by increasing the spatial resolution of their simulations or adding new coupled components. As a consequence, more computational resources are needed to maintain a suitable execution time. Moreover, these new coupled and high-resolution simulations will produce massive outputs whose post-processing will take considerable time. This means that the management of Input/Output (I/O) and post-processing will increase in cost, complexity and size, and additional approaches are needed to make our models more scalable. Additional cost is also incurred in converting the output data of IFS/OpenIFS to standard formats such as NetCDF, a requirement for projects such as CMIP. These are some of the reasons why the XML I/O Server (XIOS), an asynchronous MPI parallel I/O server that is able to post-process data online and write it in NetCDF format, is being integrated into OpenIFS.

In this work, we present the computational aspects and the performance evaluation of this integration, showing how to integrate the two codes efficiently and which particular aspects have to be taken into account to increase computational performance. Different issues related to XIOS and the integration process will be explained, from issues that could reduce the performance of a XIOS integration to general aspects of atmospheric models relevant to the integration. The techniques implemented will also be explained, such as shared-memory techniques using OpenMP for data movement and asynchronous communications to overlap OpenIFS calculations with XIOS work. Profiling analyses will also be shown to explain the details.
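
As a rough illustration of the overlap technique, the following Python sketch uses a background thread and a queue in place of XIOS's asynchronous MPI I/O servers; all names are invented for illustration and none of this is XIOS or OpenIFS API:

```python
# Conceptual sketch of overlapping model computation with output, in the
# spirit of the OpenIFS/XIOS integration described above. A Python thread
# and a queue stand in for XIOS's asynchronous MPI I/O servers.
import queue
import threading
import time

out_queue = queue.Queue()

def io_worker():
    # Drains fields handed over by the model and "writes" them,
    # concurrently with the next model time step.
    while True:
        item = out_queue.get()
        if item is None:          # sentinel: model has finished
            break
        step, field = item
        time.sleep(0.05)          # stands in for post-processing + NetCDF write
        print(f"wrote step {step}: mean={sum(field) / len(field):.3f}")

server = threading.Thread(target=io_worker)
server.start()

field = [0.0] * 1000
for step in range(5):
    field = [x + 1.0 for x in field]    # stands in for one model time step
    out_queue.put((step, list(field)))  # non-blocking hand-off; model continues

out_queue.put(None)
server.join()
```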

Efficiency and scalability of numerical algorithms - from ESCAPE to ESCAPE-2

Peter Bauer, ECMWF, UK

Fulfilling the requirements for efficiency gains in weather and climate prediction will dominate model development for the next decade or more. Apparent speed-up factors of O(1000) can only be achieved through a significant and concerted investment in numerical methods, programming models and new, energy-efficient processor technologies at the same time. The European Commission funded projects ESCAPE and ESCAPE-2 pursue this development strategy, focusing on selected model components - called weather and climate dwarfs - that drive computing cost. The ESCAPE project compares and optimizes performance on conventional and novel processor types, employing various programming options and also taking into account alternative model formulations. ESCAPE-2 extends this approach to entire models and lays the foundation for a weather and climate domain-specific language concept that could provide a long-term solution for achieving performance and portability of codes for our community.

CLAW Compiler: Abstractions for Weather and Climate Models

Valentin Clement, MeteoSwiss, CH

Adapting complex code to take advantage of new HPC architectures is a cumbersome task. The development life cycle of weather models does not match the fast pace of new hardware releases. Therefore, restructuring the code and applying architecture-specific optimizations is often needed to get optimal performance when porting models to these new architectures. This leads to multiple versions of the same code, each optimized for a single target supercomputer and not portable between them.

In order to keep a single source code and achieve performance portability across different hardware, we propose a one-column model abstraction for physical parameterizations. It supports domain scientists who do not want to be concerned with HPC optimizations across multiple columns. We are developing a directive language and a tool named CLAW that is able to apply automatic code transformations to the abstracted code in order to produce parallelized versions for each target architecture.
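
To make the abstraction concrete, here is a minimal Python sketch of the idea (CLAW itself operates on Fortran with directives; the names and the toy physics below are invented for illustration):

```python
# Illustrative sketch of the one-column abstraction (not CLAW's actual
# directive syntax): the scientist writes physics for a single column, and
# a transformation layer supplies the horizontal loop, which a tool like
# CLAW could instead generate as OpenMP/OpenACC code for a given target.
import numpy as np

def saturation_adjustment(t_col, q_col):
    """Toy single-column parameterization: written with no knowledge of
    how many columns exist or how they are traversed."""
    qsat = 0.01 * np.exp(0.06 * (t_col - 273.15))  # toy saturation curve
    excess = np.maximum(q_col - qsat, 0.0)
    return t_col + 2.5e3 / 1004.0 * excess, q_col - excess

def over_columns(one_column_kernel, t, q):
    # Stand-in for the generated code: the horizontal (column) loop is the
    # natural place to insert parallelism for the target architecture.
    for c in range(t.shape[0]):
        t[c, :], q[c, :] = one_column_kernel(t[c, :], q[c, :])
    return t, q

t = 280.0 + np.random.rand(8, 60)   # 8 columns, 60 levels
q = 0.02 * np.random.rand(8, 60)
t, q = over_columns(saturation_adjustment, t, q)
```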

The EoCoE Centre of Excellence

Edouard Audit, CEA, FR

EoCoE, the Energy-oriented Centre of Excellence, contributes to the transition to sustainable energy sources and usage via targeted support for numerical modelling in four pillars: Meteorology, Materials, Water and Fusion.

These pillars are embedded within a transversal multidisciplinary effort providing high-end expertise in applied mathematics and High Performance Computing (HPC). This talk will present the main achievements of the project so far and the vision for its second phase, should it be accepted.

Atlas, a library for numerical weather prediction and climate modelling

Willem Deconinck, ECMWF, UK

The algorithms underlying numerical weather prediction (NWP) and climate models that have been developed in the past few decades face an increasing challenge to adapt to paradigm shifts imposed by new hardware developments. The emerging diverse and complex hardware solutions have a large impact on the programming models traditionally used in NWP software, triggering a rethink of design choices for future software frameworks. On the other hand, there is a drive to increase the model complexity to include ever more processes of the whole Earth system. Some of these processes may require computations on grids of a different type or resolution from the atmospheric grid. Multiple grid structures may be required as part of the numerical filtering strategy for atmospheric wave motions, or simply to save the computational cost of selected physical processes. These different grids may have different domain decompositions for parallel computations, and different parallelisation strategies. Moreover, the internal memory layout for a field that is optimal for one numerical algorithm may not be optimal for another. These complexities will inevitably break NWP modelling infrastructures designed 30+ years ago, when these aspects could not have been taken into consideration.

To address the above-mentioned challenges, the European Centre for Medium-Range Weather Forecasts (ECMWF) is developing Atlas, a new, modern software framework designed with these developments in mind. Atlas helps to accommodate flexibility in hardware and software choices as well as increasing model complexity. Atlas is not a new model but rather a foundation layer upon which new models can be built, and with which existing models can be complemented or redesigned.

In this talk, we demonstrate how Atlas is used to complement ECMWF's Integrated Forecasting System (IFS) model to enable a number of physical processes to be implemented on multiple grids.
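
As a loose illustration of the kind of abstraction involved (Atlas itself is a C++/Fortran library and this sketch is not its API; all names here are hypothetical), consider fields that carry their own grid and memory layout:

```python
# Conceptual sketch, not Atlas's API: fields know their grid and memory
# layout, so the same physics can run on grids of different type or
# resolution, each with its own decomposition and data arrangement.
import numpy as np

class Grid:
    def __init__(self, name, npoints):
        self.name, self.npoints = name, npoints

class Field:
    def __init__(self, name, grid, nlev, layout="points-major"):
        self.name, self.grid, self.layout = name, grid, layout
        # Layout is chosen per algorithm rather than hard-coded model-wide.
        shape = (grid.npoints, nlev) if layout == "points-major" else (nlev, grid.npoints)
        self.data = np.zeros(shape)

atm_grid = Grid("O320-like", npoints=348528)        # illustrative point count
rad_grid = Grid("coarse-radiation", npoints=87132)  # cheaper grid for radiation

temperature = Field("temperature", atm_grid, nlev=137)
fluxes = Field("radiative_flux", rad_grid, nlev=137, layout="levels-major")

for f in (temperature, fluxes):
    print(f.name, f.grid.name, f.data.shape, f.layout)
```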

The EuroHPC declaration

Sanzio Bassini, CINECA, IT

HPC is an essential means of addressing scientific challenges and supporting technological innovation, such as new therapies based on personalised and precision medicine, deciphering the functioning of the human brain, forecasting climate evolution, observing space, preventing and managing large-scale natural disasters, and accelerating the design of new materials. In this unprecedented development context, the European Commission's very ambitious programme towards exascale computing will be presented.

PSyclone: a domain-specific compiler for finite element/difference Earth-system modelling codes

R. Ford & A. Porter, STFC, UK

Earth-system models tend to be large, complex codes developed by large teams of scientists over periods of years. However, the scale of the problems to be simulated calls for the highest levels of computational performance. Achieving good performance when both computer architectures and the underlying code base are constantly evolving is a complex challenge. In recent years, the use of Domain-Specific Languages (DSLs) as a potential solution to this problem has begun to be investigated.

The UK Met Office's LFRic project is developing a new, Finite Element dynamical core and has adopted a DSL approach. In this talk we will describe this work and the functionality of the domain-specific compiler, PSyclone, which has been developed to process the (serial) code written by the natural scientists and generate the code required to run on massively parallel machines.
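
The separation that PSyclone implements can be sketched as follows; the metadata format and the generated code here are invented simplifications, not PSyclone's actual kernel language or output:

```python
# Toy illustration of the idea behind PSyclone: the scientist supplies a
# serial kernel plus metadata, and the tool generates the parallel layer
# that the scientist never writes by hand. Everything below is invented
# for illustration and is not PSyclone's real metadata or output.
KERNEL_METADATA = {
    "name": "update_temperature_kernel",
    "iterates_over": "cells",
    "args": [("temperature", "readwrite"), ("increment", "read")],
}

def generate_parallel_layer(meta, target="openmp"):
    args = ", ".join(name for name, _ in meta["args"])
    lines = []
    if target == "openmp":
        lines.append("!$omp parallel do")
    lines.append(f"do cell = 1, ncells   ! iterates over {meta['iterates_over']}")
    lines.append(f"  call {meta['name']}(cell, {args})")
    lines.append("end do")
    if target == "openmp":
        lines.append("!$omp end parallel do")
    return "\n".join(lines)

print(generate_parallel_layer(KERNEL_METADATA))
```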

Outputting whole CMIP6 datasets through the new XIOS parallel workflow functionalities

Yann Meurdesoif, IPSL, FR

XIOS is an HPC library dedicated to data flow management for ESM models. It provides a flexible and efficient way to read or write data on parallel file systems using asynchronous parallel I/O server technology. XIOS also provides powerful online data processing before output such as time integration, combination with arithmetic operators, vertical and horizontal interpolation, remapping (including 2nd order conservative), reduction operations, etc.
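
One of these online operations, time integration, can be sketched conceptually as a running accumulator applied as the data flows (illustrative Python only; XIOS itself is configured through XML files, not through an API like this):

```python
# Conceptual sketch of an XIOS-style online time average: the mean is
# computed as the data flows, so no raw per-step output ever needs to be
# stored and post-processed offline. Names are invented for illustration.
import numpy as np

class OnlineTimeAverage:
    def __init__(self, output_every):
        self.output_every, self.acc, self.n = output_every, None, 0

    def push(self, field):
        # Called once per model time step with the instantaneous field.
        self.acc = field.copy() if self.acc is None else self.acc + field
        self.n += 1
        if self.n == self.output_every:
            mean, self.acc, self.n = self.acc / self.n, None, 0
            return mean          # ready to be written (e.g. to NetCDF)
        return None              # keep accumulating

avg = OnlineTimeAverage(output_every=4)
for step in range(8):
    out = avg.push(np.full(10, float(step)))
    if out is not None:
        print(f"step {step}: writing 4-step mean {out[0]:.2f}")
```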

To output CMIP6 data, CNRM and IPSL experimented with the same XIOS-based workflow in order to produce data meeting the publication requirements. The costly, traditional post-processing phase is now performed "in situ" at run time, in parallel across the whole set of computing nodes, before the data is written to the parallel filesystem through the I/O server technology. In the talk, we will describe the different steps of this work as a practical case study.

ESiWACE - Supporting very high resolution climate simulations in Europe

Joachim Biercamp, DKRZ, DE

ESiWACE (Excellence in Simulation of Weather and Climate in Europe) is one of nine EU-funded Centres of Excellence for computing applications. ESiWACE aims to improve the efficiency and productivity of numerical weather and climate simulations on HPC platforms by supporting the end-to-end workflow of Earth system modelling. The talk will give an overview of the goals, achievements and future perspectives of the project.

ESMF Strategies to Address HPC Challenges

Raffaele Montuoro, NOAA/CIRES, US

In this talk we will discuss how the Earth System Modeling Framework (ESMF) project is addressing challenges in Earth system modeling on HPC platforms. These challenges include grids of arbitrarily high resolution, increases in model complexity through the addition of new ensemble methods, components, and processes, and changes in computing architecture. ESMF responses to these challenges span the near- to long-term. We review ESMF piecewise regridding, the Cupid Integrated Development Environment, the development of a new "mapper" class in ESMF to allow for adaptive mapping to resources, and other strategies.
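
The idea behind conservative regridding, one of the regridding methods ESMF provides, can be sketched in one dimension as follows (illustrative only; the real ESMF/ESMPy implementation handles 2D/3D unstructured meshes, higher-order options and parallel decomposition):

```python
# First-order conservative regridding in 1D: each destination cell receives
# an overlap-weighted average of the source cells it intersects, so the
# integrated quantity is preserved. A minimal sketch, not ESMF code.
import numpy as np

def conservative_remap_1d(src_edges, src_vals, dst_edges):
    dst_vals = np.zeros(len(dst_edges) - 1)
    for j in range(len(dst_edges) - 1):
        d0, d1 = dst_edges[j], dst_edges[j + 1]
        for i in range(len(src_edges) - 1):
            s0, s1 = src_edges[i], src_edges[i + 1]
            overlap = max(0.0, min(d1, s1) - max(d0, s0))
            dst_vals[j] += src_vals[i] * overlap / (d1 - d0)  # area-weighted
    return dst_vals

src_edges = np.linspace(0.0, 1.0, 11)   # 10 fine source cells
dst_edges = np.linspace(0.0, 1.0, 5)    # 4 coarse destination cells
src_vals = np.arange(10, dtype=float)
dst_vals = conservative_remap_1d(src_edges, src_vals, dst_edges)

# Total "mass" is preserved across the remapping:
assert np.isclose((src_vals * np.diff(src_edges)).sum(),
                  (dst_vals * np.diff(dst_edges)).sum())
```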

DSL toolchains and performance optimizations for weather and climate codes

Carlos Osuna, MeteoSwiss, CH

We present a novel compiler toolchain for domain-specific languages (DSLs), based on the GridTools ecosystem, that allows for easier design of high-level DSLs for weather and climate models.

The growing diversity of computing architectures on which scientific models need to run is leading to a decrease in productivity, due to the necessity of incorporating multiple programming models. DSLs have been proposed to separate the algorithmic implementation from the architecture-specific implementation. However, DSLs are typically developed for specific domains or individual applications, so there is little reuse between existing complex tools, leading to high maintenance costs. A compiler toolchain for DSLs for weather and climate models provides common, efficient code optimizers and code generation for multiple architectures that can be reused across multiple tools.

We present results for the COSMO dynamical core and evaluate the approach by comparing it with other existing approaches.
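
The toolchain idea can be sketched as a common stencil intermediate representation shared by several frontends and backends; the IR and the generated code below are invented for illustration and bear no relation to GridTools' actual internals:

```python
# Toy sketch of the toolchain idea: DSL frontends lower to one common
# stencil IR, and shared backends generate code for different
# architectures from that single IR. Everything here is invented.
STENCIL_IR = {                       # what a DSL frontend might emit
    "name": "laplacian",
    "accesses": [((0, 0), -4.0), ((1, 0), 1.0), ((-1, 0), 1.0),
                 ((0, 1), 1.0), ((0, -1), 1.0)],
}

def generate(ir, backend):
    # The same IR feeds every backend, so optimizers written against the
    # IR are reused across all frontend DSLs and all targets.
    terms = " + ".join(f"{w}*in[i+({di})][j+({dj})]"
                       for (di, dj), w in ir["accesses"])
    body = f"out[i][j] = {terms};"
    if backend == "cpu":             # OpenMP-style loop nest (pseudo-code)
        return f"// {ir['name']}\n#pragma omp parallel for\nfor (i, j) in interior: {body}"
    if backend == "gpu":             # one thread per grid point (pseudo-code)
        return f"// {ir['name']}\n(i, j) = thread_index();\n{body}"
    raise ValueError(backend)

print(generate(STENCIL_IR, "cpu"))
print(generate(STENCIL_IR, "gpu"))
```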

EXDCI – Supporting the European HPC ecosystem towards the Exascale endeavour

Serge Bogaerts, PRACE Aisbl

EXDCI is an EU-funded CSA project, in which PRACE and ETP4HPC have worked towards the development of a common European HPC Strategy by getting the ecosystem stakeholders to share their perspectives on positioning Europe on the Exascale roadmap. Achievements of the project include the preparation of the Strategic Research Agenda (SRA), the Scientific Case for HPC, maintaining international cooperation via BDEC workshops, the coordination of training and job offerings via a web portal, the identification of relevant KPIs for the HPC ecosystem, and the organisation of the European HPC Summit Week. An EXDCI-2 proposal is underway to continue these coordination activities in the framework of this very dynamic European HPC ecosystem.

Latest developments of the OASIS3-MCT coupler for improved performance

S. Valcke, L. Coquart, A. Craig, G. Jonville, E. Maisonnave, A. Piacentini

We will present the developments made in the OASIS3-MCT coupler over the last 24 months to improve its parallel efficiency. The most important improvements concern the communication scheme, which now uses the mapping weights to define the intermediate mapping decomposition, and the hybrid OpenMP/MPI parallelisation introduced in the SCRIP library for the mapping weight calculation. Efforts were also spent on improving the initialisation, with the update of the MCT library from version 2.8 to 2.10.beta1, and on new options introduced in the global CONSERV operation. Finally, additional results obtained with the IS-ENES2 coupling technology benchmark, either testing the new options or running on Marconi KNL, will be shown.
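
Applying precomputed mapping weights amounts to a sparse matrix-vector product, which the following sketch illustrates (a minimal stand-in assuming SCRIP-style weights; the actual OASIS3-MCT communication and decomposition logic is far more involved):

```python
# Sketch of what applying mapping weights amounts to: dst = W @ src.
# In OASIS3-MCT the rows of W are distributed over processes, and (as
# described above) the nonzero pattern of W now also guides the
# intermediate decomposition used for communication.
import numpy as np
from scipy.sparse import csr_matrix

n_src, n_dst = 6, 3
rows = np.array([0, 0, 1, 1, 2, 2])                  # destination points
cols = np.array([0, 1, 2, 3, 4, 5])                  # contributing source points
weights = np.array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5])   # e.g. SCRIP-computed
W = csr_matrix((weights, (rows, cols)), shape=(n_dst, n_src))

src_field = np.array([1.0, 3.0, 2.0, 4.0, 5.0, 7.0])
dst_field = W @ src_field                            # remapped coupling field
print(dst_field)                                     # [2. 3. 6.]
```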

POP CoE: understanding applications and how to prepare for exascale

Jesus Labarta, BSC, ES

Aiming at exascale is a tough objective, for which developing the architectures is only one part of the story. Being able to program them and achieve efficiency on them is a real challenge that we believe requires changes in the mentality with which we as a community look at HPC. Overall, I will present a vision of how the evolution to multicores, heterogeneous systems and pre-exascale architectures is shaking our world, and how I feel we should be facing it.

An update on the ETP4HPC and Extreme Scale Demonstrators

Thomas Eickermann, Institute for Advanced Simulation, Juelich Supercomputing Centre, DE

The European Technology Platform for High-Performance Computing (ETP4HPC) is an industry-led group of public and private HPC stakeholders. Its main mission is to give advice to the European Commission concerning research priorities in the area of HPC in the form of a regularly updated Strategic Research Agenda (SRA). The presentation will provide an update on the ETP's activities and in particular on the Extreme Scale Demonstrators (EsD), which have been introduced in the latest SRA. Their concept, to integrate results from recent European R&I actions into usable HPC systems, has recently been adopted by the EC.

Challenges of Exascale Computing, Redux

Paul Messina, Argonne Distinguished Fellow, Director, Computational Science Division Argonne National Laboratory, US

In 2016, the U.S. Department of Energy established the Exascale Computing Project (ECP) – a joint project of the DOE Office of Science and the DOE National Nuclear Security Administration – that will result in a broadly usable exascale ecosystem and prepare mission-critical applications to take advantage of that ecosystem.

This project aims to create an exascale ecosystem that will:

  • enable classical simulation and modeling applications to tackle problems that are currently out of reach,
  • enable new types of applications to utilize exascale systems, including ones that use machine learning, deep learning, and large-scale data analytics,
  • support widely-used programming models as well as new ones that promise to be more effective on exascale architectures or for applications with new computational patterns, and
  • be suitable for applications that currently have lower performance requirements, thus providing an on-ramp to exascale should their future problems require it.

Balancing evolution with innovation is challenging, especially since the ecosystem must be ready to support critical mission needs of DOE, other Federal agencies, and industry, when the first DOE exascale systems are delivered in 2021. The project utilizes a co-design approach that uses over two dozen applications to guide the development of supporting software and R&D on hardware technologies as well as feedback from the latter to influence application development.

This presentation will focus on my assessment of the challenges for achieving an exascale ecosystem, based on the first two years of this project.

Code optimizations and the accumulated impact on scientific throughput of an HPC center

John Dennis, NCAR, US

We describe the results of a concerted multi-year effort to optimize the Community Earth System Model (CESM) for current- and future-generation HPC platforms. Our approach involves strategic and focused interventions to address performance issues in a rapidly evolving scientific code base. While incremental in nature, it has yielded significant decreases in the cost of delivering climate science.

DYNAMICO, the IPSL icosahedral dynamical core: status and outlook

Thomas Dubos, LMD/IPSL, FR

DYNAMICO is, in the first instance, a hydrostatic dynamical core designed for numerical consistency and scalability, and is being integrated into IPSL-CM alongside the current production dynamical core, LMDZ5. In the last couple of years it has been extended to fully compressible, non-hydrostatic dynamics. Support for fully unstructured meshes and for limited-area domains is planned or under development. I will outline the computational design of DYNAMICO and present some performance metrics for the dynamical core, both standalone and embedded in IPSL-CM.