MarDATA - Machine learning in climate modelling


Researcher: Sonal Rami
Project funding: Helmholtz Gesellschaft
Partner: Thomas Jung, Alfred Wegener Institut, Bremerhaven
Duration: 01.12.2019 - 30.11.2022

Computer simulations cannot resolve the full dynamics of the real world: spatial and temporal discretization will always be necessary, and processes on scales smaller than the discretization scale must therefore be parameterized. These parameterizations are usually derived from a limited number of observations that are not necessarily representative of the entire globe, and they are often strongly tuned so that the final solution is close to the observed mean state of the climate system. Machine learning (ML) methods enable the development of parameterizations that can outperform classical methods and provide much more robust solutions (e.g., Reichstein et al., 2019, and references therein). One of the main challenges, affecting both the marine (physics) and computer science domains, is to constrain these parameterizations by physical laws so that key conservation principles are not violated. In addition, sea ice-ocean models such as FESOM have bottlenecks in high-resolution configurations that seriously hinder scalability and thus the effective use of next-generation exascale supercomputers. It has therefore been proposed to replace computationally intensive parts with machine learning approaches to make models exascale-ready.
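One simple way to keep a learned parameterization from violating a conservation principle is to project its output onto the set of fields that integrate to zero over the domain. The following is a minimal sketch of that idea, not the method used in this project: the function names, the four-cell grid, and the numbers are all hypothetical, and the "raw" values merely stand in for the output of an ML parameterization.

```python
def domain_integral(tendencies, cell_areas):
    """Area-weighted sum of a cell-wise tendency field."""
    return sum(t * a for t, a in zip(tendencies, cell_areas))


def enforce_conservation(tendencies, cell_areas):
    """Subtract the area-weighted mean so the field integrates to zero.

    This is a hard constraint applied after prediction: whatever the
    ML model outputs, the corrected tendencies add or remove no net
    amount of the conserved quantity.
    """
    correction = domain_integral(tendencies, cell_areas) / sum(cell_areas)
    return [t - correction for t in tendencies]


# Hypothetical raw ML output on four grid cells of unequal area.
raw = [0.30, -0.10, 0.05, 0.20]
areas = [1.0, 2.0, 1.5, 0.5]

fixed = enforce_conservation(raw, areas)
```

The raw field has a non-zero domain integral, while the corrected field integrates to zero up to floating-point rounding. An alternative design is a soft constraint, where the squared domain integral is added as a penalty term to the training loss instead of being removed after prediction.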

This project will test two main hypotheses: (1) machine learning will lead to much better parameterizations in climate models, and (2) machine learning methods can help overcome computational bottlenecks in high-resolution model runs on extreme-scale high-performance computers. First, we plan to develop new parameterizations to represent ocean eddies in the Finite Volume Sea Ice-Ocean Model (FESOM2) developed at the Alfred Wegener Institute (AWI). To this end, output from models that explicitly resolve eddies at ultra-high resolution will serve as training data for the ML model, complemented by observational knowledge. Second, we plan to go a step further and use ML methods to replace parts of Earth system models that are computational bottlenecks in high-resolution configurations. A good candidate is the sea ice model. As a first step, the ML model will be trained on output from the same sea ice model, or from a higher-resolution version of it, which is taken as the "true" solution. We will also investigate how satellite observations can be integrated into the ML model. In general, integrating additional domain-specific information, whether available as data (e.g., satellite data) or as physical models (e.g., describing the basic physical laws of ice formation), requires specific network structures and adapted training procedures. Here, we particularly aim at obtaining very high-resolution models in cases where the underlying mathematical model leads to an ill-posed inverse problem. This requires adapting regularized deep learning approaches based on generative adversarial network architectures (Arridge et al., 2019).
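To illustrate why recovering a high-resolution field from coarse data is ill-posed and needs regularization, the sketch below uses classical Tikhonov regularization with a smoothness penalty on a one-dimensional toy problem. This is a deliberately simple stand-in, not the regularized deep learning or generative adversarial approach named above; the operator, the grid sizes, and the value of `lam` are assumptions chosen for illustration.

```python
import numpy as np


def coarsen_operator(n_fine, factor):
    """Matrix that averages blocks of `factor` fine cells into one coarse cell."""
    n_coarse = n_fine // factor
    A = np.zeros((n_coarse, n_fine))
    for i in range(n_coarse):
        A[i, i * factor:(i + 1) * factor] = 1.0 / factor
    return A


def first_difference(n):
    """(n-1) x n forward-difference matrix, used as a smoothness penalty."""
    return np.diff(np.eye(n), axis=0)


def tikhonov_reconstruct(A, y, lam):
    """Minimize ||A x - y||^2 + lam * ||D x||^2 via the normal equations."""
    D = first_difference(A.shape[1])
    lhs = A.T @ A + lam * (D.T @ D)
    return np.linalg.solve(lhs, A.T @ y)


# Toy setup: 16 fine cells observed only as 4 coarse-cell averages.
n_fine, factor = 16, 4
x_true = np.sin(np.linspace(0.0, np.pi, n_fine))
A = coarsen_operator(n_fine, factor)
y = A @ x_true

x_rec = tikhonov_reconstruct(A, y, lam=1e-3)
```

Here `A` has rank 4 but 16 unknowns, so infinitely many fine fields reproduce the same coarse averages. The penalty term selects a smooth one among them. A learned regularizer plays the same role in the deep learning setting, replacing the hand-chosen smoothness assumption with prior knowledge extracted from training data.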

The objectives of this project are to (1) improve the realism of climate models and thus their ability to project future climate change, (2) develop ML-based parameterizations of subgrid-scale processes for FESOM2, and (3) overcome computational bottlenecks in FESOM2 by replacing some of the code with ML approximations.
