Twelve PhD projects have been funded by the DFG in the 1st cohort of the RTG π3. In addition, twelve associated doctoral students worked on their PhD projects. In the following, the DFG funded PhD projects within and on the interface between the three research areas Inverse problems (R1), Direct optimisation (R2) and Data analysis (R3) are shortly described.
PI: Armin Lechleiter, Alfred Schmidt Start: 16.11.2016 Defense: 07.10.2019
R1-1: Dynamic Inverse Problems for Wave Phenomena
Inverse problems with dynamic parameters are a new field of research. The literature even lacks suitable theory for the partial differential equation, which allows for time-dependent wave speeds and densities. Nevertheless, there are existing results on the reconstruction of a timedependent potential, like the unique identifiability from Neumann-to-Dirichlet-data. We look at second order hyperbolic partial differential equations and equip them with time and space dependent parameters (e.g. wave speed, mass density). The task is to reconstruct these parameters from measurements of the wave field. As for stationary parameters, we expect this to be a ill-posed problem. For its inversion one needs to establish a forward operator. If this operator is Frèchet-differentiable, then it is appropriate to use Newton methods for the reconstruction.
R1-2: On the Connection Between Sparsity and Tolerances in Parameter and Data Space
We consider a modified Tikhonov functional for the solution of nonlinear ill-posed inverse problems. Motivated by applications in the field of production engineering, we allow small deviations in the solution, which are modeled through a tolerance measure acting inside the regularization term of the functional. We study the well-posedness of minimizers for such a functional and, furthermore, examine connections to sparsity regularization. In order to enhance a sparse structure in the reconstructed solution, we adopt the idea of elastic-net regularization and consider an extra penalty term, which accounts for promoting sparsity. Apart from the theoretical analysis, the task is to find an optimal combination for the regularization parameters and its relation to the magnitude of the chosen tolerances.
PI: Armin Lechleiter, Peter Maaß Start: 16.12.2016 Defense: 12.11.2019
R1-4: Direct and Inverse Electromagnetic Scattering Problems for Locally Perturbed Periodic Media
Inverse scattering from surface structures is traditionally based on data gained from illuminating periodic structures by quasi-periodic incident waves. Such a periodic setting breaks down if periodicity of either the surface or the incident field is perturbed, as it is frequently the case in applications. In view of our long-term goal to design algorithms for parameter identification of aperiodic nano-grass structures, we aim to design in a first step minimisation-based algorithms for the identification of local perturbations on a periodic dielectric surface structure.
PI: Peter Maaß Start: 16.10.2017 Defense: 02.11.2020
R1-5: Inverse Problems and Deep Learning
The challenge and hypothesis can be succinct into the following two questions. Can one use concepts from inverse problems to understand deep learning? And can one use deep learning to solve inverse problems? There is a lot of research going with the aim to explain and understand deep learning. But there is not much research that looks through the lens of inverse problems at deep structures. On the application side the field of solving inverse problems with deep learning methods is a very hot field of research. Many people from the inverse problem community are starting to use deep learning and also many people from the deep learning community are starting to look into inverse problems. Many of the best reconstructions are nowadays produced by deep learning techniques. Unfortunately there is not a lot of research done that investigates theoretical guarantees for these reconstructions and also most of the techniques require massive amounts of data of already created reconstructions, which poses a chicken-egg-like problem.
R2-1: Towards Finding Preferable Minima in Nonlinear Programming by Decomposition Techniques
Applications to Parameter Identification in Dynamical Systems and Bilevel Programming
Today, highly sophisticated real-world representations find their way into many fields of application. This may include the usage of dynamical systems (ordinary differential equations (ODEs)) or arbitrary nonlinear model functions, for instance. The systems of interest have in common that a numerical solution routine needs to be applied for their evaluation, for example numerical integration or root finding algorithms. Their accuracy often comes at the price of high computational efforts. In a subsequent step, a practitioner aims to optimise them and therefore, these systems are incorporated into nonlinear optimisation algorithms. Herein, they need to be run at a high frequency and certain smoothness properties are assumed. Resulting problems (among others) are instability during the numerical solution process or running into the next local minimum instead of a preferable one. This research project aims at finding advanced problem formulations that are able to overcome the mentioned problems. One of the consequences of complex model formulations is the occurrence of multiple local solutions. This can have several reasons, for instance the non-convexity of the resulting nonlinear program. Approximations, reductions, or simplifications often make it impossible to compute a global solution. A practitioner is therefore more interested in solutions that fulfil other criteria, too. This opens the concept of preferable solutions. Roughly speaking, a solution is preferable if it dominates other local solutions with respect to different criteria. The final hypothesis is that adjusted, decomposed problem formulations lead, on average, to the computation of preferable solutions when they are used within optimisation algorithms.
PI: Peter Maaß Start: 16.03.2017 Defense: 07.03.2022
R2-2: Nonnegative Matrix and Tensor Factorisations
Theory and Selected Applications
Matrix factorisation methods for large scale data sets have seen increasing scientific interest and have been revived recently due to their central role for a large variety of machine learning tasks, e.g. dimension reduction, data compression, basis learning, feature extraction as well as higher level problems such as classification or clustering. It addresses the challenge of computing a low-rank approximation of a typically large data matrix by factorising it into two smaller matrices. Motivated by medical imaging applications, where the data under investigation often naturally satisfies a non-negativity constraint, we focus on the particular case of nonnegative matrix and tensor factorisation (NMF/NTF) methods. The main aim is to review the theory of NMF and NTF solution methods, the development of new algorithms adapted to the corresponding application, the comparison to state of the art methods and to find new connections to approaches different from the NMF.
R2-3: Non-linear Dynamics and Parameter Identification
Bifurcation theory relates to the problem of parameter identification of a non-linear dynamical system, where the knowledge of critical parameter regimes may be used to compensate the lack of data and direct data collection. Combining the derivation of thermodynamic parts together with physical parameters and components, one gets the ingredients for many models of physical processes. The unknown details of the physical parameters are, for instance, modelled by fitting parameterised functions of data or assuming quasi-equilibrium. This fact can be interpreted as a freedom of choice and thus, we aim to study the consequences of different choices on the dynamics, classifying the bifurcation structure. A classical system of such a type, which possesses unspecified components with structural constraints other than symmetries, are of mass-damper-spring type and arise in marine craft modelling. To study the effects of incompletely modelled components requires specification in order to be feasible. The starting point for the project concerns to analyse the general system of non-linear equations of motion of vessels, where ODE models for ship manoeuvring are investigated, dealing with local bifurcations. The stability and bifurcations of equilibrium states will be analysed. The general Hopf bifurcation is studied and, in particular, the conditions for sub- vs. supercriticality. Furthermore, the initial model will be extended to models including more degrees of freedom of a vessel (yaw and roll motion and/or rudder). The analysis will shift towards global homo- and heteroclinic bifurcations, considering certain manoeuvres as boundary value problems analytically as well as numerically. In particular, local invariant manifolds of equilibria are going to be considered. An application for the latter is ship manoeuvring, where it could also be extended to automotive test problems and bifurcations at non-convex Pareto fronts in Optimal Control, dealing with multiobjective optimisation.
R2-4: Online Optimal Parameter Identification in Dynamic Controlled Environments
This PhD project aims at creating efficient algorithms for optimal parameter identification problems that result from complex model design problems in controlled dynamical systems. Commonly, parameter identification is performed strictly a priori to system operation, but often parameter updates are required during system operation, e.g. for time-varying parameters that change unpredictably due to external influences. Control methods such as model predictive control (MPC) heavily dependent on a good model of the dynamical system, because for the computation of future control inputs the system state has to be predicted. In this project parameter identification is included in the MPC loop, such that time-varying parameters are updated online. The unknown parameters are either system parameters that can only be identified in non-static system states or a discrete, partial high-dimensional representation of a control curve or even a combination of both. This combination of fast direct control methods with online parameter identification leads to efficient algorithms for real-time optimal parameter identification in dynamic controlled scenarios. We also target theoretical results for the convergence of the developed algorithm in the linear case. Moreover, the parameter identification goal (minimising the distance between model and measurements) has to be combined with task performance criteria of the control system.
PI: Peter Maaß Start: 01.10.2016 Defense: 14.02.2020
R3-2: Double Backpropagation with Applications to Robustness and Saliency Map Interpretability
Double backpropagation arises in derivative-based optimisation, whenever a neural network’s loss function contains derivatives of output nodes with respect to input nodes. This work introduces a very general, coordinate-free Hilbert space description of the involved phenomena, instead of descriptions for specific examples such as dense layers. For this, a theory of adjoints of continuous bilinear operators in Hilbert spaces was developed. This comes into play for standard layer types (dense, convolutional, locally-connected etc.), which contain linear relationships both in the parameters as well as in the inputs. Apart from the in-depth description of double backpropagation, an optimised double backpropagation scheme for Jacobian penalties in the case of ReLU networks is presented, which reduces the number of necessary computations by up to a third. Moreover, the discontinuous loss landscape of ReLU networks is characterised for several types of losses. A ’pseudo-smoothing’ effect on this loss landscape when using batch optimisation is described, which explains why the discontinuities usually do not pose a problem in applications.
PI: Werner Brannath Start: 01/10/2016 Defense: 16.12.2019
R3-4: Model Selection and Evaluation in Supervised Machine Learning
The major target of research area R3 "Mathematical data analysis" is the development of numerically efficient algorithms for low dimensional feature building and corresponding classification methods that utilise high dimensional, sparse input data from applications like MALDI imaging. During a typical (supervised machine-learning) project, several classification algorithms will be developed that finally need to be fine-tuned and evaluated by real application studies. Furthermore, the different methods need to be compared with regard to their operating characteristics. This leads to the question of how to best design experimental studies for an adjustment, evaluation and comparison of classification algorithms such that costs like study time and number of subjects are minimised. This objective should be pursued under the requirement that bias, optimism and type I error inflations are largely avoided.
R3-5: Statistical Procedures and Numerical Algorithms for Analyzing MALDI Data
Matrix Assisted Laser Desorption/Ionisation Time of Flight (MALDI-TOF) is a widely used imaging mass spectrometry (IMS) technique. It is a "tool for spatially-resolved chemical analysis of diverse sample types ranging from biological and plant tissues to bio and polymer thin films" (Alexandrov 2012). The measured spectra are however not easily interpretable: For example "very few of the peaks in any whole bacteria mass spectrum can be attributed to only one protein" and "initial conditions for the ions [...] contribute to a loss of mass (or time) resolution" (Liyanage et. al. 2006). Thus it is insightful to interpret the mass spectra as corrupted measurements of a mixture of (biologically or chemically) meaningful signals. The goal of the PhD project is to develop statistical and numerical methods to extract (meaningful) features from such "mixtures" and improve the performance of classifiers. A focus will be on Non-Negative Matrix Factorisation (a type of Blind Source Separation (BSS)) based on the Wasserstein-Distance (since these procedures have desirable properties when applied to mixtures). The developed methods will be applied to hyper-spectral images obtained using MALDI-TOF imaging.
PI: Emily King Start: 01.07.2018 Defense: 03.02.2022
R3-6: Blind Source Separation in Polyphonic Music Recordings
Given a recording of a polyphonic piece of music, the challenge is to separate the contributions of the instruments, such that an individual audio track for each instrument is obtained. In blind separation, no prior information is given about the sounds of the instruments is given. Instead, we rely on dictionary learning. In the literature, good results have been obtained with dictionaries that represent each tone of an instrument separately, when polyphony is very limited. There have also been results based on tensor factorisation, which allow dictionary entries for a specific instrument to be applied at varying pitch. However, these require the assumption of a known equal-temperament tuning of the instruments. The use of the Wasserstein distance as a loss function is also receiving some attention. Many of the pitch-invariant methods require the use of a time-frequency representation with a logarithmic frequency axis. For this, methods based on the STFT and on wavelet transforms have been developed.
Ten PhD projects are currently funded by the DFG in the 2nd cohort of the RTG π3. In addition, six associated doctoral students work on their PhD projects. In the following, the DFG funded PhD projects within and on the interface between the three research areas Inverse problems (R1), Direct optimisation (R2) and Data analysis (R3) are shortly described.
R1-3: Efficient and reliable discretisation of primal and adjoint operators
Accurate and efficient discretisation in terms of computing time and memory usage as well as the solution of primal and adjoint problems are still scientific challenges, especially for dynamic inverse problems and parameter identification of time-dependent PDE coefficients. Efficient approaches should include adaptive methods for non-linear problems with problem-dependent error criteria and concepts for efficient storage of adjoint solutions.
To prepare for tackling problems and applications of the research network, first the stationary situation will be considered and results from linear-quadratic control problems (distributed or boundary control) should be transferred to inverse parameter identification problems, like identification of space-dependent diffusion coefficients.
R1-6: Deep Image Prior Approaches to Inverse Problems
The research topic is located around the Deep Image Prior (DIP), which is a relatively new image reconstruction method using a neural network, which does not require training prior to reconstruction. One main aim of the project is to determine analytical properties of the DIP and to embed it into the classical regularisation theory. A relevant application for DIP approaches is Computed Tomography (CT).
Classical Tikhonov regularization of inverse problems is based on minimizing functionals Φ (x) = dist(F(x), yδ) + α R(x) which balance a discrepancy term dist(F(x), yδ), measuring the mismatch on the data side, and a regularization or penalty term R(x), which encodes some information of the searched for parameter. The discrepancy term typically incorporates the norm of the image space of F : X → Y as well as information on the noise model for yδ, the penalty term aims at matching the a priori distribution of the parameter, which typically is restricted to an unknown subset or submanifold of X. Recently, motivated by applications in engineering tolerance based discrepancy and penalty terms have been investigated (Type I). These tolerance measures take zero values in a certain region around the origin, which allows to incorporate expert knowledge on y and/or on x. In this project we aim at investigating Tikhonov functionals based on three terms, which separately measures data misfit, tolerance based fidelity in the data term.
R2-5: Convergence to Preferable Minima in Real-Time Nonlinear Optimization Problems
The optimisation of dynamical control systems leads to challenging parameter identification (PI) problems for the control signals. This is due to a number of reasons: (1) typically, PI problems are highly non-linear in the constraints, such that there exists many local optima, (2) control signals are infinite-dimensional parameters which have to be approximated adequately by numerical PI methods, (3) oftentimes, e.g. in autonomous driving applications, real-time constraints apply. Thus, finding global optimal solutions is virtually impossible; instead, one is interested in finding a preferable local optimum.
R2-6: Deep Learning based Lane Segmentation and Model Predictive Lane Following
To achieve autonomous driving in complex urban contexts environment perception is key. By means of computer vision the driveable area must be identified, other traffic participants detected, classified and their actions anticipated and based on this, control values for the car calculated. Due to the recent advances in both powerful hardware and algorithmic design, especially in the field of machine learning, the aim is to develope a neural network that is both state-of-the-art in segmentation accuracy and real time capable. On top of that, since segmentation masks require costly post processing to handle identification of object instances such as persons or vehicles, the network will be trained not only to perform segmentation, but also make predictions similar to a generic object detector.
R2-7: Parameter and Structure Identification for Complex Dynamical Systems
This project addresses joint identification problems for both the structure and the parameters of a model for dynamic processes or systems. Complex dynamical systems often consist of several subsystems which belong to different physical fields (multiphysical systems). The subsystems come along with different established modelling approaches, e.g. multi-body systems in mechanics, circuit models in electronics, and CFD models in fluid dynamics. However, for every submodel, the accuracy, i.e. order, dimensions, or time-scale, has to be chosen. Thus, there does not exist one single model-structure, for which parameter identification (PI) has to be performed. Instead, PI has to be combined with structure identification (SI). Moreover, real-world technical systems also possess interfaces or subsystems that can only be modelled by generic black-box models, i.e. look-up-tables in the simplest case. While PI problems can be addressed by efficient nonlinear optimisation solvers, SI problems might lead to large combinatorial problems and finding the optimal structure is a machine learning task. Within this PhD project, suitable methods for large-scale nonlinear optimisation shall be combined with machine learning techniques, such that SI and PI problems cannot only be solved subsequently, but also in a combined problem setting.
R3-1: Application of Δ-complexes in topological data analysis
The basic idea of topological data analysis is to first construct a topological complex based on some given data which can then be analysed using common tools of algebraic topology, one of the most prominent examples being the persistent homology. For this project, the first of these steps of topological data analysis is of interest, more specifically the type of complex that is usually constructed. Most of the time these complexes are simplicial complexes, for example an Alpha-complex or a Witness-complex, which are easy enough to work with. Now, the starting point of this project will be to consider a different kind of complex. A Δ-complex is constructed by gluing together a collection of simplices such that certain conditions are met. The difference to a simplicial complex is that these conditions allow for more possible complexes while also including any possible simplicial complex which makes Δ-complexes a generalization of said complexes. The simplicial homology that is needed to analyse the simplicial complexes in the topological data analysis is also defined on Δ-complexes which ensures that the second step of the analysis could be carried on as before. The question that arises out of this train of thought is what application these types of complexes could have and if they have any advantages over or disadvantages compared with the commonly used simplicial complexes.
R3-7: Post-selection Inference for Prediction Performance
In many cases, statistical hypotheses testing and interval estimation of prediction performance measures for statistical prediction models are required or at least desirable in order to overcome the overestimation of a model’s ability to predict the outcome of future observations. The aim of this research project is to develop methodology for an exact or at least asymptotically valid post-selection inference for prediction performance measures. Usually, the validation of the prediction performance of the predictor is deferred to a separate and independent validation study, which is both costly and time consuming. This admits of improvement in terms of simplicity and the validity of the study in which the predictor has been developed. Statistical methods and tools for a (asymptotically) valid post-selection inference of model prediction performance could accomplish this kind of improvement.
R3-8: Two-sample Testing for the Comparison of MALDI Spectra
Previous results indicate that there exist strong dependencies among peak data in (averaged) MALDI spectra. The challenge that we want to tackle in the project is to take into account or even exploit these dependencies in the design of multiple test procedures for the comparison of two (averaged) MALDI spectra corresponding to, e.g., two different states of a disease or two different experimental conditions. In principle, it is also possible to extend the scope to k-sample problems with k > 2 groups which may correspond, for example, to different disease stages or to different dosages of a drug.
R3-9: Discrete Morse Theory in the Topological Data Analysis
The field of topological data analysis tries to characterize data in a topological way. The challenge in this subject is that the data is often incomplete or even sparse, loud, and higher-dimensional. Even so, the data has often lower-dimensional properties which can be detected and described. The project is about these properties and tries to apply methods from topology to simplify the analyses.
R3-10: Learning Tikhonov Functionals for Inverse Problems
We consider classical inverse problems given by a linear or non-linear operator A : X → Y which maps x ∈ X to y ∈ Y. The related inverse problem asks to determine an approximation to x † from given noisy measurements yδ = Ax† + η. This can be achieved by minimizing Tikhonov functionals of type ℒ(x) = ||Ax – yδ||2 + α R(x) (1), where R: X → ℝ is called regularization term and α ∈ ℝ+ is a scalar weighting factor. Traditional approaches use handcrafted regularizers to introduce certain properties to the solution. A common choice is R(x):= ||x ||1 , which forces possible solutions of the minimization problem (1) to be sparse. For academic examples, one can achieve excellent results with these basic handcrafted regularizers. However, in many cases much more advanced methods are needed on real data. This is even clearer when we use the statistical view on inverse problems. If we are interested in the maximum a posteriori estimator, the R in (1) has to be the logarithm of the prior distribution, for example the distribution of natural computed tomography reconstructions. It is unclear how one would design such a prior by hand. A recent approach to overcome this problem is to learn a parameterized version Rϑ directly from data. We choose artificial neural networks as a general model class for this task due to their approximating power and great success in a large variety of applications. The aim of this project is to investigate the choice and influence of different network architectures on the regularizing properties of Rϑ. In addition, the number of data points needed to train such a model is of crucial interest. To keep it as low as possible, novel approaches to combine classical modelling and learning from data will be investigated. Computed tomography (CT) will be used as main application case in this project. CT is a prime example of an ill-posed inverse problems. It is physically well understood, but creating good reconstructions remains challenging in low-dose cases and for a limited number of measurement angles. The (learned) Tikhonov functional approach will be compared against a number of other reconstruction methods, including ones that also feature learning.
R3-11: Causal identification and inference in time-to-event settings
This project is concerned with developing theory and methods for studying causal relations in time-to-event data. Graphical models will provide a framework for causal structures, and in order to handle the asymmetric dependencies of dynamic data, we will focus on local independence graphs. Intervention is a central concept for causal identification and we will formalize the concept of intervention in local independence graphs, which we will use to extend graphical criteria for identification. Furthermore, we will investigate how to use local independence graphs to solve advanced causal identification issues, such as identification of separable effects in the context of mediation analyses and identification of suitable causal parameters in multistate models. Moreover, we will develop method for applications to medical and epidemiological questions, such as treatment of chronic diseases or competing risks situations in cancer research.
In the following, the DFG funded PhD projects of the 3rd cohort of the RTG within and on the interface between the four research areas Dynamic inverse problems (R1), Direct optimization (R2), Data analysis (R3) and Statistics (R4) are shortly described.
The motion of the studied object strongly influences the information contained in the measured data. For a static object, the data retrieved during one single circumnavigation by the measuring device (e.g., the radiation source) encodes all the object’s characteristic features. For objects moving during the measurement, this is typically not true. In order to understand the interplay between the time-dependency of the scanning process and the dynamics of the object a thorough mathematical analysis is required. Within a continuous setting, techniques from microlocal analysis can be applied. However, data sets in practical applications are of discrete nature. Therefore we propose to combine microlocal analysis with (discrete) dynamical systems theory to analyse the overall information content of dynamic data. Such a rigorous mathematical study can serve as a basis for designing efficient and motion-adapted data sampling schemes for applications.
R1-2: Orientation and Concentration Reconstruction for Immobilised Nanoparticle Markers in Magnetic Particle Imaging
We aim for the reconstruction of the orientation and concentration of immobilised nanoparticle markers in Magnetic Particle Imaging (MPI) based on noisy voltage measurements by solving the corresponding nonlinear joint parameter identification problem. This results in reconstructing space-time-dependent quantities implying a change during the measurement process which – even with strong a priori assumptions – prevents a reconstruction on the same temporal scale. However, this is also not desirable from the application perspective since the temporal resolution is several orders coarser than the data acquisition.
After a clear definition of the time-dependent setup taking into account the different time scales, we start with an investigation of the static joint reconstruction problem. A solution will be obtained by using a variational approach in terms of a Tikhonov-type functional. Aiming for a gradient-based reconstruction scheme we will formulate, analyse, and solve the problem for the Fréchet derivative in a suitable function space setup. Suitable penalty terms need to be identified for the particular application and additional side constraints need to be included. In the second step the problem is extended to the semi-time-discrete problem taking into account the evolution of the concentration and the easy axis orientation on the coarser time scale.
R2-1: Development and Optimization of Models for Congested Systems
Today’s world is more interconnected than it has ever been. Control of both digital, as well as physical traffic, and optimization of the underlying infrastructure is of central importance. Mathematical models for these infrastructures play a huge role in the optimization tasks. Solution concepts have to include strategic behavior of users and should cover additional mathematical properties. We aim for the development of realistic mathematical models for congested systems and especially the identification of meaningful parameters for the users’ objective functions that highly influence the arising stable solutions. We are interested in both the techniques for the efficient calculation of solutions and the optimization of the infrastructure itself.
R2-2: Efficient Numerical Methods for Optimization of Multi-Scale Models
In many situations, a macroscopic model is heavily dependent on microscopic situations. Well-known examples are porous media, but also models for industrial applications like grinding have strong two-scale effects, where equations and parameters in macroscopic models are based on the microscopic situation. Important optimisation tasks are typically parameter identification in the macroscopic model and subsequent (process) optimisation of the macroscopic model. Efficient numerical approaches like adaptive methods are needed for the utilisation of multi-scale models in process simulation and optimisation. Additional challenges are given by time dependent or/and statistically perturbed data like domain boundaries, material properties etc. in the microscopic model. In usual two-scale models for boundary conditions at rough boundaries, the domain width is much larger than the roughness. In the application to grinding, the width of the typical micro scale geometry is comparable to the size of abrasive grains giving the roughness, thus an extension of the standard approach is necessary.
The goal of the PhD project is the derivation and analysis of efficient numerical multi-scale optimisation methods. This includes parameter identifications in micro and macro models as well as optimisation of the macro process. Approaches should be based on suitable adaptive finite element methods on both micro and macro scales. This can be extended to time dependent variations of possibly perturbed microscopic data, using quasi-stationary micro problems with time dependent parameters from the macro problem or fully time dependent multi-scale methods.
R2-3: Parameter Identification on Time-dependent Domains Using Adaptive Finite Cell Methods
Partial differential equations with underlying time-dependent domains arise in the modelling of many processes. One example is given by industrial production processes like drilling, grinding, or milling. The shape of the workpiece is modified by the applied forces, by the induced heat, and by the removal of material. While the deformation due to applied forces and due to induced heat can be represented by thermoelastic material models, the removal of the material leads to a change of the underlying domain. The efficient simulation of such processes is a challenge for itself. One approach to handle these difficulties is the finite cell method (FCM). It combines the usual finite element method with a fictitious domain approach to represent the underlying domain involving a special numerical quadrature to treat the arising discontinuous integrands. However one has to control the different arising errors to obtain a reliable simulation result. In the production processes mentioned above, the precise simulation of the material removal is extremely time-consuming or even nowadays impossible. Hence simpler models are used. However they rely typically on several parameters which cannot be determined directly by measurements. Other models describe the interaction of the workpiece with the environment which is not explicitly considered and is represented only by general parameters. Thus, there is a strong need to identify these parameters numerically in combination with the underlying time-dependent domain.
The main objective of this PhD project consists in the development of error controlled adaptive methods to identify parameters in parabolic problems with underlying timedependent domains.
R3-1: Regularization Properties of Invertible Residual Networks
Applying i-resNets for solving inverse problems F(x) = y proceeds in two steps. First, the forward operator F is trained using training data. Second, after training and for given new data yδ a solution to the inverse problem is computed by inverting the network via successive fixed point iterations. This approach has the advantage that the training of the forward operator is stable (as opposed to training the inverse) and the inversion is also stable due to the Lipschitz restriction L on the activation function. Hence, L plays the role of a regularization parameter.
The challenge addressed in this PhD project is to embed this setting into the classical regularization theory for inverse problems and to analyze the regularization properties of invertible networks for solving inverse problems. We aim at determining an equivalence between certain network architectures and regularization filters for linear inverse problems. Furthermore, we aim at characterizing assumptions on non-linear operators, which allow an extension of these regularization results to non-linear inverse problems.
R4-1: Statistical Inference for Machine Learning Model Interpretation
Flexible supervised machine learning algorithms perform well in terms of prediction precision, however lack in interpretability due to their black-box nature and utility to cover complex dependencies and interactions that are difficult to overlook and appraise by humans. While predictions are important, drawing conclusions from the model (and data) by exploring relationships between the target variable and the features is of high value for many applications. Several methods have been suggest to tackle machine learning model (ML-model) interpretation. ML-model interpretation is to be distinguished between intrinsic and post-hoc, whereby intrinsic interpretability applies only to traditional approaches like regression trees. Post-hoc methods are further distinguished into model-specific or model-agnostic, i.e. either apply only to a specific machine learning methodology (like neural networks) or are applicable independently of the method under use. In this project, we will focus on model-agnostic (post-hoc) methods and aim to enrich the literature by statistical inference methodology (i.e. frequentist confidence bands and hypothesis tests and their Bayesian counterparts) going beyond the currently overwhelming focus on descriptive numbers and graphs. The main challenges with this enterprise are, on the one hand, the appropriate choices of suitable null distributions and prior probabilities and, on the other hand, the development of valid and numerically feasible resampling and Monte Carlo methods. Moreover, when aiming on interpretations that go beyond the learned model and focus on characteristics and hypotheses for the unknown true data generating process, we need to account for the machine learning process itself and the post-selection of the characteristics and hypotheses we are finally interested in. This is are quite challenging task, as can be seen in the literature on post-selection inference.
R4-2: Structural Statistical Inference for Signal Detection in High-Dimensional Data with Applications to Functional Magnetic Resonance Imaging
We plan to extend existing structural inference methods for functional magnetic resonance imaging data analysis in terms of coping with several layers of hierarchy in regions of interest, enabling powerful all-resolutions inference, and exploiting the dependency structure among spatial locations (voxels).
In the proposed project, we aim at three extensions of the methodology by Schildknecht et al. (2016). First, we want to extend their approach to cases, where the (pre-defined) regions of interest possess further sub- or super-structures, leading to a hierarchically structured system of hypotheses with several layers. By exploiting techniques related to the closure principle (Marcus et al. (1976)), we will derive multiple tests which can test the additional null hypotheses induced by the sub- / superstructures with a negligible loss of power. Second, we aim at "transductive inference" in the sense, that regions of interest defined by the data analysis of one sub-study can be taken as the basis for structural inference in a second sub-study. While the mathematical validity of this approach can be established straightforwardly, given that the sub-studies comprise different (independent) observational units, there are some subtle details like standardisation of brain architectures, which are to our knowledge not completely satisfactorily resolved up to now. Third (and most ambitious), we want to contribute to the theory of all-resolutions inference for fMRI data in the sense of Rosenblatt et al. (2018).