PhD projects
Twelve PhD projects have been funded by the DFG in the 1st cohort of the RTG π3. In addition, twelve associated doctoral students worked on their PhD projects. In the following, the DFG funded PhD projects within and on the interface between the three research areas Inverse problems (R1), Direct optimisation (R2) and Data analysis (R3) are shortly described.
R1-1: Dynamic Inverse Problems for Wave Phenomena
Inverse problems with dynamic parameters are a new field of research. The literature even lacks suitable theory for the partial differential equation, which allows for time-dependent wave speeds and densities. Nevertheless, there are existing results on the reconstruction of a timedependent potential, like the unique identifiability from Neumann-to-Dirichlet-data. We look at second order hyperbolic partial differential equations and equip them with time and space dependent parameters (e.g. wave speed, mass density). The task is to reconstruct these parameters from measurements of the wave field. As for stationary parameters, we expect this to be a ill-posed problem. For its inversion one needs to establish a forward operator. If this operator is Frèchet-differentiable, then it is appropriate to use Newton methods for the reconstruction.
R1-2: On the Connection Between Sparsity and Tolerances in Parameter and Data Space
We consider a modified Tikhonov functional for the solution of nonlinear ill-posed inverse problems. Motivated by applications in the field of production engineering, we allow small deviations in the solution, which are modeled through a tolerance measure acting inside the regularization term of the functional. We study the well-posedness of minimizers for such a functional and, furthermore, examine connections to sparsity regularization. In order to enhance a sparse structure in the reconstructed solution, we adopt the idea of elastic-net regularization and consider an extra penalty term, which accounts for promoting sparsity. Apart from the theoretical analysis, the task is to find an optimal combination for the regularization parameters and its relation to the magnitude of the chosen tolerances.
R1-4: Direct and Inverse Electromagnetic Scattering Problems for Locally Perturbed Periodic Media
Inverse scattering from surface structures is traditionally based on data gained from illuminating periodic structures by quasi-periodic incident waves. Such a periodic setting breaks down if periodicity of either the surface or the incident field is perturbed, as it is frequently the case in applications. In view of our long-term goal to design algorithms for parameter identification of aperiodic nano-grass structures, we aim to design in a first step minimisation-based algorithms for the identification of local perturbations on a periodic dielectric surface structure.
R1-5: Inverse Problems and Deep Learning
The challenge and hypothesis can be succinct into the following two questions. Can one use concepts from inverse problems to understand deep learning? And can one use deep learning to solve inverse problems? There is a lot of research going with the aim to explain and understand deep learning. But there is not much research that looks through the lens of inverse problems at deep structures. On the application side the field of solving inverse problems with deep learning methods is a very hot field of research. Many people from the inverse problem community are starting to use deep learning and also many people from the deep learning community are starting to look into inverse problems. Many of the best reconstructions are nowadays produced by deep learning techniques. Unfortunately there is not a lot of research done that investigates theoretical guarantees for these reconstructions and also most of the techniques require massive amounts of data of already created reconstructions, which poses a chicken-egg-like problem.
R2-1: Towards Finding Preferable Minima in Nonlinear Programming by Decomposition Techniques
Applications to Parameter Identification in Dynamical Systems and Bilevel Programming
Today, highly sophisticated real-world representations find their way into many fields of application. This may include the usage of dynamical systems (ordinary differential equations (ODEs)) or arbitrary nonlinear model functions, for instance. The systems of interest have in common that a numerical solution routine needs to be applied for their evaluation, for example numerical integration or root finding algorithms. Their accuracy often comes at the price of high computational efforts. In a subsequent step, a practitioner aims to optimise them and therefore, these systems are incorporated into nonlinear optimisation algorithms. Herein, they need to be run at a high frequency and certain smoothness properties are assumed. Resulting problems (among others) are instability during the numerical solution process or running into the next local minimum instead of a preferable one. This research project aims at finding advanced problem formulations that are able to overcome the mentioned problems. One of the consequences of complex model formulations is the occurrence of multiple local solutions. This can have several reasons, for instance the non-convexity of the resulting nonlinear program. Approximations, reductions, or simplifications often make it impossible to compute a global solution. A practitioner is therefore more interested in solutions that fulfil other criteria, too. This opens the concept of preferable solutions. Roughly speaking, a solution is preferable if it dominates other local solutions with respect to different criteria. The final hypothesis is that adjusted, decomposed problem formulations lead, on average, to the computation of preferable solutions when they are used within optimisation algorithms.
R2-2: Nonnegative Matrix and Tensor Factorisations
Theory and Selected Applications
Matrix factorisation methods for large scale data sets have seen increasing scientific interest and have been revived recently due to their central role for a large variety of machine learning tasks, e.g. dimension reduction, data compression, basis learning, feature extraction as well as higher level problems such as classification or clustering. It addresses the challenge of computing a low-rank approximation of a typically large data matrix by factorising it into two smaller matrices. Motivated by medical imaging applications, where the data under investigation often naturally satisfies a non-negativity constraint, we focus on the particular case of nonnegative matrix and tensor factorisation (NMF/NTF) methods. The main aim is to review the theory of NMF and NTF solution methods, the development of new algorithms adapted to the corresponding application, the comparison to state of the art methods and to find new connections to approaches different from the NMF.
R2-3: Non-linear Dynamics and Parameter Identification
Bifurcation theory relates to the problem of parameter identification of a non-linear dynamical system, where the knowledge of critical parameter regimes may be used to compensate the lack of data and direct data collection. Combining the derivation of thermodynamic parts together with physical parameters and components, one gets the ingredients for many models of physical processes. The unknown details of the physical parameters are, for instance, modelled by fitting parameterised functions of data or assuming quasi-equilibrium. This fact can be interpreted as a freedom of choice and thus, we aim to study the consequences of different choices on the dynamics, classifying the bifurcation structure. A classical system of such a type, which possesses unspecified components with structural constraints other than symmetries, are of mass-damper-spring type and arise in marine craft modelling. To study the effects of incompletely modelled components requires specification in order to be feasible. The starting point for the project concerns to analyse the general system of non-linear equations of motion of vessels, where ODE models for ship manoeuvring are investigated, dealing with local bifurcations. The stability and bifurcations of equilibrium states will be analysed. The general Hopf bifurcation is studied and, in particular, the conditions for sub- vs. supercriticality. Furthermore, the initial model will be extended to models including more degrees of freedom of a vessel (yaw and roll motion and/or rudder). The analysis will shift towards global homo- and heteroclinic bifurcations, considering certain manoeuvres as boundary value problems analytically as well as numerically. In particular, local invariant manifolds of equilibria are going to be considered. An application for the latter is ship manoeuvring, where it could also be extended to automotive test problems and bifurcations at non-convex Pareto fronts in Optimal Control, dealing with multiobjective optimisation.
R2-4: Online Optimal Parameter Identification in Dynamic Controlled Environments
This PhD project aims at creating efficient algorithms for optimal parameter identification problems that result from complex model design problems in controlled dynamical systems. Commonly, parameter identification is performed strictly a priori to system operation, but often parameter updates are required during system operation, e.g. for time-varying parameters that change unpredictably due to external influences. Control methods such as model predictive control (MPC) heavily dependent on a good model of the dynamical system, because for the computation of future control inputs the system state has to be predicted. In this project parameter identification is included in the MPC loop, such that time-varying parameters are updated online. The unknown parameters are either system parameters that can only be identified in non-static system states or a discrete, partial high-dimensional representation of a control curve or even a combination of both. This combination of fast direct control methods with online parameter identification leads to efficient algorithms for real-time optimal parameter identification in dynamic controlled scenarios. We also target theoretical results for the convergence of the developed algorithm in the linear case. Moreover, the parameter identification goal (minimising the distance between model and measurements) has to be combined with task performance criteria of the control system.
R3-2: Double Backpropagation with Applications to Robustness and Saliency Map Interpretability
Double backpropagation arises in derivative-based optimisation, whenever a neural network’s loss function contains derivatives of output nodes with respect to input nodes. This work introduces a very general, coordinate-free Hilbert space description of the involved phenomena, instead of descriptions for specific examples such as dense layers. For this, a theory of adjoints of continuous bilinear operators in Hilbert spaces was developed. This comes into play for standard layer types (dense, convolutional, locally-connected etc.), which contain linear relationships both in the parameters as well as in the inputs. Apart from the in-depth description of double backpropagation, an optimised double backpropagation scheme for Jacobian penalties in the case of ReLU networks is presented, which reduces the number of necessary computations by up to a third. Moreover, the discontinuous loss landscape of ReLU networks is characterised for several types of losses. A ’pseudo-smoothing’ effect on this loss landscape when using batch optimisation is described, which explains why the discontinuities usually do not pose a problem in applications.
R3-4: Model Selection and Evaluation in Supervised Machine Learning
The major target of research area R3 "Mathematical data analysis" is the development of numerically efficient algorithms for low dimensional feature building and corresponding classification methods that utilise high dimensional, sparse input data from applications like MALDI imaging. During a typical (supervised machine-learning) project, several classification algorithms will be developed that finally need to be fine-tuned and evaluated by real application studies. Furthermore, the different methods need to be compared with regard to their operating characteristics. This leads to the question of how to best design experimental studies for an adjustment, evaluation and comparison of classification algorithms such that costs like study time and number of subjects are minimised. This objective should be pursued under the requirement that bias, optimism and type I error inflations are largely avoided.
R3-5: Statistical Procedures and Numerical Algorithms for Analyzing MALDI Data
Matrix Assisted Laser Desorption/Ionisation Time of Flight (MALDI-TOF) is a widely used imaging mass spectrometry (IMS) technique. It is a "tool for spatially-resolved chemical analysis of diverse sample types ranging from biological and plant tissues to bio and polymer thin films" (Alexandrov 2012). The measured spectra are however not easily interpretable: For example "very few of the peaks in any whole bacteria mass spectrum can be attributed to only one protein" and "initial conditions for the ions [...] contribute to a loss of mass (or time) resolution" (Liyanage et. al. 2006). Thus it is insightful to interpret the mass spectra as corrupted measurements of a mixture of (biologically or chemically) meaningful signals. The goal of the PhD project is to develop statistical and numerical methods to extract (meaningful) features from such "mixtures" and improve the performance of classifiers. A focus will be on Non-Negative Matrix Factorisation (a type of Blind Source Separation (BSS)) based on the Wasserstein-Distance (since these procedures have desirable properties when applied to mixtures). The developed methods will be applied to hyper-spectral images obtained using MALDI-TOF imaging.
R3-6: Blind Source Separation in Polyphonic Music Recordings
Given a recording of a polyphonic piece of music, the challenge is to separate the contributions of the instruments, such that an individual audio track for each instrument is obtained. In blind separation, no prior information is given about the sounds of the instruments is given. Instead, we rely on dictionary learning. In the literature, good results have been obtained with dictionaries that represent each tone of an instrument separately, when polyphony is very limited. There have also been results based on tensor factorisation, which allow dictionary entries for a specific instrument to be applied at varying pitch. However, these require the assumption of a known equal-temperament tuning of the instruments. The use of the Wasserstein distance as a loss function is also receiving some attention. Many of the pitch-invariant methods require the use of a time-frequency representation with a logarithmic frequency axis. For this, methods based on the STFT and on wavelet transforms have been developed.
Ten PhD projects are currently funded by the DFG in the 2nd cohort of the RTG π3. In addition, six associated doctoral students work on their PhD projects. In the following, the DFG funded PhD projects within and on the interface between the three research areas Inverse problems (R1), Direct optimisation (R2) and Data analysis (R3) are shortly described.
R1-3: Efficient and reliable discretisation of primal and adjoint operators
Accurate and efficient discretisation in terms of computing time and memory usage as well as the solution of primal and adjoint problems are still scientific challenges, especially for dynamic inverse problems and parameter identification of time-dependent PDE coefficients. Efficient approaches should include adaptive methods for non-linear problems with problem-dependent error criteria and concepts for efficient storage of adjoint solutions.
To prepare for tackling problems and applications of the research network, first the stationary situation will be considered and results from linear-quadratic control problems (distributed or boundary control) should be transferred to inverse parameter identification problems, like identification of space-dependent diffusion coefficients.
R1-6: Deep Image Prior Approaches to Inverse Problems
The research topic is located around the Deep Image Prior (DIP), which is a relatively new image reconstruction method using a neural network, which does not require training prior to reconstruction. One main aim of the project is to determine analytical properties of the DIP and to embed it into the classical regularisation theory. A relevant application for DIP approaches is Computed Tomography (CT).
R1-7: Generalized Tolerance Regularization
Classical Tikhonov regularization of inverse problems is based on minimizing functionals Φ (x) = dist(F(x), yδ) + α R(x) which balance a discrepancy term dist(F(x), yδ), measuring the mismatch on the data side, and a regularization or penalty term R(x), which encodes some information of the searched for parameter. The discrepancy term typically incorporates the norm of the image space of F : X → Y as well as information on the noise model for yδ, the penalty term aims at matching the a priori distribution of the parameter, which typically is restricted to an unknown subset or submanifold of X. Recently, motivated by applications in engineering tolerance based discrepancy and penalty terms have been investigated (Type I). These tolerance measures take zero values in a certain region around the origin, which allows to incorporate expert knowledge on y and/or on x. In this project we aim at investigating Tikhonov functionals based on three terms, which separately measures data misfit, tolerance based fidelity in the data term.
R2-5: Convergence to Preferable Minima in Real-Time Nonlinear Optimization Problems
The optimisation of dynamical control systems leads to challenging parameter identification (PI) problems for the control signals. This is due to a number of reasons: (1) typically, PI problems are highly non-linear in the constraints, such that there exists many local optima, (2) control signals are infinite-dimensional parameters which have to be approximated adequately by numerical PI methods, (3) oftentimes, e.g. in autonomous driving applications, real-time constraints apply. Thus, finding global optimal solutions is virtually impossible; instead, one is interested in finding a preferable local optimum.
R2-6: Deep Learning based Lane Segmentation and Model Predictive Lane Following
To achieve autonomous driving in complex urban contexts environment perception is key. By means of computer vision the driveable area must be identified, other traffic participants detected, classified and their actions anticipated and based on this, control values for the car calculated. Due to the recent advances in both powerful hardware and algorithmic design, especially in the field of machine learning, the aim is to develope a neural network that is both state-of-the-art in segmentation accuracy and real time capable. On top of that, since segmentation masks require costly post processing to handle identification of object instances such as persons or vehicles, the network will be trained not only to perform segmentation, but also make predictions similar to a generic object detector.
R2-7: Parameter and Structure Identification for Complex Dynamical Systems
This project addresses joint identification problems for both the structure and the parameters of a model for dynamic processes or systems. Complex dynamical systems often consist of several subsystems which belong to different physical fields (multiphysical systems). The subsystems come along with different established modelling approaches, e.g. multi-body systems in mechanics, circuit models in electronics, and CFD models in fluid dynamics. However, for every submodel, the accuracy, i.e. order, dimensions, or time-scale, has to be chosen. Thus, there does not exist one single model-structure, for which parameter identification (PI) has to be performed. Instead, PI has to be combined with structure identification (SI). Moreover, real-world technical systems also possess interfaces or subsystems that can only be modelled by generic black-box models, i.e. look-up-tables in the simplest case. While PI problems can be addressed by efficient nonlinear optimisation solvers, SI problems might lead to large combinatorial problems and finding the optimal structure is a machine learning task. Within this PhD project, suitable methods for large-scale nonlinear optimisation shall be combined with machine learning techniques, such that SI and PI problems cannot only be solved subsequently, but also in a combined problem setting.
R3-1: Application of Δ-complexes in topological data analysis
The basic idea of topological data analysis is to first construct a topological complex based on some given data which can then be analysed using common tools of algebraic topology, one of the most prominent examples being the persistent homology. For this project, the first of these steps of topological data analysis is of interest, more specifically the type of complex that is usually constructed. Most of the time these complexes are simplicial complexes, for example an Alpha-complex or a Witness-complex, which are easy enough to work with. Now, the starting point of this project will be to consider a different kind of complex. A Δ-complex is constructed by gluing together a collection of simplices such that certain conditions are met. The difference to a simplicial complex is that these conditions allow for more possible complexes while also including any possible simplicial complex which makes Δ-complexes a generalization of said complexes. The simplicial homology that is needed to analyse the simplicial complexes in the topological data analysis is also defined on Δ-complexes which ensures that the second step of the analysis could be carried on as before. The question that arises out of this train of thought is what application these types of complexes could have and if they have any advantages over or disadvantages compared with the commonly used simplicial complexes.
R3-7: Post-selection Inference for Prediction Performance
In many cases, statistical hypotheses testing and interval estimation of prediction performance measures for statistical prediction models are required or at least desirable in order to overcome the overestimation of a model’s ability to predict the outcome of future observations. The aim of this research project is to develop methodology for an exact or at least asymptotically valid post-selection inference for prediction performance measures. Usually, the validation of the prediction performance of the predictor is deferred to a separate and independent validation study, which is both costly and time consuming. This admits of improvement in terms of simplicity and the validity of the study in which the predictor has been developed. Statistical methods and tools for a (asymptotically) valid post-selection inference of model prediction performance could accomplish this kind of improvement.
R3-8: Two-sample Testing for the Comparison of MALDI Spectra
Previous results indicate that there exist strong dependencies among peak data in (averaged) MALDI spectra. The challenge that we want to tackle in the project is to take into account or even exploit these dependencies in the design of multiple test procedures for the comparison of two (averaged) MALDI spectra corresponding to, e.g., two different states of a disease or two different experimental conditions. In principle, it is also possible to extend the scope to k-sample problems with k > 2 groups which may correspond, for example, to different disease stages or to different dosages of a drug.
R3-9: Discrete Morse Theory in the Topological Data Analysis
The field of topological data analysis tries to characterize data in a topological way. The challenge in this subject is that the data is often incomplete or even sparse, loud, and higher-dimensional. Even so, the data has often lower-dimensional properties which can be detected and described. The project is about these properties and tries to apply methods from topology to simplify the analyses.
R3-10: Learning Tikhonov Functionals for Inverse Problems
We consider classical inverse problems given by a linear or non-linear operator A : X → Y which maps x ∈ X to y ∈ Y. The related inverse problem asks to determine an approximation to x † from given noisy measurements yδ = Ax† + η. This can be achieved by minimizing Tikhonov functionals of type ℒ(x) = ||Ax – yδ||2 + α R(x) (1), where R: X → ℝ is called regularization term and α ∈ ℝ+ is a scalar weighting factor. Traditional approaches use handcrafted regularizers to introduce certain properties to the solution. A common choice is R(x):= ||x ||1 , which forces possible solutions of the minimization problem (1) to be sparse. For academic examples, one can achieve excellent results with these basic handcrafted regularizers. However, in many cases much more advanced methods are needed on real data. This is even clearer when we use the statistical view on inverse problems. If we are interested in the maximum a posteriori estimator, the R in (1) has to be the logarithm of the prior distribution, for example the distribution of natural computed tomography reconstructions. It is unclear how one would design such a prior by hand. A recent approach to overcome this problem is to learn a parameterized version Rϑ directly from data. We choose artificial neural networks as a general model class for this task due to their approximating power and great success in a large variety of applications. The aim of this project is to investigate the choice and influence of different network architectures on the regularizing properties of Rϑ. In addition, the number of data points needed to train such a model is of crucial interest. To keep it as low as possible, novel approaches to combine classical modelling and learning from data will be investigated. Computed tomography (CT) will be used as main application case in this project. CT is a prime example of an ill-posed inverse problems. It is physically well understood, but creating good reconstructions remains challenging in low-dose cases and for a limited number of measurement angles. The (learned) Tikhonov functional approach will be compared against a number of other reconstruction methods, including ones that also feature learning.
R3-11: Causal identification and inference in time-to-event settings
This project is concerned with developing theory and methods for studying causal relations in time-to-event data. Graphical models will provide a framework for causal structures, and in order to handle the asymmetric dependencies of dynamic data, we will focus on local independence graphs. Intervention is a central concept for causal identification and we will formalize the concept of intervention in local independence graphs, which we will use to extend graphical criteria for identification. Furthermore, we will investigate how to use local independence graphs to solve advanced causal identification issues, such as identification of separable effects in the context of mediation analyses and identification of suitable causal parameters in multistate models. Moreover, we will develop method for applications to medical and epidemiological questions, such as treatment of chronic diseases or competing risks situations in cancer research.