OT-SC-WS-04 | Evaluating machine learning and artificial intelligence algorithms

Prof. Dr. Werner Brannath, Dr. Max Westphal

Artificial intelligence (AI) and machine learning (ML) methods have been applied successfully in many areas, with prominent examples such as the AlphaGo algorithm. However, there are also examples of ML/AI-algorithms that perform quite poorly (e.g. IBM Watson). When building an ML/AI-algorithm, a large number of prediction or classification models are explored with the goal of selecting the seemingly best one. This can lead to a severe overestimation of the prediction or classification ability (also called "performance"). The algorithm also depends on the data used to build it. Weak performance of an ML/AI-algorithm is often difficult to identify but can cause severe harm when the algorithm is applied.
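The overestimation caused by picking the seemingly best of many models can be illustrated with a small simulation. The following is a minimal sketch in Python (the course itself uses R), not course material: the function name selection_bias_demo and all parameter values are hypothetical. Every candidate model is a pure guesser with a true accuracy of 0.5, yet the best observed accuracy on a shared test set looks clearly better than that.

```python
import random


def selection_bias_demo(n_models=50, n_test=100, true_acc=0.5, n_rep=200, seed=1):
    """Mean observed accuracy of the 'best' among n_models equally useless models.

    All values are illustrative assumptions, not taken from the course.
    """
    rng = random.Random(seed)
    best_scores = []
    for _ in range(n_rep):
        # Each candidate is correct with probability true_acc on every test case.
        scores = [
            sum(rng.random() < true_acc for _ in range(n_test)) / n_test
            for _ in range(n_models)
        ]
        # Model selection: keep only the best observed score.
        best_scores.append(max(scores))
    return sum(best_scores) / n_rep


mean_best = selection_bias_demo()
print(f"true accuracy: 0.50, mean accuracy of the selected model: {mean_best:.3f}")
```

The gap between the printed value and 0.5 is produced by the selection step alone, which is why the selected model needs to be assessed on data that played no role in choosing it.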

One therefore needs to evaluate ML/AI-algorithms with specific methods and techniques (e.g. based on cross-validation or bootstrapping) to avoid overestimation. One also needs to carefully distinguish between the performance of the model building process itself (unconditional performance) and the prediction/classification ability of the finally selected algorithm (conditional performance). In this course we will illustrate the difficulties and challenges in judging ML/AI performance and introduce a number of techniques for a reliable estimation of unconditional and conditional performance. The course content is essential for those who aim to build an ML/AI-algorithm and helpful to those who want to apply an existing one.
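The difference between the two targets can be sketched with a toy example. The code below is a minimal illustration in Python under stated assumptions, not the course's method: the data generator make_data, the threshold "model" in fit, and all sample sizes are hypothetical. The k-fold cross-validation estimate refers to the model building process (models refitted on parts of the data), whereas the hold-out estimate refers to the one final model that was actually fitted.

```python
import random


def make_data(n, rng):
    """Hypothetical toy data: class 1 has mean 1, class 0 has mean 0 (unit variance)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        data.append((rng.gauss(float(y), 1.0), y))
    return data


def fit(train):
    """'Model building': a threshold halfway between the two class means."""
    x0 = [x for x, y in train if y == 0]
    x1 = [x for x, y in train if y == 1]
    return (sum(x0) / len(x0) + sum(x1) / len(x1)) / 2


def accuracy(threshold, data):
    """Share of cases where 'x above threshold' agrees with the label."""
    return sum((x > threshold) == (y == 1) for x, y in data) / len(data)


def cross_validate(data, k=5):
    """k-fold CV: each fold is predicted by a model that never saw it."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(fit(train), folds[i]))
    return sum(scores) / k


rng = random.Random(0)
learn, test = make_data(200, rng), make_data(2000, rng)

cv_estimate = cross_validate(learn)            # targets the building process
final_model = fit(learn)                       # the model that would be deployed
holdout_estimate = accuracy(final_model, test)  # targets this specific model

print(f"cross-validation estimate: {cv_estimate:.3f}")
print(f"hold-out estimate of the final model: {holdout_estimate:.3f}")
```

In this simple setting the two numbers are close. They can differ more when the learning data are small or when model selection is part of the building process.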

Contents

Motivation: why do we need (quantitative) method evaluation in ML/AI?
Definition of performance measures of ML/AI solutions, primarily for supervised methods (classification, regression), but also unsupervised methods (e.g. clustering)
Statistical inference for selected performance measures (estimation, statistical testing, confidence intervals)
Important terminology and concepts (in-sample vs. out-of-sample performance, conditional vs. unconditional performance)
Practical aspects (experimental design, study planning)
Application of evaluation methods to case-studies in R
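
As an illustration of the statistical inference item above, a confidence interval for a classification accuracy can be computed from the number of correct test predictions. This is a minimal sketch in Python (the course case studies use R). The Wilson score interval is only one common choice and is not necessarily the method taught in the course, and the counts in the example are made up.

```python
import math


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    p = correct / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical result: 85 of 100 independent test cases classified correctly.
lower, upper = wilson_interval(85, 100)
print(f"accuracy 0.85, approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

The interval is only meaningful if the test cases were not used for training or model selection.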

Outcomes

Understanding of the challenges and difficulties with the evaluation of ML/AI-algorithms
Understanding of, and ability to apply, basic and more advanced methods for ML/AI-algorithm evaluation

Prior knowledge

Basic statistical knowledge (e.g. Statistical Basics, Quantitative analysis)
Basic machine learning skills (e.g. Machine learning algorithms, Deep learning/neural networks)

Requirements

Own PC or laptop
For the online format, a second screen might be beneficial

Further Reading

Japkowicz, Nathalie, and Mohak Shah. Evaluating learning algorithms: a classification perspective. Cambridge University Press, 2011.
Kuhn, Max, and Kjell Johnson. Applied predictive modeling. Vol. 26. New York: Springer, 2013.
Raschka, Sebastian. "Model evaluation, model selection, and algorithm selection in machine learning." arXiv preprint arXiv:1811.12808, 2018.

When?

01.11.2021, 09:00 - 17:00

03.11.2021, 09:00 - 17:00

05.11.2021, 09:00 - 16:30

Where?

Online via video conference (VC)

Language?

English

Prof. Dr. Werner Brannath
Professor of Applied Statistics and Biometry at the Faculty of Mathematics and Computer Science at the University of Bremen
Director of the Group Biometry at the Competence Center for Clinical Trials Bremen (KKSB)

Dr. Max Westphal
Post-doctoral researcher (Data Science & Biostatistics) at Fraunhofer MEVIS

Updated by: Tanja Hörner
