— Lecture 9 —

Symbolic regression and equation learning for identifying sparse non‑linear models

Lecturer: Georg Martius (MPI-IS)
Time: (Zurich time)
Slides: Click here to download!
Recording: Click here to view! (only for ETH members)


In classical machine learning, regression is treated as a black-box process of identifying a suitable function, without attempting to gain insight into the mechanism connecting inputs and outputs. In the natural sciences, however, finding an interpretable function for a phenomenon is the prime goal, because it allows us to understand and generalize results. Following the theme of the lecture by Nathan Kutz, we will consider the search for parsimonious models. In this lecture, we will consider non-linear models represented by concise analytical expressions. The problem of finding such expressions is generally called symbolic regression. Traditionally, this problem is solved with evolutionary search. In recent years, machine learning methods have been proposed for this task. One of the first works in this direction was the “equation learner” (EQL), which I will cover in some detail. Very recently, a different approach was presented that uses pretraining and language models to predict likely equations (NeSymReS). I will also discuss this method and compare the two approaches. An interesting aspect of the search for the most compact description of data is its connection to causal models and the identification of the true underlying relationships. In general, prior knowledge is required for this identification to succeed, but when it does, the resulting models can generalize very well.

Recommended reading: