Miguel A. Mendez is Assistant Professor at the von Karman Institute for Fluid Dynam-
ics, Belgium. He has extensively used data-driven methods for post-processing numer-
ical and experimental data in fluid dynamics. He developed a novel multi-resolution
extension of POD which has been extensively used in various flow configurations of
industrial interest. His current interests include data-driven modeling and reinforce-
ment learning.
Data-Driven Fluid Mechanics
Combining First Principles and Machine Learning
Based on a von Karman Institute Lecture Series
Edited by
MIGUEL A. MENDEZ
von Karman Institute for Fluid Dynamics, Belgium
ANDREA IANIRO
Universidad Carlos III de Madrid
BERND R. NOACK
Harbin Institute of Technology, China
STEVEN L. BRUNTON
University of Washington
Shaftesbury Road, Cambridge CB2 8EA, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
103 Penang Road, #05–06/07, Visioncrest Commercial, Singapore 238467
www.cambridge.org
Information on this title: www.cambridge.org/9781108842143
DOI: 10.1017/9781108896214
© Cambridge University Press 2023
This publication is in copyright. Subject to statutory exception and to the provisions
of relevant collective licensing agreements, no reproduction of any part may take
place without the written permission of Cambridge University Press & Assessment.
First published 2023
Printed in the United Kingdom by TJ Books Limited, Padstow Cornwall
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-84214-3 Hardback
Cambridge University Press & Assessment has no responsibility for the persistence
or accuracy of URLs for external or third-party internet websites referred to in this
publication and does not guarantee that any content on such websites is, or will
remain, accurate or appropriate.
Contributors
Mohammad Abu-Zurayk
Institute of Aerodynamics and Flow Technology, German Aerospace Center (DLR),
Germany
Gianmarco Aversano
Université Libre de Bruxelles, Belgium
Philipp Bekemeyer
Institute of Aerodynamics and Flow Technology, German Aerospace Center (DLR),
Germany
Steven L. Brunton
University of Washington, USA
Axel Coussement
Université Libre de Bruxelles, Belgium
Giuseppe D’Alessio
Université Libre de Bruxelles, Belgium
Politecnico di Milano, Italy
Scott Dawson
Illinois Institute of Technology, USA
Stefano Discetti
Universidad Carlos III de Madrid, Spain
Arthur Ehlert
Technische Universität Berlin, Germany
Industrial Analytics, Berlin, Germany
Daniel Fernex
École Polytechnique Fédérale de Lausanne, Switzerland
Thomas Franz
Institute of Aerodynamics and Flow Technology, German Aerospace Center (DLR),
Germany
Stefan Görtz
Institute of Aerodynamics and Flow Technology, German Aerospace Center (DLR),
Germany
Andrea Ianiro
Universidad Carlos III de Madrid, Spain
Javier Jiménez
Universidad Politécnica de Madrid, Spain
Alex Kuhnle
University of Cambridge, United Kingdom
François Lusseyran
CNRS, LISN, Université Paris Saclay, France
Mohammad R. Malik
Université Libre de Bruxelles, Belgium
Miguel A. Mendez
von Karman Institute for Fluid Dynamics, Belgium
Marek Morzyński
Poznań University of Technology, Poland
Christian N. Nayeri
Technische Universität Berlin, Germany
Bernd R. Noack
Harbin Institute of Technology, China
Technische Universität Berlin, Germany
Alessandro Parente
Université Libre de Bruxelles, Belgium
Jean Rabault
Norwegian Meteorological Institute, Norway
University of Oslo, Norway
Matteo Ripepi
Institute of Aerodynamics and Flow Technology, German Aerospace Center (DLR),
Germany
Peter J. Schmid
Imperial College London, United Kingdom
Richard Semaan
Technische Universität Braunschweig, Germany
James C. Sutherland
University of Utah, USA
Kamila Zdybal
Université Libre de Bruxelles, Belgium
Preface
This book is for scientists and engineers interested in data-driven and machine learning
methods for fluid mechanics. Big data and machine learning are driving profound
technological progress across nearly every industry, and they are rapidly shaping
research into fluid mechanics. This revolution is driven by the ever-increasing amount
of high-quality data, provided by rapidly improving experimental and numerical
capabilities. Machine learning extracts knowledge from data without the need for first
principles and introduces a new paradigm: use data to discover, rather than validate,
new hypotheses and models. This revolution brings challenges and opportunities.
Data-driven methods are an essential part of the methodological portfolio of fluid
dynamicists, motivating students and practitioners to gather practical knowledge
from a diverse range of disciplines. These fields include computer science, statistics,
optimization, signal processing, pattern recognition, nonlinear dynamics, and control.
Fluid mechanics is historically a big data field and offers a fertile ground to develop
and apply data-driven methods, while also providing valuable shortcuts, constraints,
and interpretations based on its powerful connections to first principles physics.
Thus, hybrid approaches that leverage both data-driven methods and first principles
approaches are the focus of active and exciting research. This book presents an
overview and a pedagogical treatment of some of the data-driven and machine
learning tools that are leading research advancements in model-order reduction,
system identification, flow control, and data-driven turbulence closures.
This book originated from a one-week course at the von Karman Institute for
Fluid Dynamics, Belgium (VKI; www.vki.ac.be/). The course was hosted by the
Université Libre de Bruxelles, Belgium, from February 24 to 28, 2020, in the classic
VKI lecture series format. These are one-week courses on specialized topics, selected
by the VKI faculty and typically organized 8–12 times per year. These courses have
gained worldwide recognition and are among the most influential and distinguished
European teaching forums, where pioneers in fluid mechanics have been training
young talents for many decades.
The lecture series was co-organized by Miguel A. Mendez from the von Karman
Institute, Alessandro Parente from the Université Libre de Bruxelles, Andrea Ianiro
from the Universidad Carlos III de Madrid (Spain), Bernd R. Noack from the Harbin
Institute of Technology, Shenzhen (China), and TU Berlin (Germany), and Steven L.
Brunton from the University of Washington (USA).
Online Material
The Audience
The book is intended for anyone interested in the use of data-driven methods for fluid
mechanics. We believe that the book provides a unique balance between introductory
material, practical hands-on tutorials, and state-of-the-art research. While keeping
the approach pedagogical, the reader is exposed to topics at the frontiers of fluid
mechanics research. Therefore, the book could be used to complement or support
classes on data-driven science, applied mathematics, scientific computing, and fluid
mechanics, as well as to serve as a reference for engineers and scientists working
in these fields. Basic knowledge of data processing, numerical methods, and fluid
mechanics is assumed.
Like the course from which it originates, this book results from the contribution of
many authors. The use of machine learning methods in fluid mechanics is in its early
days, and a large team of lecturers allowed the course attendees to learn from the
expertise and perspectives of leading scientists in different fields.
Here we provide a road map of the book to guide the reader through its structure
and link all the chapters into a coherent narrative. The book chapters can be clustered
into six interconnected parts, slightly adapted from the VKI lecture series.
Part I: Motivation. This part includes the first three chapters, which introduce the
motivation for data-driven techniques from three perspectives.
Chapter 1, by B. R. Noack and coauthors, opens with a tour de force on machine
learning tools for dimensionality reduction and flow control. These techniques are
introduced to analyze, model, and control the well-known cylinder wake problem,
building confidence and intuition about the challenges and opportunities for machine
learning in fluid mechanics. Chapter 2, by J. Jiménez, takes a step back and gives
both a historical and a data science perspective. Most of the dimensionality reduction
techniques presented in this book have been developed to identify patterns in the data,
known as coherent structures in turbulent flows. But what are coherent structures?
This question is addressed by discussing the relationship between data analysis and
conceptual modeling and the extent to which artificial intelligence can contribute to
these two aspects of the scientific method. Chapter 3, by S. L. Brunton, gives an
overview of how machine learning tools are entering fluid mechanics. This chapter
provides a short introduction to machine learning, its categories (e.g., supervised ver-
sus unsupervised learning), its subfields (regression and classification, dimensionality
reduction and clustering), and the problems in fluid mechanics that can be addressed
by these methods (e.g., feature extraction, turbulence modeling, and flow control).
This chapter contains a broad literature review, highlights the key challenges of the
field, and gives perspectives for the future.
Part II: Methods from Signal Processing. This part brings the reader back
to classic tools from signal processing, usually covered, with widely varying
depth, in the curricula of experimental fluid dynamicists. This part of
the book is motivated by two reasons. First, tools from signal processing are, and
will likely remain, the first “off-the-shelf” solutions for many practical problems.
Examples include filtering, time-frequency analysis, and data compression using filter
banks or wavelets, or the use of linear system identification and time series analysis via
autoregressive methods. The second reason – and this is a central theme of the book –
is that much can be gained by combining machine learning tools with methods from
classic signal processing, as later discussed in Chapter 8. Therefore, Chapter 4, by M.
A. Mendez, reviews the theory of linear time-invariant (LTI) systems along with their
properties and the fundamental transforms used in their analysis: the Laplace, Fourier,
and Z transforms. This chapter draws several parallels with more advanced techniques.
For example, the use of the Laplace transform to reduce ordinary differential equations
(ODEs) to algebraic equations parallels the use of Galerkin methods to reduce the
Navier–Stokes equation to a system of ODEs. Similarly, there is a link between
the classical Z-transform and the modern dynamic mode decomposition (DMD).
Chapter 5, by S. Discetti, complements Chapter 4 by focusing on time–frequency
analysis. The fundamental Gabor and continuous/discrete wavelet transforms are
introduced along with the related Heisenberg uncertainty principle and multiresolution
analysis. The methods are illustrated on a time series obtained from hot-wire
anemometry in a turbulent boundary layer and on flow fields obtained via numerical
simulations.
Part III: Data-Driven Decompositions. This part of the book consists of four
chapters dedicated to a cornerstone (and rapidly growing subfield) of fluid mechanics:
modal analysis. This part is mostly concerned with methods for linear dimensionality
reduction, originally introduced to identify, and “objectively” define, coherent struc-
tures in turbulent flows.
dynamical systems from data, further discussed in Chapter 12. Chapter 12, also by
S. L. Brunton, builds on Chapter 11 and Part III of the book to introduce several
advanced topics in model reduction and system identification. This chapter opens with
a review of balanced model reduction goals for linear systems and builds the required
mathematical background and the fundamentals of balanced POD (BPOD). Linear and
nonlinear identification tools are introduced. Among the linear identification tools,
this chapter presents the eigensystem realization algorithm (ERA) and the observer
Kalman filter identification (OKID). Among the nonlinear identification tools, the
chapter presents the sparse identification of nonlinear dynamics (SINDy) algorithm,
which leverages the LASSO regression from statistics to identify nonlinear systems
from data.
This part closes with Chapter 13, by P. Schmid, providing a modern account
of stability analysis of fluid flows. This chapter begins with a brief review of the
classic definition of stability (e.g., Lyapunov, asymptotic, and exponential stability)
and moves toward a modern formulation of stability as an optimization problem:
unstable modes are those along which the growth of disturbances is maximized.
This chapter introduces a powerful, adjoint-based, iterative method to solve such an
optimization and shows how to recover common stability and receptivity results from
the general framework. Finally, an illustrative application to the problem of tonal noise
is given.
Part V: Applications. This part of the book is dedicated to the application of data-
driven and machine learning methods to fluid mechanics.
Chapter 14, by B. R. Noack and coworkers, is dedicated to reduced-order modeling.
This chapter gives an overview of the classic POD Galerkin approach, reviewing the
main challenges in closure and stabilization as well as classic applications. It then
moves to emerging cluster-based Markov models and their possible generalization. A
detailed tutorial is also provided to offer the reader hands-on experience with reduced-
order modeling.
Chapter 15, by A. Parente and coworkers, focuses on the use of data-driven
models for studying reacting flows. The numerical simulation of these flows is
extremely challenging because of the vast range of scales involved. This chapter
gives a broad overview of how machine learning techniques can help reduce the
computational burden. The key challenges of high dimensionality are discussed along
with an overview of dimensionality reduction methods, ranging from classic principal
component analysis (PCA) to local PCA, nonnegative matrix factorization (NMF), and
artificial neural network (ANN) autoencoders. The application of these tools to reduce
dimensionality in the modeling of transport and chemical reactions is illustrated in a
challenging test case.
Chapter 16, by S. Görtz and coworkers, is dedicated to the application of reduced-
order modeling for multidisciplinary analysis and design optimization in aerodynam-
ics. The design of an aircraft involves thousands of extremely expensive numerical
simulations. This chapter shows how linear and nonlinear dimensionality reduction
tools can help speed up the process. POD, cluster POD, and Isomaps, combined with
A Note on the Notation
The reader will quickly realize that different chapters have (slightly) different notation.
In particular, the same symbol is sometimes used for different purposes, and different
symbols are sometimes used for the same quantities. This choice is deliberate and
motivated by the wide range of intersected disciplines, each of which has well-
established notations. For example, the symbol u usually denotes the actuation in
control theory and the velocity field in fluid mechanics. In reinforcement learning,
the actuation is denoted by $a_t$ and called the “action,” while the sensor measurement
is denoted by $s_t$ and called the “state” (it is usually denoted by y in control theory).
Resolving these ambiguities would make it difficult for readers to link the material in
this book with the literature of each field.
Moreover, each chapter represents the starting point toward more advanced and
specialized literature, in which a standard notation has not yet been settled. Keeping
the notation as close as possible to the cited literature helps the reader make essential
connections. We hope that the reader will approach each chapter with the required
flexibility, and we welcome comments, corrections, and suggestions to benefit students
for the next reprint.
We give a tour de force through select powerful methods of machine learning for
data analysis, dynamic modeling, model-based control, and model-free control. Focus
is placed on a few Swiss army knife methods that have proven capable of solving
a large variety of flow problems. Examples are proximity maps, manifold learning,
proper orthogonal decomposition, clustering, dynamic modeling, and control theory
methods as contrasted with machine learning control (MLC). In Chapters 14 and 17
of this book, the mentioned machine learning approaches are detailed for reduced-
order modeling and for turbulence control. All methods are applied to a classical,
innocent looking benchmark: the oscillatory two-dimensional incompressible wake
behind a circular cylinder at Re = 100 without and with actuation. This example has
the advantage of being visually accessible to interpretation and already foreshadows
key challenges and opportunities with machine learning.
1.1 Introduction
Analysis may start with the search for a few features helping to categorize or explain
fluid behavior. In the case of the cylinder wake, the amplitude and phase are two
such features, helping to parameterize drag, lift, and even the flow with good accuracy.
Section 1.3 presents two tools for this purpose. First, the proximity map charts
all snapshot data in a two-dimensional plane, with classical multidimensional scaling
(CMDS) as the prominent approach. Second, an automated manifold extraction, local
linear embedding (LLE), is presented. Both methods allow us to extract the amplitude
and the phase of vortex shedding. CMDS can be applied to arbitrary, even turbulent,
data. LLE comes with an estimate of the embedding dimension, if the dynamics is
simple enough.
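To make the proximity-map idea concrete, the following is a minimal numerical sketch of CMDS, assuming the snapshots are flattened into the rows of a matrix; the names (U, cmds, gamma) and the random test data are illustrative, not from the chapter.

```python
import numpy as np

def cmds(U, n_components=2):
    """Classical multidimensional scaling of snapshot data.

    U: (M, n) array, one flattened velocity snapshot per row.
    Returns the (M, n_components) proximity-map coordinates.
    """
    # Matrix of squared Euclidean distances between snapshots.
    sq = np.sum(U**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (U @ U.T)
    # Double centering: B = -0.5 J D2 J with J = I - (1/M) 11^T.
    M = U.shape[0]
    J = np.eye(M) - np.ones((M, M)) / M
    B = -0.5 * J @ D2 @ J
    # The leading eigenpairs of B give the embedding coordinates.
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Illustrative usage on synthetic data standing in for wake snapshots.
gamma = cmds(np.random.default_rng(0).normal(size=(100, 2000)))
```

For an oscillatory wake, the two coordinates of such a map resolve essentially the amplitude and phase of the shedding, as described above.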
The analysis may continue with the search for a low-dimensional representation of
the flow data – as described in Section 1.4. Two different approaches are presented.
First, the flow data is represented by a data-driven Galerkin expansion with proper
orthogonal decomposition (POD). POD minimizes the averaged error of the expansion
residual. Second, the flow data may also be represented by a small number of
representative states, called centroids. K-means++ clustering achieves this goal by
minimizing the averaged representation error between the snapshots and their closest
centroids.
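A hedged sketch of both representations follows, again with snapshots as rows of an illustrative matrix U; the POD modes come from a singular value decomposition of the mean-subtracted data, and the centroids from scikit-learn's k-means++ implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
U = rng.normal(size=(100, 2000))       # M snapshots as rows (illustrative)

# POD: the right singular vectors of the fluctuation matrix are the modes,
# sorted by energy content; the expansion minimizes the averaged residual.
u0 = U.mean(axis=0)
A, s, modes = np.linalg.svd(U - u0, full_matrices=False)
a = A * s                              # mode amplitudes of each snapshot

# k-means++: each snapshot is represented by its closest centroid,
# minimizing the averaged representation error.
km = KMeans(n_clusters=10, init="k-means++", n_init=10, random_state=0)
labels = km.fit_predict(U)
centroids = km.cluster_centers_
```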
The dynamics may be understood by reduced-order models (ROM) building on
such low-dimensional representations. The spectrum of ROM has a bewildering
richness with a myriad of enabling auxiliary methods. Hence, an overview is
postponed until Chapter 14. We focus on simple dynamical models of the cylinder
wake, illustrating the analytical insights that may be gained (see Section 1.5).
The stabilization of the wake is discussed in Section 1.6. This discussion starts with
a classical approach employing ROM for the derivation of the control law. A powerful
model-free alternative is MLC, which learns the control laws in hundreds or thousands
of test runs.
Finally, Section 1.7 summarizes some good practices of analysis, modeling, and
control. The chapter cannot be an exhaustive state-of-the-art compendium of machine
learning approaches. Instead, the presented methods can be seen as the Swiss army
knife of machine learning. They are simple yet powerful and can already be applied
in experimental projects with no or limited access to first-principles equations.
On the cylinder, the no-slip condition u = 0 is enforced. At the front (x = −10) and
lateral sides of the domain (y = ±10), a uniform oncoming flow u∞ = (1, 0) is assumed.
A vanishing stress condition is employed at the outflow boundary (x = 40).
Simulations are performed with a finite-element method on an unstructured grid
with implicit time integration. This solver is third-order accurate in time and second-
order accurate in space. Details about the Navier–Stokes and stability solvers are
described in Morzyński et al. (1999) and Noack et al. (2003). The triangular mesh
consists of 59 112 elements.
The employed initial conditions are based on the unstable steady solution $u_s$ and
a small disturbance with the most unstable eigenmode $f_1$. The steady solution is
computed with a Newton gradient solver. The eigenmode computation is described
in our earlier work (Noack et al. 2003). The disturbance is the real part of the product
of the eigenmode and unit phase factor $e^{j\phi}$. Here, “j” denotes the imaginary unit and $\phi$
the phase. The amplitude $\varepsilon$ is chosen to create a perturbation with a fluctuation energy
of $10^{-4}$. The resulting initial condition reads

$$u(x, t = 0) = u_s(x) + \varepsilon\, \Re\!\left\{ f_1(x)\, e^{j\phi} \right\}. \quad (1.2)$$
to t = 200, capturing the complete transient and post-transient state. The time step is
$\Delta t = 0.1$, corresponding to roughly one fiftieth of the period.
This domain is about twice as long as the vortex bubble of the steady solution. The
streamwise extent is large enough to resolve over one wavelength of the initial vortex
shedding as characterized by the stability eigenmode. A larger domain is not desirable,
because a small increase in wavenumber during the transient will give rise to large
phase differences in the outflow region, complicating the comparison between flow
states. The domain is consistent with earlier investigations by the authors (Gerhard
et al. 2003, Noack et al. 2003) and similar to the domains of other studies (Deane
et al. 1991).
The analysis is based on the inner product of the Hilbert space of square-integrable
functions over the observation domain Ω. This inner product between two velocity
fields v and w is defined by
$$(v, w)_\Omega = \int_\Omega v \cdot w \,\mathrm{d}x, \quad (1.4)$$

where “·” denotes the Euclidean inner product. The corresponding norm of the
velocity field v reads

$$\| v \|_\Omega = \sqrt{(v, v)_\Omega}. \quad (1.5)$$
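On discrete data, the integral in (1.4) becomes a quadrature. A minimal sketch, assuming a uniform Cartesian grid and velocity fields stored as (2, ny, nx) arrays (all names are illustrative):

```python
import numpy as np

def inner_product(v, w, dx, dy):
    """Discrete version of (v, w)_Omega on a uniform grid.

    v, w: velocity fields of shape (2, ny, nx); dx, dy: grid spacings.
    """
    return np.sum(v * w) * dx * dy   # pointwise dot product, area-weighted

def norm(v, dx, dy):
    """Discrete version of ||v||_Omega from (1.5)."""
    return np.sqrt(inner_product(v, v, dx, dy))
```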
The flow $u$ is decomposed into a slowly varying base flow $u^D$ and an oscillatory
fluctuation $u'$,

$$u = u^D + u'. \quad (1.6)$$

The short-term averaged flow is approximated as the projection of the flow onto the one-
dimensional affine space containing the steady solution $u_s$ and the post-transient mean
flow $u_0$. The superscript “D” comes from the term distorted mean flow of mean-field
theory (Stuart 1958). In other words, $u^D$ moves along the line connecting the steady
solution and the post-transient mean flow.
Figure 1.1 Evolution of the turbulent kinetic energy K with time t associated with an initial
condition for φ = 22.5◦ . The values are normalized with the maximum value Kmax . Red points
indicate normalized fluctuation levels of 0%, 10%, 50%, and 100%. For details, see Ehlert
et al. (2020).
Figure 1.2 Vorticity snapshots at t = 40, 60.6, 71.5, and 158.2, corresponding to 0%, 10%,
50%, and 100% fluctuation level for the simulation displayed in Figure 1.1. The flow is
visualized by iso-contours of vorticity with positive (negative) values marked by solid
(dashed) lines and red (blue) background. The iso-contour levels and color scales are the
same for all snapshots. For details, see Ehlert et al. (2020).
to a shortening of the vortex bubble and an upward motion of the fluctuation energy.
The state moves outward and upward on a paraboloid until it converges to a
limit cycle. The center of this limit cycle is characterized by the mean flow while the
fluctuations are well approximated by the first two POD modes. In Chapters 14 and
17, this dynamics will be distilled from the data, dynamically modeled, and reversed
by stabilizing control.
In this section, feature extraction is discussed. For the sake of concreteness, features
are considered for an ensemble of M velocity snapshots $u^m(x) = u(x, t_m)$,
$m = 1, \ldots, M$. For an oscillatory flow, amplitude and frequencies are important
features that can completely or, in the case of slow drifts, partially characterize the state.
For a turbulent flow, features are far more challenging to design. In the case of skin-
friction reduction in a turbulent boundary layer, features might be the position and
amplitude of sweeps and ejections. In the following, two feature extraction methods
are presented. First (Section 1.3.1), proximity maps via classical multidimensional
scaling (CMDS) can be employed for any data. Second (Section 1.3.2), manifold
extraction with local linear embedding (LLE) is particularly powerful for low-
dimensional dynamics.

Figure 1.3 Principal sketch of the wake dynamics. The left side displays the mean flow (top),
shift-mode (middle), and steady solution (bottom). The right side illustrates interpolated vortex
streets on the mean-field paraboloid (middle column). The short-term averaged flows are
also depicted as streamline plots. Adapted from Morzyński et al. (2007).
Figure 1.4 LLE of 16 cylinder wake transients. The figure displays the first two embedding
coordinates $\gamma = [\gamma_1, \gamma_2]^T$ resulting from K = 15 nearest neighbors. For details, see Ehlert et al.
(2020).
proximity maps have been proposed. In the case of control, the error may also include the
cost function, to bring similarly performing states closer together (Kaiser et al. 2017).
Let

$$u^m \approx \sum_{k=1}^{K} w_{mk}\, u^{i_k^m}$$

be the best approximation of the mth snapshot by its neighbors with optimized
nonnegative weights $w_{mk}$ adding up to unity. These constraints on the weights enforce
a local interpolation. Then, also the feature vector can be approximated by the same
expansion, $\gamma^m \approx \sum_{k=1}^{K} w_{mk}\, \gamma^{i_k^m}$. Here, K is a design parameter. It must be larger than
the dimension of the manifold yet sufficiently small for the assumed locally linear
behavior of the manifold. N is increased until convergence of the error is reached.
Then, N denotes the dimension of the manifold. For the details, we refer to the original
literature (Roweis & Lawrence 2000).
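As a hedged illustration, the reconstruction-weight procedure above is implemented, for instance, in scikit-learn; the snapshot matrix U below is a synthetic stand-in, with K = 15 neighbors as in Figure 1.4.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

# Synthetic stand-in for the wake snapshot ensemble (M rows).
U = np.random.default_rng(2).normal(size=(500, 2000))

# K = 15 nearest neighbors, two embedding coordinates (gamma_1, gamma_2).
lle = LocallyLinearEmbedding(n_neighbors=15, n_components=2)
gamma = lle.fit_transform(U)
# For the transient wake data, gamma[:, 0] and gamma[:, 1] would resolve
# the amplitude and phase of vortex shedding, as in Figure 1.4.
```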
Figure 1.4 displays the LLE feature coordinates of the wake snapshot data.
The origin corresponds to the steady solution $u_s$, while the outer circle represents
the limit cycle. An autoencoder comprises an encoder $G$,

$$u^m \mapsto a^m := G(u^m) \in \mathbb{R}^N, \quad m = 1, \ldots, M, \quad (1.10)$$

for the compression of the state, and a decoder $H$,

$$a^m \mapsto \hat{u}^m := H(a^m), \quad m = 1, \ldots, M, \quad (1.11)$$

for the reconstruction of the state. Ideally, the autoencoder identifies the best possible
pair of encoder G and decoder H that minimizes the in-sample error of the estima-
tor/decoder
$$E_{\mathrm{in}} := \frac{1}{M} \sum_{m=1}^{M} \left\| \hat{u}^m - u^m \right\|_\Omega^2. \quad (1.12)$$
Let $u_0$ be the mean flow, $u_i$ the POD modes, and $a_i$ the corresponding mode
coefficients. Then the encoder G of a velocity field u to the mode amplitudes
$a = [a_1, \ldots, a_N]^T$ is defined by

$$a_i := (u - u_0, u_i)_\Omega, \quad i = 1, \ldots, N, \quad (1.13)$$

while the decoder H reads

$$\hat{u}(x) = u_0(x) + \sum_{i=1}^{N} a_i\, u_i(x). \quad (1.14)$$
POD modes are an orthonormal set and sorted by energy content. The optimality
condition (e.g., Holmes et al. 2012) implies a minimal in-sample representation error
from (1.12). We cannot find another Nth order Galerkin expansion (more precisely, a
different N-dimensional subspace) yielding a smaller error.
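A minimal sketch of the POD encoder (1.13), decoder (1.14), and in-sample error (1.12), assuming orthonormal modes stacked as rows of a matrix and the area-weighted inner product replaced by a plain Euclidean one for brevity (all names are illustrative):

```python
import numpy as np

def encode(u, u0, modes):
    """a_i = (u - u0, u_i),  cf. (1.13); modes has shape (N, n)."""
    return modes @ (u - u0)

def decode(a, u0, modes):
    """u_hat = u0 + sum_i a_i u_i,  cf. (1.14)."""
    return u0 + modes.T @ a

def in_sample_error(U, u0, modes):
    """E_in = (1/M) sum_m ||u_hat^m - u^m||^2,  cf. (1.12); U has M rows."""
    A = (U - u0) @ modes.T          # encode all snapshots at once
    U_hat = u0 + A @ modes          # decode
    return np.mean(np.sum((U_hat - U)**2, axis=1))
```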
For the transient wake data, the most energetic POD modes $u_i$ and the amplitude
evolutions $a_i$ are displayed in Figs. 3 and 4 of Noack et al. (2016). The first two modes
correspond to von Kármán vortex shedding. The third mode resolves the change of
the mean field. The following modes mix different frequencies and wavelengths.
Figure 1.5 Cluster centroids localized in the LLE-based feature space. One centroid represents
the steady-state solution; two resolve the opposite transient phases; and the remaining eight
centroids are close to the limit cycle. For details, see Ehlert et al. (2020).
Figure 1.6 Out-of-sample error Eout for a new simulation trajectory at Re = 100. The solid
line corresponds to LLE representations. The red dash-dotted curve and blue dashed curve
refer to approximations with 10 centroids and 10 POD modes, respectively. For details, see
Ehlert et al. (2020).
between t = 60 and t = 80. LLE significantly outperforms both POD and clustering
by up to three orders of magnitude, highlighting the two-dimensional manifold of
the Navier–Stokes dynamics and a niche application of LLE. As expected, clustering
performs worst, lacking any intrinsic interpolation. The low error of the LLE-based
autoencoder demonstrates that the dynamics is effectively two-dimensional. Yet, about
50 POD modes are necessary for a similar resolution, as corroborated by Loiseau et al.
(2018) for a similar manifold approximation. Evidently, POD-based representations
are not efficient for slowly changing oscillatory coherent structures.
Data-driven Galerkin expansions have been optimized for a myriad of purposes.
Dynamic mode decomposition (DMD) (Rowley et al. 2009, Schmid 2010) can extract
stability modes from initial transients and Fourier modes from converged post-
transient data. However, the performance for transient wakes is disappointing, while
a recursive DMD can keep some advantages of POD and DMD (Noack et al. 2016).
Flexible state-dependent modes may significantly improve the accuracy of a low-order
representation (Siegel et al. 2008, Tadmor et al. 2011, Babaee & Sapsis 2016).
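For reference, a hedged sketch of the (exact) DMD algorithm follows, in the spirit of Rowley et al. (2009) and Schmid (2010); the rank r and the synthetic snapshot matrix are illustrative choices, not prescriptions from the chapter.

```python
import numpy as np

def dmd(X, r=10):
    """Exact DMD of a snapshot sequence.

    X: (n, M) array with time-ordered snapshots as columns.
    Returns DMD eigenvalues and modes of the best-fit linear propagator.
    """
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    # Reduced propagator A_tilde = U* X2 V S^{-1}.
    A_tilde = U.conj().T @ X2 @ Vh.conj().T / s
    lam, W = np.linalg.eig(A_tilde)
    # Exact DMD modes.
    Phi = (X2 @ Vh.conj().T / s) @ W
    return lam, Phi

# Illustrative usage on synthetic data standing in for flow snapshots.
lam, Phi = dmd(np.random.default_rng(3).normal(size=(2000, 60)), r=10)
```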
In this section, a path to a least-order model for the transient cylinder wake is pursued.
The starting point is the POD Galerkin method (Section 1.5.1). Then (Section
1.5.2), mean-field theory is employed to significantly simplify the dynamics. The
simplification culminates in a manifold model with the Landau equation as dynamics
(Section 1.5.3).
These conditions may include the incompressibility of the flow, a stationary domain,
stationary Dirichlet, periodic, or Neumann boundary conditions, and smoothness
of the flow. The coefficients $c_i$, $l_{ij}$, and $q_{ijk}$ are functions of the modes and of the
Reynolds number. The coefficients may also be identified from numerical solutions or
experimental data (Galletti et al. 2004, Cordier et al. 2013), for instance, if the flow
domain is too small or if the turbulent fluctuations are not resolved in (1.14). We refer
to exquisite textbooks (Fletcher 1984, Holmes et al. 2012) for details. Schlegel and
Noack (2015) have derived necessary and sufficient conditions for bounded solutions,
which can be considered a requirement for a physically sound model.
The first harmonic represents von Kármán vortex shedding, which may be resolved
by a cosine mode $u_1$ and a sine mode $u_2$, ignoring the shape deformation for a moment:

$$u'(x, t) = a_1(t)\, u_1(x) + a_2(t)\, u_2(x). \quad (1.16)$$
The modes may be inferred from Figure 1.3 in the rightmost column. In the following,
$u_1$, $u_2$, and $u_3$ are assumed to be orthonormalized. The first three POD modes of the
transient already yield a good approximation of these modes.
Following mean-field arguments (Noack et al. 2003), the Galerkin system (1.15)
simplifies to a self-amplified, amplitude-limited oscillator,

$$da_1/dt = \sigma a_1 - \omega a_2, \quad \sigma = \sigma_1 - \beta a_3, \quad (1.17a)$$
$$da_2/dt = \sigma a_2 + \omega a_1, \quad \omega = \omega_1 + \gamma a_3, \quad (1.17b)$$
$$da_3/dt = \sigma_3 a_3 + \alpha \left( a_1^2 + a_2^2 \right). \quad (1.17c)$$
The oscillator has three parameters $\sigma_1$, $\omega_1$, $\sigma_3$ for the linear dynamics and three
parameters $\alpha$, $\beta$, $\gamma$ for the weakly nonlinear effects of the Reynolds stress. Intriguingly,
sparse identification of nonlinear dynamics (SINDy) extracts precisely this sparse
dynamical system from transient simulation data (Brunton, Proctor & Kutz 2016a).
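The qualitative behavior of (1.17) is easy to reproduce numerically. The following sketch integrates the oscillator with illustrative parameter values (the numbers are placeholders, not the identified coefficients of the wake):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameters: sigma_1, omega_1, sigma_3, alpha, beta, gamma.
s1, w1, s3, al, be, ga = 0.1, 1.0, -1.0, 1.0, 1.0, 0.1

def mean_field_rhs(t, a):
    a1, a2, a3 = a
    sigma = s1 - be * a3          # growth rate, slaved to the shift mode
    omega = w1 + ga * a3          # frequency shift
    return [sigma * a1 - omega * a2,           # (1.17a)
            sigma * a2 + omega * a1,           # (1.17b)
            s3 * a3 + al * (a1**2 + a2**2)]    # (1.17c)

sol = solve_ivp(mean_field_rhs, (0.0, 200.0), [1e-2, 0.0, 0.0], max_step=0.1)
# sol.y traces the spiral onto the limit cycle, as in Figure 1.7.
```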
Figure 1.7 Transient dynamics of the cylinder wake from the DNS (solid line) and the
mean-field Galerkin system (1.17) (dashed line). Here, $a_\Delta = a_3$. Phase portrait from Tadmor
and Noack (2004).
Figure 1.7 shows that the mean-field Galerkin system (1.17) and the direct numeri-
cal simulation agree well. A detailed analysis (Tadmor & Noack 2004) quantitatively
corroborates this agreement for the manifold and the temporal evolution.
explaining the mean-field parabola shape in the figure. Substituting (1.18) into (1.17a)
and (1.17b) and introducing polar coordinates $a_1 = r\cos\theta$, $a_2 = r\sin\theta$ lead to the
famous Landau equation (Landau & Lifshitz 1987) for a supercritical instability,

$$dr/dt = \sigma_1 r - \beta' r^3, \qquad d\theta/dt = \omega_1 + \gamma' r^2.$$
These equations show an exponential growth by linear instability and a cubic damping
by Reynolds stress and mean-field deformation. The frequency changes as well. The
nonlinearity parameters $\beta'$ and $\gamma'$ can easily be derived from (1.17). Intriguingly, this
equation is found to remain accurate even far from the onset of vortex shedding.
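Under the reconstruction above, the saturation amplitude follows from a one-line balance (a worked step, not from the chapter): setting dr/dt = 0 for r > 0 equates linear growth and cubic damping,

```latex
% Limit-cycle radius of the Landau equation: growth balances damping.
\sigma_1 r_\infty - \beta' r_\infty^3 = 0
\quad\Longrightarrow\quad
r_\infty = \sqrt{\sigma_1/\beta'} ,
\qquad
\dot\theta\big|_{r_\infty} = \omega_1 + \gamma'\,\sigma_1/\beta' .
```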
The resulting Landau model does not include higher harmonics. Even worse, we
have assumed fixed modes $u_1$, $u_2$. Yet, the prominent stability mode near the steady
solution is distinctly different from the POD modes characterizing the limit cycle.
$a_3$-dependent modes can cure this shortcoming (Morzyński et al. 2006), but the model is
still blind to higher harmonics. A more accurate model is based on the LLE feature
coordinates $\gamma_1 = r\cos\theta$ and $\gamma_2 = r\sin\theta$ and an identified Landau equation for the
dynamics of r and θ. The LLE autoencoder incorporates the mode deformation and
higher harmonics (Ehlert et al. 2020). Loiseau et al. (2018) have derived such a model
based on similar premises. The mean-field Galerkin model can be generalized to two
(and more) frequencies (Luchtenburg et al. 2009).
This section previews two flow control approaches. Section 1.6.1 follows the classical
paradigm of model-based control design, while Section 1.6.2 outlines a model-free
machine-learned control optimization.
Figure 1.8 Cylinder wake configuration with volume force actuation and hot-wire sensor.
In this differential algebraic system, $a_3$ is slaved to the fluctuation energy (1.18). Here,
b is the actuation command, for example, the induced acceleration in the circular
region, while the positive gain g quantifies the effect on the dynamics and depends,
for instance, on the size and location of the volume force support. Without loss of
generality, the actuation term only affects $a_1$ by a suitable rotation of the modes $u_1$, $u_2$.
The forced growth rate of the fluctuation energy $K = r^2/2 = (a_1^2 + a_2^2)/2$ reads

$$\frac{dK}{dt} = a_1 \frac{da_1}{dt} + a_2 \frac{da_2}{dt} = \sigma r^2 + g\, b\, a_1. \quad (1.21)$$
The fluctuation energy can be mitigated with negative actuation power $g\, b\, a_1$, that is,
b has to have the opposite sign of $a_1$. For simplicity, a linear control law is assumed,

$$b = k\, a_1. \quad (1.22)$$
The control gain k shall ensure a forced exponential decay rate $\sigma_c < 0$ of the
amplitude r. This implies with (1.22) and (1.21),

$$\frac{dK}{dt} = \sigma_c r^2 = \sigma r^2 + g\, k\, a_1^2. \quad (1.23)$$
Averaging over one period and exploiting $\overline{a_1^2} = r^2/2$ yields the control gain k and thus
the control law

$$b = 2\, \frac{\sigma_c - \sigma}{g}\, a_1. \quad (1.24)$$
The implications of this law are plausible: the higher the unforced growth rate and
the higher the desired damping, the larger the volume force amplitude must become.
Conversely, the larger the gain g of actuation on the dynamics, the smaller the volume
force needs to be. It should be borne in mind that σ is r-dependent.
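A hedged closed-loop sketch follows: the control law (1.24) applied to the mean-field oscillator, with actuation entering the $a_1$ equation as in (1.21); g, sigma_c, and all coefficients are illustrative placeholders.

```python
import numpy as np
from scipy.integrate import solve_ivp

s1, w1, s3, al, be, ga = 0.1, 1.0, -1.0, 1.0, 1.0, 0.1
g, sigma_c = 1.0, -0.05            # actuation gain and target decay rate

def controlled_rhs(t, a):
    a1, a2, a3 = a
    sigma = s1 - be * a3
    omega = w1 + ga * a3
    b = 2.0 * (sigma_c - sigma) / g * a1        # control law (1.24)
    return [sigma * a1 - omega * a2 + g * b,    # actuation affects a1 only
            sigma * a2 + omega * a1,
            s3 * a3 + al * (a1**2 + a2**2)]

sol = solve_ivp(controlled_rhs, (0.0, 100.0), [1.0, 0.0, 0.1], max_step=0.1)
# The amplitude r = sqrt(a1^2 + a2^2) now decays roughly at rate sigma_c.
```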
For the sensor-based control, the state $a = [a_1, a_2]^T$ is estimated from the sensor
signal with a dynamic observer. The resulting stabilization effect is shown in Figure
1.9. While the model allows complete stabilization, the fluctuation energy of the
DNS has been reduced by 30% in the observation region x < 6. The model is only
partially accurate as it ignores shedding mode changes due to actuation and convection
effects. The maximum reduction of the fluctuation level is 60%, leading to a complete
stabilization of the near wake and a residual fluctuation in the far wake.

Figure 1.9 Model-based cylinder wake stabilization: (a) unforced and (b) controlled flow.
Here, the vorticity field is shown: red and blue mark negative and positive vorticity, while
green corresponds to potential flow.
Figure 1.10 Machine learning control of the fluidic pinball, with three cylinder rotations
responding to 15 undisplayed downstream sensors. Contour levels of vorticity for the unforced
flow (a) and the flow controlled by the learned control law (b). Solid lines and dashed lines
represent positive and negative vorticity, respectively. For details, see Cornejo Maceda et al. (2019).
Figure 1.10 provides a synopsis of MLC applied to the fluidic pinball configuration
(Cornejo Maceda et al. 2019, 2021). A 42% reduction of the effective
drag power is achieved, accounting for actuation energy. The enabling MIMO feedback
control has 15 downstream sensors (not displayed) and commands the rotation of the
three cylinders. The mechanism is a combination of open-loop Coanda forcing and
closed-loop phasor control (as in the cylinder wake example).
1.7 Conclusions

2 Coherent Structures in Turbulence
This chapter and Chapter 19 explore how far the scientific discovery process can
be automated. After discussing the scientific method and its possible relation with
artificial intelligence techniques, this first chapter deals with how to extract correla-
tions and models from data, using examples of how computers have made possible
the identification of structures in shear flows. It is concluded that, besides the data-
processing techniques required for the extraction of useful correlations, the most
important part of the discovery process remains the generation of rules and models.
2.1 Introduction
This chapter is intended to be used together with Chapter 19 of this book. Both deal
with the problem of how much the process of scientific discovery can be automated
and, in this sense, address the core question of the use of artificial intelligence (AI)
in fluid mechanics, or in science in general. The two chapters are, however, very
different. The present one discusses techniques and results that have been current in
fluid mechanics, specifically in turbulence research, for the past few decades. They
owe a lot to the growth of computers, and have changed the field to an extent that
would have been hard to predict 40 years ago. But, during this period, computers were
basically used as research tools, and the way research was done was not very different
from the way it had always been.
Chapter 19 deals with one way in which the role of computers in research could
develop in the near future, and will argue that the scientific method is about to
undergo changes by which computers may evolve from being used as research tools to
being considered minor research “participants.” Since that chapter discusses nascent
technology, closer to AI than what has been used up to now in fluid mechanics, the
scientific results in it will necessarily be more limited than those in the present chapter.
But the methods will be more interesting for being new. In this sense, the present
chapter deals with the past, and Chapter 19 deals with a possible future.
We begin by specifying what science and the scientific method mean for us in
these two chapters. There are two meanings to the word “science.” The first one is a
* This work was supported by the European Research Council under the Coturb grant
ERC-2014.AdG-669505.
systematic method for discovering how things work, and the second is the resulting
body of knowledge about how they work. Depending on who you ask, one aspect
is considered to be more important than the other. Empiricists tend to focus on the
scientific method (Poincaré 1920), and practitioners tend to be more interested in the
results (Kuhn 1970). These two chapters deal mostly with methods while illustrating
them with examples from real applications to fluid mechanics.
We take the view that the basic goal of the scientific method is to search for rules
that can be used to make predictions. But the second important goal is to make those
rules “beautiful” for the researcher (Jho 2018). This is not an arbitrary requirement
because, at least for now, science depends on the work of scientists, and people work
better when they like what they are working with. The logarithmic function can be
defined by a computer algorithm to evaluate it, but most mathematicians would be
more likely to use logarithms if they are defined as
d log(t)/dt = 1/t.
The well-known conundrum about why mathematics is useful to the physical sciences,
even when it is primarily developed by mathematicians for its aesthetic value (Wigner
1960), may have more to do with the motivation of mathematicians and physicists
than with mathematics.
The classical point of view is that a prerequisite for science is the determination
of causes (Does A cause B?), preferably complemented by mechanisms (How does
A cause B?). Causality without mechanisms is unlikely to lead to quantitative
predictions, and mechanisms without causality have a high probability of being wrong.
Both causes and mechanisms have to be testable, preferably on data sets different from
the ones used to train them. The ancient Greeks knew about the alternation of night
and day, and imagined causes and mechanisms involving gods and chariots. They
explained the facts known to them, but the Greeks did not have data from other planets,
or even from other latitudes, and we know today that their mechanisms are hard to
generalize. One possible characterization of these two chapters is that they explore
how to determine causality in physical systems, and whether recent developments
provide us with new tools for doing so.
There are several distinctions that need to be made before we begin our argument.
We have mentioned the importance of causes, but research is not always geared toward
them. The classical description of the scientific method is summarized in Figure 2.1,
which emphasizes its iterative character. Causality is encapsulated in the modeling
and testing steps, S2 and S3, and the emphasis is often put on these two steps rather
than on the data-gathering one, S1. In fact, the loop in Figure 2.1 often starts with step
S2, and only later are hypotheses tested against observations, as in S2–S1–S3. The
implied relation is not always causal. Even in data-driven research, where observations
precede hypotheses, the argument is often that “if A precedes B,” A is likely to be the
cause of B, although it is generally understood that correlation and causation are not
equivalent. A classical example is the observation of night and day mentioned earlier.
The correlation between earlier days and later nights is perfect, but it does not imply
causation (Mackie 1974).
Moreover, the concept of causality is not without problems, and it has been argued
that it is indistinguishable from initial conditions in systems described by differential
equations, such as fluid mechanics (Russell 1912). While this is true, and we could
formulate science as a quest to classify initial conditions in terms of their outcome, the
result may not be very informative, especially in turbulence and other chaotic systems,
which lose memory of their previous evolution after a while. We will restrict ourselves
here to a prescribed-time definition of cause, of the type of: “The falling tree causes
the crashing noise,” even if we know that the fall of the tree has its own reasons for
happening, and that intermediate causes depend on the time horizon that we impose
on them. Such investigations require something beyond correlations, and imply an
active intervention of the observer in the generation of data. This is particularly true
when we are interested in the possibility of controlling our system, instead of simply
describing it. Consider the difference between formulating sub-grid models for large-
eddy simulations, and devising strategies to decrease drag in a pipe.
Simplifying a lot, the difference between the two points of view, which mostly
affects how and when step S1 in Figure 2.1 is undertaken, can be summarized as
follows:
The present chapter is oriented toward predictions, and therefore toward establish-
ing correlations among observations, without being especially interested in causality.
Chapter 19 is oriented toward causality and control, and looks very different from the
present one. In the aforementioned example, Chapter 19 will want to know about the
falling tree because it might be interested in doing something to prevent it from falling.
The present chapter will center (although not completely) on establishing correlations
between tree health and noise levels.
There is a second distinction that, although related to the previous one, is indepen-
dent from it. It has to do with how AI, or intelligence in general, is supposed to work
(Nilsson 1998), and is traditionally divided into symbolic (Newell & Simon 1976) and
sub-symbolic (Brooks 1990). Symbolic AI is the classical kind, which manipulates
symbols representing real-world variables according to rules set by the programmer,
typically embodied into an “expert system.” For example, assuming that we agree
that “vortices are localized pressure minima” (or any favorite personal definition),
symbolic AI encapsulates this knowledge into a set of rules to identify and isolate
the vortices. On the contrary, sub-symbolic AI is not interested in rules, but in
algorithms to do things. For example, given enough snapshots of a flow in which
vortices have been identified (maybe manually by the researcher), we can train a
neural network to distinguish them from vortex sheets or from other flow features.
The interesting result of sub-symbolic AI is not the rule, but the algorithm, and it
does not imply that a rule exists. After training our system, we may not know (or
care) what a vortex is, or what distinguishes it from a vorticity sheet, but we may
have a faster way of distinguishing one from the other than would have been possible
using only preordained physics-based rules. Copernicus and the Greeks had symbolic
representations of the day-night cycle, although with very different ideas of the rules
involved. Most other living beings, which can distinguish night from day and usually
predict quite accurately dawn and dusk, are (probably) sub-symbolic. The classical
scientific method, including causality, is firmly symbolic: we are not only interested
in the result, but also in the rule. But there is a small but growing body of scientists and
engineers who feel that data and a properly trained algorithm are all the information
required about Nature, and that no further rule is necessary (Coveney et al. 2016, Succi
& Coveney 2018). Observational science is sub-symbolic at heart, although we will
argue that this point of view is incomplete even for it. Interventional science at least
aims for symbolism.
In fluid mechanics, the availability of enough data to even consider sub-symbolic
science is a recent development, made possible by the numerical simulations of the
1990s, which gave us for the first time the feeling that “we knew everything,” and
that any question that could be posed to a computer would eventually be answered
(Brenner et al. 2019, Jiménez 2020a). The present chapter is partly a review of this
work. Of course, even without considering practical problems of cost, this did not
mean that all questions were answered, because they first had to be put to the computer
by a researcher. But this discussion has to be postponed to Chapter 19.
Some readers could be excused for wondering whether these two chapters,
and especially the present one, deal with AI at all. They may be right. Artificial
intelligence, as the term is mostly used at the moment, is a collection of techniques
for the analysis of large quantities of data. In terms of the scientific method, this
corresponds to the generation of empirical knowledge, usually from observations, but
this is only a small part of the overall method (distributed among steps S1 and S2 in
Figure 2.1). What interests us here is the full process of data exploitation, from their
generation to their final incorporation into hypotheses. It will become clear that AI is
only occasionally useful in the process, even if computers have become indispensable
at most stages of it.
The rest of this chapter is structured as follows. Section 2.2 discusses the concept
of coherent structure, including, in Section 2.2.1, examples from free-shear flows and,
in Section 2.2.2 and Section 2.2.3, examples from wall-bounded flows. Section 2.3
summarizes the results and discusses the relation between data analysis and conceptual
modeling.
Although turbulence is often treated as a random process in which questions are posed
in terms of statistics, a competing point of view that describes it in terms of structures
and eddies has had its followers from the beginning. Thus, while Reynolds (1894)
centered on the statistics of the fluctuations, Richardson (1920) described them in
terms of “little and big whorls.” And while Kolmogorov (1941) framed his cascade
theory as a statistical relation between fluctuation intensity and scale, Obukhov (1941)
interpreted it in terms of interactions among eddies. There is probably a reason for
the different emphases. Reynolds was an engineer and Kolmogorov a mathematician.
Both could disregard individual fluctuations, which they saw from the outside.
Richardson and Obukhov were primarily atmospheric scientists, and atmospheric
turbulent structures cannot be ignored, because they are large enough for us to live
within them.
These notes take the view that randomness is an admission of ignorance that should
be avoided whenever possible (Voltaire 1764), and that turbulence is a deterministic
dynamical system satisfying the Navier–Stokes equations. Specifically, we will be
interested in whether the description of the flow can be simplified by decomposing
it into “coherent” structures that can be extracted by observation or predicted from
theoretical considerations, and whether the dynamics of these structures can be used
to model the evolution of the flow, or even to guide us into ways of controlling it.
Up to fairly recently, the statistical point of view was all but inevitable, because it is
difficult to extract structural information from single-point measurements. It was not
until shear-flow visualizations in the 1970s revealed structures whose lives were long
enough to visually influence scalar tracers (Kline et al. 1967, Corino & Brodkey 1969,
Brown & Roshko 1974) that the structural point of view began to gain acceptance. And
it was only after probe rakes, numerical simulations, and PIV experiments routinely
provided multidimensional flow fields that that point of view became fully established.
In the same way, the more recent introduction of time-resolved three-dimensional flow
information, mostly from simulations (Perlman et al. 2007, Lozano-Durán & Jiménez
2014, Cardesa et al. 2017) has opened a window into the dynamics of those structures,
not only allowing us to conceive structures as predictable, but forcing us to try to
predict them.
However, any attempt to simplify complexity should be treated with caution,
because it implies neglecting something that may be important. There are several
ways of approaching these simplifications, some of which are described in detail in
other parts of this course. Many are based on projecting the equations of motion onto
Figure 2.2 The evolution of the coherent structures is expected to depend mostly on a few
other structures, plus perhaps nonessential residues.
a smaller set of variables, typically a linear subspace or a few Fourier modes, but this
approach becomes less justifiable as the Reynolds number increases, or as the system
becomes more extended. If we consider a pipe at even moderate Reynolds number,
whose length is several hundred times its diameter, we can expect to find several
thousand eddies of any definition, with widely varying scales, far enough from each
other to be essentially independent. Projection methods become less useful in those
cases because they do not respect locality, and treat together unrelated quantities.
Moreover, the Navier–Stokes equations are partial differential equations in physical
space (except for the pressure in incompressible flows), but not in Fourier space, and
our approach will be to treat the evolution of the flow as largely local, and to look for
solutions that evolve relatively independently from other similar solutions far away.
This view of turbulence is based on the “hope” that at least part of its dynamics can
be described in terms of a relatively small number of elementary objects that, at least
for some time, depend predominantly on a “few” similar objects and on themselves,
with only minor contributions from an ‘unstructured’ background. We will refer to
them as “coherent structures.” The underlying model is sketched in Figure 2.2, but it
is important to understand that this simple dependence, and therefore the existence of
coherent structures, is mostly a hope that requires testing at every step of the process.
“Self” dependence suggests properties that the structures should possess to be
relevant and practical, such as that they should be strong enough with respect to
their surroundings to have some dynamics of their own, and that they should be
observable, predictable, or computable. These requirements generally imply that
coherent structures are either “engines” that extract energy from some relevant forcing,
sinks that dissipate it, or “repositories” that hold energy long enough to be important
for the general budget of the flow.
The emphasis on structure does not exclude statistics. The descriptions of Reynolds
(1894) and Kolmogorov (1941) have been more useful in applications than most
structural approaches, and it cannot be forgotten that turbulence is as important
for engineering as it is for science. Moreover, statistics are required even when
analysing structure. Turbulent flows are chaotic and high-dimensional (Tennekes &
Lumley 1972, Keefe et al. 1992) and, even if dissipation probably restricts them to a
finite-dimensional attractor in phase space, they explore that space widely. In essence,
anything that is not strictly inconsistent with the equations of motion is bound to
occasionally happen in a turbulent flow, and it is important to approach structure
identification statistically, and to make sure that any observed phenomenon, even
if intellectually appealing, occurs often enough to influence the overall dynamics.
Figure 2.3 (a) Shadowgraph of a turbulent-free mixing layer, from Brown and Roshko (1974).
Flow is from left to right. (b) Transition in a laminar mixing layer, from Freymuth (1966).
(c) Fluctuation amplitude in a forced turbulent mixing layer. Lines are linear stability theory,
and symbols are laboratory measurements. From Gaster et al. (1985). All reproduced with
permission.
On the other hand, rare events can be important if they can be exploited for control
purposes (Kawahara 2005), or if they are harmful enough to justify investing
in avoiding them, for example, tornadoes. Turbulence, as an example of natural
macroscopic chaos, may be one of the first systems in which we may hope for the
engineering implementation of a “Maxwell daemon,” in the sense of extracting useful
work from apparently random fluctuations. In a way, soaring albatrosses, or glider
pilots riding thermals, have been doing this for a long time.
Figure 2.4 Automatic tracking of structures in a free-shear layer, from Hernán and Jiménez
(1982), reproduced with permission.
theory by Freymuth (1966) (Figure 2.3(b)). The mean velocity profiles of free-shear
flows, such as shear layers, wakes, and jets, are modally unstable, and the associated
Kelvin–Helmholtz instability is well understood. In fact, even if the stability analysis
is linear, while turbulence is not, it was soon confirmed that the linearized results
match the large-scale dynamics of forced shear layers well (Gaster et al. 1985),
Figure 2.5 Near-wall structure of a turbulent channel. The flow is from left to right and the
figure only includes the region below y+ ≈ 60, looking toward the wall. The gray background
is the shear at the wall and represents the streamwise-velocity streaks, and the colored shorter
objects are vortices with either positive or negative streamwise vorticity. The axes labels are in
viscous units. From Jiménez (2002), reproduced with permission of the Licensor through
PLSclear.
Figure 2.6 Near-wall structure of a turbulent channel. The flow is from left to right, and the
figure represents the region below y+ ≈ 80, looking toward the wall. The gray objects are the
streamwise-velocity streaks, and the colored shorter objects are the vortices responsible for the
sweeps and ejections, with either positive or negative streamwise vorticity. The axes labels are
in viscous units. (a) Minimal channel at Reτ = 180. (b) A weakly oscillatory wave in a
minimal autonomous channel, from Jiménez and Simens (2001), reproduced with permission.
grow by considerable amounts (Butler & Farrell 1993) that the question of wall-
bounded turbulent structures could be treated with some rigor.
Visualizations show that the region near the wall of boundary layers, pipes, and
channels is dominated by long “streaks” (Kline et al. 1967), eventually shown to
be jets of high or low streamwise velocity, sprinkled with shorter fluid eruptions, or
“bursts” (Kim et al. 1971). It was initially hypothesized that bursts were due to the
intermittent breakup of the near-wall streaks, but even the original authors acknowl-
edged that their visualizations could be consistent with more permanent objects being
advected past the observation window. The term “burst” eventually became associated
with the “sweeps” and “ejections” observed by stationary velocity probes, which
respectively move fluid toward and away from the wall (Lu & Willmarth 1973). After
the careful visual analysis by Robinson (1991) of a temporally resolved film of one of
the first numerical simulations of a turbulent boundary layer (Spalart 1988), it became
clear that, at least in the viscous layer near the wall, the sweeps and ejections known
from single-point measurements reflected the passing of shorter quasi-streamwise
vortices, intermittent in space but not necessarily in time (see Figure 2.5). It was also
soon understood that ejections and sweeps create the streaks by deforming the mean
velocity profile, but the origin and dynamics of the bursts remained unclear.
The next step was taken by Jiménez and Moin (1991), who shrank the dimensions
of a channel simulation to the minimum value that could accommodate a single
sweep–ejection pair and a short streak segment, thus allowing the study of their
interactions “in isolation.” These simulations showed that wall turbulence is
largely independent of the chaotic interaction among structures in the flow, thus
satisfying one of the conditions for coherence in Figure 2.2, and that this minimal
unit was unambiguously intermittent in time. These and other authors eventually
converged to the description of a “cogeneration” cycle in which the vortices
sustain the streak, and an undetermined instability of the streak creates the vortices
(Jiménez & Moin 1991, Hamilton et al. 1995, Jiménez & Pinelli 1999, Schoppa &
Hussain 2002).
A parallel, and largely independent, development was led by Nagata (1990), who
studied wall turbulence from the point of view of dynamical systems, and was able
to compute fully nonlinear steady-state solutions of the Navier–Stokes equations in
Couette flow. These solutions are typically unstable, and therefore unlikely to be
found in a real flow, but the idea was that the system evolves relatively slowly in
their phase-space neighborhood, or in that of related oscillatory solutions, and that the
relatively large fraction of time spent near them would “anchor” the flow statistics.
In fact, these simple solutions look extraordinarily similar to the bursting solutions in
minimal channels (Figure 2.6), and predict fairly well the amplitude and dimensions of
the observations in the viscous layer of real flows (Jiménez et al. 2005). They remain
to this day one of the main reasons to believe that coherent structures are important
components in the dynamics of wall-bounded turbulence. A review of their general
significance is Kawahara et al. (2012).
The reason why Figures 2.5 and 2.6 can be compared visually is that they are
restricted to the viscous layer of the flow, where the internal Reynolds number of the
structures, ∆y+ = uτ∆y/ν, is low, where uτ is the friction velocity, ν is the kinematic
viscosity, and ∆y is the height of the structure. The resulting structures are relatively
smooth and can be recognized as “objects.” As the Reynolds number of the flow
increases and some of the structures become larger, their internal Reynolds number
grows, and their shape can best be described as a fractal (see Figure 2.7(a)). Such
objects should be treated statistically, and their study had to wait until the Reynolds
number of the simulations grew and methods of analysis for the large data sets
involved were developed.
The first such studies examined the structure of vorticity in channels. Individual
intense vortices had been studied in isotropic turbulence (Vincent & Meneguzzi 1991,
Jiménez et al. 1993), where they are associated with dissipation. They were known
to be predominantly concentrated in low-velocity regions in channels (Tanahashi
et al. 2004), but del Álamo et al. (2006) showed that the relevant objects in wall-
bounded turbulence were not individual vortices but vortex tangles, or “clusters”
(Figure 2.7(b)). They are approximately collocated with the ejections, which explains
their association with low-velocity regions, and form a self-similar family of clusters
(not of individual vortices) across the logarithmic layer. The minimal-channel tech-
nique was extended by Flores and Jiménez (2010) to the logarithmic layer, where
they used it to show that the vortex clusters also burst intermittently, and that this
process is associated with the deformation of the large-scale streaks found in that
region. Probably related to these clusters are the self-similar "hairpin packets" that
have been intensively studied by experimentalists (Adrian 2007), although there is
little information about their temporal behavior.
Figure 2.7 (a) Reynolds-stress structures in the outer layer of a turbulent channel, Reτ = 2000
(Hoyas & Jiménez 2006). The flow is from left to right. The gray objects are the ejections, and
approximately correspond to low-velocity streaks. The yellow ones are sweeps. Image courtesy
of A. Lozano-Durán. (b) Temporal evolution of a vortex cluster along its lifetime. Channel at
Reτ = 4200. The maximum height of the cluster is uτ∆y/ν ≈ 300, at which time the cluster is
attached to the wall. From Lozano-Durán and Jiménez (2014), reproduced with permission.
The next step was to analyze whether a similar organization exists for sweeps and
ejections, which had been studied for a long time as being responsible for carrying
the tangential Reynolds stress. This was done by Lozano-Durán et al. (2012), using
three-dimensional data from simulations. The structures were defined by thresholding
the Reynolds stress, and proved to also form self-similar families of side-by-side
pairs of a sweep and an ejection, which are the logarithmic-layer equivalent of the
quasi-streamwise vortices of the buffer layer. A typical flow field in which these
Reynolds-stress structures have been isolated by thresholding their intensity is given
in Figure 2.7(a), confirming that they are no longer smooth objects, and that it is
difficult to analyze them individually rather than statistically. The study of large-scale
structures of either velocity or Reynolds stress has since been extended to boundary
layers (Sillero 2014), homogeneous shear flows (Dong et al. 2017), and pipes (Lee &
Sung 2013), with broadly similar results. The elementary unit is a sweep–ejection pair
with a vortex cluster in between, located in the lateral boundary between a high- and a
low-velocity streak. We will use this structure interchangeably with the term “burst.”
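As a concrete illustration of this identification procedure, the following minimal Python sketch thresholds a hypothetical three-dimensional field of tangential Reynolds stress and groups the surviving points into individual objects, in the spirit of Lozano-Durán et al. (2012); the threshold level and the data layout are illustrative assumptions, not the published values:

```python
import numpy as np
from scipy import ndimage

def extract_structures(uv, H=1.75):
    """Isolate intense Reynolds-stress structures by thresholding.

    uv : 3-D array of instantaneous tangential stress -u'v' on a
         uniform grid (hypothetical input).
    H  : threshold in units of the r.m.s. intensity (illustrative value).
    """
    mask = np.abs(uv) > H * uv.std()     # keep only the intense events
    labels, n = ndimage.label(mask)      # group contiguous points into objects
    sizes = ndimage.sum(mask, labels, index=np.arange(1, n + 1))
    return labels, sizes                 # object map and object volumes
```

Each labeled object would then be classified as a sweep or an ejection from the sign of its velocity fluctuations, and its geometry accumulated statistically over many snapshots.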
Figure 2.8 Effect of the inhomogeneity of the mean velocity profile on the flow field
conditioned to sweep–ejection pairs. The central opaque S-shaped object is an isosurface of
the magnitude of the conditional perturbation vorticity, at 25% of its maximum. The two
translucent objects are isosurfaces of the conditional perturbation streamwise velocity,
u+ = ±0.6. The axes are scaled with the average of the diagonal sizes of the sweep and the
ejection. (a) Homogeneous shear turbulence. (b) Detached eddy in a channel, Reτ = 950.
(c) As in (b), for an attached eddy. Adapted from Dong et al. (2017), reproduced with
permission.
As temporally resolved data sets became available, individual bursts were tracked
by Lozano-Durán and Jiménez (2014), finally answering the question of whether
ejections are permanent passing objects or transient ones. They have a definite
lifetime, shown in Figure 2.7(b) for a vortex cluster, but this lifetime is longer than
the passing time across a stationary probe, so that the signal that the latter sees is due
to advection. Individual bursts are born and disappear at an approximately constant
distance from the wall but, as they grow toward the middle of their lifetime, some
of them become large enough for their root to become attached to the wall. On
the general principle that the largest eddies carry the Reynolds stresses (Tennekes
& Lumley 1972), these attached eddies are responsible for most of the momentum
transfer in the flow (Townsend 1976), where the definition of “large” is linked to the
Corrsin (1958) scale, which separates the eddies that feel the effect of the shear from
the smaller ones that do not (Jiménez 2018a). The temporal behavior of the large
eddies is also self-similar. The lifetime of an eddy whose maximum height is ∆y is
approximately an eddy turnover, uτ T/∆y ≈ 1.
The confirmation that bursts are transient reopened the question of their origin.
There is persuasive evidence that they are essentially linearizable Orr (1907b)
solutions, which transiently extract energy from the mean flow as they are tilted
forward by the shear (Jiménez 2013, Encinar & Jiménez 2020). In retrospect, this
was to be expected, because the only energy source in the flow is the mean shear,
but it leaves open the question of how the bursts are seeded. Orr bursts are temporary
structures of the wall-normal velocity; they are born weak, leaning backward with
respect to the shear; they intensify as the shear tilts them toward the normal to the
wall, and eventually disappear when the forward tilt becomes too pronounced. In
the process, some of their energy is transferred to the streaks and remains in the
flow. This is a plausible mechanism for turbulent-energy production, but there is
no obvious process to restart the eddies as backward-leaning structures. There is
2.3 Discussion
The previous sections have reviewed some of what is known about structures in
shear flows. Except for the earliest visualizations of free-shear layers, all the data
mentioned here have been “big” in relation to the dates in which they were generated
and analyzed: from tens of Mbytes for the first channel simulations in the 1980s, to
hundreds of Tbytes for the current time-resolved data sets. The field has been driven
as much by storage capacity as by computational power or experimental methods, and
the analysis of the data has always involved new techniques that were not available,
or required, before the new data came online. It would appear that the description of
these methods should have been the main thrust of a set of notes on the application
of AI to fluid mechanics, but a rereading of the previous sections shows that this has
not been the case. Most of them are dedicated to the discussion of physical models.
We have noted several times that the quest for structures is driven by the hope that
they simplify the description and control of the flow, and that their acceptance by the
community depends on the acceptance of these simplifications.
Coherent structures are not an obvious fit for fluid mechanics, which is best
described by continuous vector fields. Structures are patterns that we hope are useful
for us, but we should be wary of the evidence that humans are notoriously prone to
finding spurious patterns. Consider constellations in the night sky, or divination from
tea leaves, and note that even those spurious patterns keep being considered relevant
because there is a community that finds them useful.
The Kelvin–Helmholtz rollers of free-shear layers were considered useful from the
start because their mechanism is clear and offers a distinct avenue for the description
and control of the flow. The bursts and streaks of wall-bounded flows are still
considered less definitive because the description of their regeneration cycle remains
incomplete.
If any conclusion can be drawn from the present discussion, it is that the scientific
cycle in Figure 2.1 should be considered as a whole, that data analysis should include
a parallel development of physical models, and that no single isolated step should be
taken as final. Apparently convincing flow descriptions in terms of a reduced set of
structures or modes should be viewed with suspicion unless they are complemented by
plausible mechanisms. In the nomenclature introduced in Section 2.1, sub-symbolic
AI only becomes scientifically conclusive once it becomes symbolic.
Appealing mechanisms should also be carefully examined for evidence that they are
really important for the flow. No single description is likely to span the full dynamics
of turbulence, and we are probably bound to always “make do” with partial results.
But it is important to consider the phenomenon as a whole, and to recognize which
part of it is described by our models. One of the reasons to appreciate the bursting
model in Section 2.2.2 is that it accounts for 60% of the skin friction using only 10%
of the flow volume (Jiménez 2018a). This is appealing, but it is fair to wonder what
happens to the other 40%. One of the most dangerous sentences in scientific research
is: “It is a start.”
3 Machine Learning in Fluids:
Pairing Methods with Problems
S. Brunton∗
The modeling, optimization, and control of fluid flows remain a grand challenge
of the modern era, with potentially transformative scientific, technological, and
industrial impact. An improved understanding of complex flow physics may enable
drag reduction, lift increase, mixing enhancement, and noise reduction in domains
as diverse as transportation, energy, security, and medicine. Fluid dynamics is
challenged by strong nonlinearity, high dimensionality, and multiscale physics; both
modeling and control may be thought of as non-convex optimization tasks. Recent
advances in machine learning and data-driven optimization are revolutionizing how
we approach these traditionally intractable problems. Indeed, machine learning may
be considered as a growing set of techniques in data-driven optimization and applied
regression that are tailored for high-dimensional and nonlinear problems. Thus, the
modeling, optimization, and control of complex fluid systems via machine learning
are complementing mature numerical and experimental techniques that are generating
increasingly large volumes of data.
3.1 Overview
The large range of spatial and temporal scales in a complex fluid flow requires exceed-
ingly high-dimensional measurements and computational discretization to resolve all
relevant features, resulting in vast data sets and time-intensive computations. There
is no indication that the increasing volumes of data generated and analyzed in fluid
mechanics will slow any time soon, and we are decades away from fully resolving the
most complex engineering flows (aircraft, submarines, etc.) fast enough for iterative
optimization and in-time control. Despite the large number of degrees of freedom
required to describe fluid systems, there are often dominant patterns that emerge,
characterized by energetically or dynamically important coherent structures (see
Figure 3.1 and Chapter 2). These coherent structures provide a compact representation
of these flows, and capturing their evolution dynamics has been the focus of intense
research in reduced-order modeling for decades, as described in Chapter 1. Advances
in machine learning have promised a renaissance in the analysis and understanding
of such complex data, extracting patterns in high-dimensional, multimodal data
that are beyond the ability of humans to grasp. As more complete measurements
become available for more complicated fluid systems, the data associated with these
investigations will only become increasingly large and unwieldy. Data-driven methods
are central to managing and analyzing this data, and generally, we consider data
methods to include techniques that apply equally well to data from numerical or
experimental systems. In fact, data-driven methods in machine learning provide a
common framework for both experimental and numerical data.
Figure 3.1 Fluid dynamics are often characterized by complex, multiscale dynamics in space
and time. For example, by increasing the Reynolds number, the flow past a circular cylinder
becomes unsteady and turbulent. Figure adapted from Feynman et al. (2013), reprinted by
permission of Basic Books, an imprint of Hachette Book Group, Inc.
In this chapter, we will provide an overview of the established and emerging
applications of machine learning in fluid dynamics, emphasizing the potentially
transformative impact in modeling, optimization, and control; for more details and
references, see Brenner et al. (2019) and Brunton et al. (2020). We take the perspective
that machine learning is a collection of applied optimization techniques that build
models based on available data, which is abundant in fluid systems. Because of the
data-intensive nature of fluid dynamics, machine learning complements efforts in both
experiments and simulations, which generate incredible amounts of data. In machine
learning, it is just as important to understand when methods fail as when they succeed,
and we will attempt to convey the limitations of current methods, exploring what is
easy and difficult in machine learning. We will also emphasize the need to incorporate
physics into machine learning to promote interpretable and generalizable models.
Finally, we cannot overstate the importance of cross-validation to prevent overfitting.
Fluid flow fields may be naturally viewed as images, as in Figure 3.2. Thus, there is
considerable low-hanging fruit in applying standard techniques from machine learning
for image analysis to flow fields, for example for occlusion inference (Scherl et al.
2020) and super-resolution (Fukami et al. 2018, Erichson et al. 2020). However, fluid
dynamics departs from many of the typical application domains of machine learning,
such as image recognition, in that it is fundamentally governed by physics. Many of the
goals of fluid dynamics involve actively or passively manipulating these dynamics for
an engineering objective through intervention, which inherently changes the nature
of the data collected; see Chapters 17 through 19. Thus, the dynamic and physical
nature of fluids provides both a challenge and an opportunity for machine learning,
potentially providing deeper theoretical insights into algorithms by applying them to
systems with known physics.
Figure 3.2 Illustration of how simple it is to add texture to an image using modern neural
networks. Is it possible to add the texture of turbulence to a laminar flow?
Although fluid data is vast in some dimensions (i.e., fully resolved velocity fields
may easily contain millions or billions of degrees of freedom), it may be relatively
sparse in others (e.g., it may be expensive to sample many geometries and perform
parametric studies). This heterogeneity of the data can limit the available techniques
in machine learning, and it is important to carefully consider whether or not the model
will be used for interpolation within a parameter regime or for extrapolation, which
is generally much more challenging. In addition, many fluid systems are naturally
nonstationary, and even for stationary flows it may be prohibitively expensive to obtain
statistically converged results when behavior is broadband; see Chapter 9. Engineering
goals that involve a working fluid are generally not simple classification
tasks, but often involve more subtle multi-objective optimization, for example to
maximize lift while minimizing drag. Finally, because fluid dynamic systems are
central to transportation, health, and defense systems, it is essential that machine
learning solutions are interpretable, explainable, and generalizable, and it is often
necessary to provide guarantees on performance.
Fluid dynamics challenges often amount to large, high-dimensional, non-convex
optimization problems for which machine learning techniques are becoming
increasingly well adapted. Although the opportunities of machine learning in fluid
mechanics are nearly limitless, we roughly categorize efforts into (1) modeling, (2)
closed-loop control, and (3) engineering optimization. These categories necessarily
share some overlap. The traditional wisdom in fluid dynamics is that modeling is
essential to support the downstream goals of optimization and control. However,
effective optimization and control may bypass modeling. In fact, by manipulating
the flow and exciting transients, we may indeed generate valuable data to enrich our
modeling efforts.
This section will distinguish between kinematic and dynamic flow modeling, which
will be important for later sections.
There are several important aspects of flow modeling that we will investigate here.
Modeling may be roughly categorized into two complementary efforts: dimensionality
reduction and reduced-order modeling. Dimensionality reduction involves extracting
dominant patterns that may be used as reduced coordinates where the fluid is
compactly and efficiently described. Reduced-order modeling describes the evolution
of the flow in time as a parameterized dynamical system, although it may also
involve developing a statistical map from parameters to averaged quantities, such as
drag. Models describe how the flow varies, either in time, with a parameter (e.g.,
for optimization), or with actuation (e.g., for control). Identifying and extracting
relevant flow features is essential to describe and model the flow. In fact, much
of the progress in mathematical physics has revolved around identifying coordinate
transformations that simplify dynamics and capture essential physics, such as the
Fourier transform and modern data-driven modal decompositions such as the proper
orthogonal decomposition (POD); see Chapter 6.
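For concreteness, a minimal sketch of the snapshot POD computed through the singular value decomposition is given below; the snapshot matrix X is a random stand-in for real data, in which each column would be a flattened, mean-subtracted velocity field:

```python
import numpy as np

# X: (n_points, n_snapshots) matrix of snapshots (placeholder data);
# in practice, each column is one flattened velocity field.
X = np.random.randn(10_000, 200)
X -= X.mean(axis=1, keepdims=True)            # subtract the temporal mean

# Economy-size SVD: columns of U are the POD modes, s**2 their energies,
# and the rows of Vt the temporal coefficients of each mode.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

r = 10                                        # truncation rank (user choice)
energy = (s[:r]**2).sum() / (s**2).sum()      # fraction of captured variance
a = np.diag(s[:r]) @ Vt[:r]                   # reduced coordinates of the flow
```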
Reduced-order modeling encompasses model reduction, which begins with the
governing equations, and system identification, which begins with data. Model
reduction, such as Galerkin projection of the Navier–Stokes equations onto an
orthogonal basis of POD modes, benefits from a close connection to the governing
equations; however, it is intrusive, requiring human expertise to develop models
from a working simulation. System identification provides a flexible and data-driven
alternative, although often resulting in black box models that lack a deep connection
to the physics; see Chapters 7 and 8 for more details on system identification. Machine
learning constitutes a rapidly growing body of algorithms that may be used for
advanced system identification. A central goal of modeling is to balance efficiency
and accuracy, which are often dueling objectives. When modeling physical systems,
interpretability and generalizability are also critical considerations. Other unique
aspects of data-driven modeling of physical systems include partial prior knowledge
of the governing equations, constraints, and symmetries. Finally, the volume, variety,
and fidelity of data will impact the viability of various machine learning methods.
We will separate the discussion into the use of machine learning for (1) flow feature
extraction to model flow kinematics, and (2) modeling dynamics. Of course, there is
some overlap in these topics. For example, the dynamic mode decomposition (DMD)
and related Koopman operator approaches are simultaneously concerned with obtain-
ing effective coordinates to represent coherent structures (kinematics) and a model
for how these coherent structures evolve (dynamics). Similarly, in the discussion of
closure models for turbulence, there are several approaches, ranging from using super-
resolution to fill in low-resolution flow images (kinematics), to directly modeling the
closure terms as a dynamical system (dynamics). Mathematically, the discussion of
kinematics amounts to obtaining a change of coordinates from some variables or
measurements x to a new set of variables a through a possibly nonlinear function ϕ:
a = ϕ(x). (3.1)
...
Figure 3.3 Illustration of linear versus nonlinear embeddings of fluid flow data for flow past a
circular cylinder. It is possible to describe the dominant evolution in a three-dimensional linear
subspace obtained via POD (Noack et al. 2003). However, in this subspace, the flow actually
evolves on a two-dimensional submanifold, given by a parabolic inertial manifold. In general,
linear subspaces are simpler to obtain, although nonlinear embeddings may yield enhanced
reduction.
Figure 3.4 The success of machine learning in the past decade can largely be attributed to (1)
large-scale labeled training data, such as image databases; (2) advanced computational
hardware, such as GPUs; (3) improved neural network architectures and optimization
algorithms to train them; and (4) industry investment, leading to open source tools. GPU image
used courtesy of NVIDIA, www.nvidia.com. PyTorch, the PyTorch logo, and any related
marks are trademarks of Facebook, Inc.
modern deep learning architectures. There is tremendous potential for data science
and machine learning to revolutionize nearly every aspect of our modern industrial
world, and few fields stand to benefit as clearly or as significantly as the field
of fluid mechanics. Indeed, fluid dynamics is one of the original big data fields,
and many high-performance computing architectures, experimental measurement
techniques, and advanced data processing and visualization algorithms were driven
by the decades of research on fluid systems. However, it is important to recognize
that machine learning will not replace experimental and numerical efforts in fluid
mechanics, but will instead complement them (see Figure 3.5), much as the rise of
computational fluid dynamics (CFD) complemented experimental efforts (Brunton,
Hemati & Taira 2020).
Roughly speaking, there are five major stages in machine learning: (1) asking a
question, defining a hypothesis to be tested, or identifying a function to be modeled;
(2) preparing a data pipeline, either through collecting and curating the training
data (e.g., for supervised and unsupervised learning), or setting up an interactive
environment (e.g., for reinforcement learning); (3) identifying the model architecture
and parameterization; (4) defining a loss function to be minimized; and (5) choosing
an optimization strategy to optimize the parameters of the model to fit the data.
Human intelligence is critical in each of these stages. Although considerable attention
is typically given to the learning architecture, it is often the data collection and
optimization stages that require the most time and resources. It is also important to
note that known physics (e.g., invariances, symmetries, conservation laws, constraints,
etc.) may be incorporated in each of these stages (Battaglia et al. 2018, Loiseau
Figure 3.5 Schematic overview of the use of machine learning in fluid dynamics.
et al. 2018, Lusch et al. 2018, Champion et al. 2019, Cranmer et al. 2019, Zheng
et al. 2019, Greydanus et al. 2019, Noé et al. 2019, Champion et al. 2020,
Cranmer et al. 2020, Finzi et al. 2020, Raissi et al. 2020, Zhong &
Leonard 2020). For example, rotational invariance is often incorporated by enriching
the training data with rotated copies, and translational invariance is often captured
using convolutional neural network (CNN) architectures. Additional physics and
prior knowledge may also be incorporated as extra loss functions or constraints in
the optimization problem (Loiseau et al. 2018, Lusch et al. 2018, Champion et al.
2019, Zheng et al. 2019, Champion et al. 2020). There is also considerable work in
developing physics-informed neural network architectures (Lu et al. 2019, Raissi et al.,
2019, 2020), networks that capture Hamiltonian or Lagrangian dynamics (Cranmer
et al. 2019, 2020, Greydanus et al. 2019, Finzi et al. 2020, Zhong & Leonard
2020), and that uncover coordinate transformations where nonlinear systems become
approximately linear (Takeishi et al. 2017, Yeung et al. 2017, Lusch et al. 2018, Mardt
et al. 2018, Wehmeyer & Noé 2018, Otto & Rowley 2019).
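As a minimal sketch of the augmentation strategy mentioned above, the snippet below enriches a hypothetical set of two-dimensional, square flow fields with their 90° rotated copies; the data layout and the assumption that the scalar labels are rotation-invariant are illustrative only:

```python
import numpy as np

def augment_with_rotations(fields, labels):
    """Append 90-degree rotated copies of each 2-D field to the training
    set, a simple way to encourage rotational invariance.

    fields : array of shape (n_samples, n, n); square fields assumed.
    labels : array of shape (n_samples,), assumed rotation-invariant.
    """
    rotated = [np.rot90(fields, k=k, axes=(1, 2)) for k in range(4)]
    return np.concatenate(rotated), np.tile(labels, 4)
```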
[Flowchart: choosing a machine learning strategy. If the training data from the complex
system are labeled, the task is supervised: discrete labels call for classification (e.g.,
support vector machines, neural networks, decision trees), and continuous labels for
regression (e.g., linear and generalized linear models, Gaussian processes). If the data
are unlabeled, the task is unsupervised: clustering (e.g., k-means, spectral clustering)
for discrete groupings, and embedding (e.g., POD/PCA, autoencoders, diffusion maps)
for continuous distributions. Partially labeled data lead to semi-supervised approaches
(self-supervised, e.g., generative adversarial networks) or, when the system can be
actively modified, to reinforcement learning; optimization and control act through a
feedback signal that modifies the system or the control parameters.]
regression) (Holland 1975, Koza 1992), and neural networks (Goodfellow et al. 2016). In
addition, it is increasingly common to add regularization terms to the optimization
problem to promote certain beneficial properties in the resulting model. For example,
promoting sparsity often results in models that are more interpretable (i.e., fewer
parameters to interpret), and less prone to overfitting (i.e., fewer parameters to
overfit) (Hastie et al. 2009, Brunton, Proctor & Kutz 2016a, Brunton & Kutz 2019).
Including priors and regularizers to embed known physics is a particularly exciting
area of development.
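A minimal sketch of such sparsity-promoting regression with an l1 (lasso) penalty is given below; the candidate library Theta and its coefficients are synthetic, purely to show how most entries are driven exactly to zero:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic regression: y depends on only 2 of 10 candidate features.
rng = np.random.default_rng(0)
Theta = rng.standard_normal((200, 10))        # library of candidate terms
w_true = np.zeros(10)
w_true[[2, 7]] = [1.5, -0.8]
y = Theta @ w_true + 0.01 * rng.standard_normal(200)

# The l1 penalty drives most coefficients exactly to zero, leaving a
# sparse model: fewer parameters to interpret, and fewer to overfit.
model = Lasso(alpha=0.05).fit(Theta, y)
print(np.round(model.coef_, 2))               # only entries 2 and 7 survive
```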
Supervised Learning
Supervised learning assumes the availability of labeled training data, and it is
concerned with learning a function mapping the training data to the label. Many of
the most powerful approaches in machine learning are supervised, as there are many
data sets where we have empirical knowledge of the labels, but where it is challenging
to develop a deterministic algorithm to describe the function between the data and the
labels. Supervised learning solves this fuzzy problem by leveraging data rather than
hand-coded algorithms.
If the labels are discrete, such as categories to classify an image (e.g., dog vs.
cat), then the supervised learning task is a classification. If the labels are continuous,
such as the value of the lift coefficient for a particular airfoil shape, then the task is a
regression. The labels may also correspond to an objective that should be minimized
or maximized, resulting in an optimization or control problem.
Before the rise of deep learning, support vector machines (SVMs) (Schölkopf
& Smola 2002) and random forests (Breiman 2001) dominated classification tasks,
and were industry standard methods. Continuous regression methods, such as linear
regression, logistic regression, and Gaussian process regression are still widespread.
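The following sketch contrasts the two tasks on synthetic data, with a support vector machine for a discrete label and a Gaussian process for a continuous one; the features and labels are placeholders for quantities such as flow parameters and a lift coefficient:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 2))             # hypothetical flow features

# Classification: a discrete label (e.g., separated vs. attached flow).
y_class = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = SVC(kernel="rbf").fit(X, y_class)

# Regression: a continuous label (e.g., a lift coefficient).
y_reg = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)
reg = GaussianProcessRegressor().fit(X, y_reg)
```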
Unsupervised Learning
Unsupervised learning, also known as data mining or pattern extraction, determines
the underlying structure of a data set without labels. Again, if the data is to be grouped
into distinct categories, then the task is clustering, while if the data has a continuous
distribution, the task is an embedding.
There are several popular approaches to clustering. The most common and simple
clustering algorithm is the k-means algorithm, which partitions data instances into k
clusters; an observation will belong to the cluster with the nearest centroid, resulting
in a partition of the data space into Voronoi cells. Spectral clustering (Ng et al. 2002)
is also quite common. Likewise, there are many approaches to learn embeddings. The
singular value decomposition (SVD) and related principal component analysis (PCA)
and proper orthogonal decomposition (POD) are perhaps the most widely used linear
embedding techniques, resulting in an orthogonal low-dimensional basis that captures
most of the variance in a data set. These embeddings have recently been generalized
in the neural network autoencoder, discussed in Section 3.2.2.
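Both tasks reduce to a few lines on a hypothetical unlabeled data matrix; the numbers of clusters and of retained components below are arbitrary illustrative choices:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X = np.random.randn(500, 50)          # hypothetical unlabeled data set

# Clustering: partition the samples into k groups; each sample joins the
# cluster with the nearest centroid (a Voronoi partition of data space).
km = KMeans(n_clusters=3, n_init=10).fit(X)

# Embedding: project onto the leading principal components, the linear
# subspace capturing most of the variance (the PCA/POD/SVD family).
a = PCA(n_components=2).fit_transform(X)
```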
about whether or not a given action was beneficial will not occur regularly; for
example, in the game of chess, it is not always obvious exactly which moves led
to a player winning or losing. Thus, only partial information is available to guide the
control strategy, making it closely related to semi-supervised learning, although it is
typically considered a distinct branch of machine learning. Reinforcement learning
has been widely used to learn game strategies, for example in AlphaGo (Silver
et al. 2016). Increasingly, these approaches are being applied to scientific and robotic
applications (Mnih et al. 2015, Silver et al. 2016).
Figure 3.8 (a) Simple feedforward neural network. (b) Deep autoencoder network. (c)
Convolutional neural network. Reproduced from
https://commons.wikimedia.org/wiki/File:
3_filters_in_a_Convolutional_Neural_Network.gif. (d) Deep residual network. (e)
Recurrent neural network. Reproduced from
https://en.wikipedia.org/wiki/File:Recurrent_neural_network_unfold.svg.
y = f1(f2(· · · (fn(x; θn); · · · ); θ2); θ1), (3.3)
where θ j describe the network weights for each layer (i.e., the weights of the graph)
that must be learned.
There are several challenges in training these large networks, as there may be
millions or billions of parameters in a modern ANN. Backpropagation essentially uses
the chain rule for the composition of functions to compute the gradient of the objective
function with respect to the weights. These gradient computations make it possible to
optimize parameters with gradient descent; stochastic gradient descent algorithms,
such as ADAM, are typically used because of the large scale of the optimization
problem (Kingma & Ba 2014). Other challenges include vanishing gradients and the
need for good network initializations. These considerations all depend on having an
effective network architecture and loss function, which are meta-challenges. More
details on network training can be found in Goodfellow et al. (2016) and Brunton and
Kutz (2019).
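A minimal PyTorch sketch of this training loop, fitting a small feedforward network of the form (3.3) to a toy input–output function with the ADAM optimizer, might read:

```python
import torch
import torch.nn as nn

# A small feedforward network y = f(x; theta), cf. (3.3).
model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(1024, 8)                      # placeholder training inputs
y = torch.sin(x.sum(dim=1, keepdim=True))     # toy target function

for epoch in range(200):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()          # backpropagation: chain rule for the gradient
    opt.step()               # ADAM update of the network weights
```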
We will now review a number of commonly used architectures. The simple feed-
forward architecture, shown in Figure 3.8(a), is useful for approximating input–output
functions. Figure 3.8(b) shows a deep autoencoder network. Autoencoders generalize
the principal component analysis (PCA) to nonlinear coordinate embeddings; the
connection between PCA and linear autoencoders was first established by Baldi
and Hornik (1989), and the first use of deep nonlinear autoencoders in fluids was
in Milano and Koumoutsakos (2002). Recently, deep residual networks (Szegedy
et al. 2017), shown in Figure 3.8(d), have enabled better optimization performance
by including jump connections in exceedingly deep networks. Figure 3.8(e) depicts a
recurrent neural network (RNN), which, unlike feedforward networks, has feedback
connections from later layers back to earlier layers. The long short-term memory
(LSTM) network (Hochreiter & Schmidhuber 1997) is a particular variant of an RNN
that is useful for dynamical systems and other time-series data, such as audio signals.
Finally, Figure 3.8(c) depicts a CNN, which is a useful architecture for processing data
with spatial correlations, such as images.
Figure 3.9 Training error and test error as a function of model complexity: an underfit model
has high error on both sets, an overfit model has low training error but high test error, and a
parsimonious model minimizes the test error.
3.2.3 Cross-Validation
When discussing machine learning, it is critical to introduce the notion of cross-
validation. Cross-validation plays the same role in machine learning that grid refine-
ment studies play in CFD. Because many modern machine learning architectures have
millions of degrees of freedom, it is easy to overfit the model to the training data.
Thus, it is essential to split the data into training and testing data; the model is trained
using the training data and evaluated on the testing data. Depending on the volume of
available data, it is common to randomly split the data into training and testing data,
and then repeat this process many times to get statistically averaged results. Figure 3.9
shows the idea behind cross-validation, where a model may overfit the training data
unless it is cross-validated against test data.
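In its simplest form this amounts to holding out a random test set, as in the sketch below (placeholder data; in practice the random split would be repeated and the errors averaged):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = np.random.randn(300, 10), np.random.randn(300)   # placeholder data

# Train on one subset, evaluate on data the model has never seen.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
print("train error:", mean_squared_error(y_tr, model.predict(X_tr)))
print("test error: ", mean_squared_error(y_te, model.predict(X_te)))
```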
Pattern extraction is a core strength of machine learning from which fluid dynamics
stands to benefit directly. A tremendous amount of machinery has been developed for
image and audio processing, and it is fortunate that fluid flow measurements, both
spatial and temporal, may readily capitalize on these techniques. Here, we begin by
discussing linear and nonlinear dimensionality reduction techniques, also known as
Figure 3.10 Illustration of shallow, linear autoencoder (a), versus deep nonlinear autoencoder
(b). In the linear autoencoder, the node activation functions are linear, so that the encoder
matrix U and decoder matrix V are chosen to minimize the loss function ‖x − VUx‖. In the
nonlinear autoencoder, the node activation functions may be nonlinear, and the encoder ϕ and
decoder ψ are chosen to minimize the loss function ‖x − ψ(ϕ(x))‖. In each case, the latent
variable a describes the low-dimensional subspace or manifold on which the data evolves.
Figure modified from Brunton, Noack and Koumoutsakos (2020).
ambient high-dimensional space. When the neural network activation functions are
linear and the encoder and decoder are transposes of one another, the autoencoder can
be shown to be closely related to the standard POD/PCA decomposition. However,
the neural network autoencoder is more general, and with nonlinear activation units
for the nodes, it is possible to represent nonlinear analogues of POD/PCA, potentially
providing a better and more compact coordinate system. Milano and Koumoutsakos
(2002) were the first to apply the nonlinear autoencoder for dimensionality reduction
in fluid systems, building nonlinear embeddings to represent near wall turbulence.
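A minimal PyTorch sketch of such an autoencoder, following the notation of Figure 3.10, is given below; the input dimension, layer sizes, and tanh activations are arbitrary illustrative choices:

```python
import torch.nn as nn

latent = 2                            # dimension of the latent variable a

encoder = nn.Sequential(nn.Linear(1000, 128), nn.Tanh(),
                        nn.Linear(128, latent))      # a = phi(x)
decoder = nn.Sequential(nn.Linear(latent, 128), nn.Tanh(),
                        nn.Linear(128, 1000))        # x ~ psi(a)
autoencoder = nn.Sequential(encoder, decoder)

# Training minimizes the reconstruction loss ||x - psi(phi(x))||; with
# linear activations, the learned subspace is closely related to POD/PCA.
```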
Because of the universal approximation theorem (Hornik et al. 1989), which
states that a sufficiently large neural network can represent an arbitrarily complex
input–output function, deep neural networks will continue to be leveraged to obtain
more effective nonlinear coordinates for increasingly complex flows. However, it is
important to note that deep learning requires extremely large volumes of training
data, and the resulting models are typically only good for interpolation and cannot be
trusted for extrapolation beyond the training data. In many modern machine learning
applications, such as image classification, the training data is becoming so vast that it
may be expected that most future classification tasks will fall under interpolation of
the training data. For example, the ImageNet data set in 2012 (Krizhevsky et al. 2012)
contained over 15 million labeled images, which sparked the current movement in
deep learning (LeCun et al. 2015, Goodfellow et al. 2016). We are still far away from
this paradigm in fluid mechanics, with neural network solutions providing targeted
and focused models that do not readily generalize. However, it may be possible in the
coming years and decades to curate large and complete enough fluid databases that
such deep interpolation may be more universally useful.
linear models (Juang & Pappa 1985, Juang 1994, Ljung 1999), which are simple
to analyze and use for control design (Dullerud & Paganini 2000, Skogestad &
Postlethwaite 2005). Despite the powerful tools for linear model reduction and control,
the assumption of linearity is often overly restrictive for real-world flows. Turbulent
fluctuations are inherently nonlinear, and often our goal is not to stabilize an unstable
laminar solution, but instead to modify the nature of the turbulent statistics. However,
linear modeling and control can be quite useful for stabilizing unstable fixed points,
for example to maintain a laminar boundary layer profile or suppress oscillations in
an open cavity (Kim & Bewley 2007, Brunton & Noack 2015). Refer to Chapter 12
for an overview of system identification and to Chapter 10 for an overview of linear
dynamical systems.
The DMD (Schmid 2010, Tu, Rowley, Luchtenburg, Brunton & Kutz 2014, Kutz,
Brunton, Brunton & Proctor 2016) is a recent technique introduced by Schmid
(2010) in the fluid dynamics community to extract spatiotemporal coherent structures
from high-dimensional time-series data, resulting in a low-dimensional linear model
for the evolution of these dominant coherent structures. DMD is an entirely data-
driven regression technique, and is equally valid for time-resolved experimental and
numerical data. Shortly after being introduced, it was shown that DMD is closely
related to the Koopman operator (Rowley et al. 2009, Mezić 2013), which is an
infinite-dimensional linear operator that describes how all measurement functions of
the fluid state will evolve in time. There have also been connections between the linear
resolvent analysis and Koopman (Sharma et al. 2016). Because the original DMD
algorithm is based on linear measurements of the flow field (i.e., direct measurements
of the fluid velocity or vorticity field), the resulting models are generally not able to
capture nonlinear transients, but are well suited to capture periodic phenomena. See
Chapter 7 for more details on DMD and Koopman operator theory.
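For reference, a minimal NumPy sketch of the exact DMD algorithm (Schmid 2010, Tu et al. 2014) is given below; X and Xp are snapshot matrices whose columns are consecutive flow fields, and the truncation rank r is a user choice:

```python
import numpy as np

def dmd(X, Xp, r=10):
    """Exact DMD: Xp[:, k] is the snapshot one time step after X[:, k]."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, s, Vt = U[:, :r], s[:r], Vt[:r]            # rank-r truncation
    Atilde = U.conj().T @ Xp @ Vt.conj().T / s    # projected linear operator
    lam, W = np.linalg.eig(Atilde)                # DMD eigenvalues
    Phi = Xp @ Vt.conj().T @ np.diag(1 / s) @ W   # exact DMD modes
    return lam, Phi
```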
To improve the performance of DMD, researchers are using machine learning to
uncover nonlinear coordinate systems where the dynamics appear linear (Brunton &
Kutz 2019). The extended DMD (eDMD) (Williams, Rowley & Kevrekidis 2015)
and the variational approach of conformation dynamics (VAC) (Noé & Nuske 2013,
Nüske et al. 2016) enrich DMD models with nonlinear measurements, leveraging
kernel methods (Williams, Rowley & Kevrekidis 2015) and dictionary learning
approaches (Li et al. 2017) from machine learning. Although the resulting models
are simple and linear, obtaining nonlinear coordinate systems may be arbitrarily
complex, as these special nonlinear measurement functions may not have simple, or
even closed-form solutions. Fortunately, this type of arbitrary function representation
problem is ideally suited for neural networks, especially the emerging deep learning
approaches with large multilayer networks. Deep learning is also being used to
identify these nonlinear coordinate systems, related to eigenfunctions of the Koopman
operator (Takeishi et al. 2017, Yeung et al. 2017, Lusch et al. 2018, Mardt et al. 2018,
Wehmeyer & Noé 2018, Otto & Rowley 2019). The VAMPnet architecture (Mardt
et al. 2018, Wehmeyer & Noé 2018) uses a time-lagged auto-encoder and a custom
variational score to identify Koopman coordinates on an impressive protein folding
example. The field of fluid dynamics may benefit from leveraging these techniques
from neighboring fields, such as molecular dynamics, which have similar modeling
issues, including stochasticity, coarse-grained dynamics, and massive separation of
timescales.
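As a minimal sketch of the eDMD idea, the snippet below lifts a hypothetical two-state system with a hand-chosen monomial dictionary and fits the lifted linear dynamics by least squares; in practice, choosing or learning a good dictionary is the hard part, which is precisely where neural networks enter:

```python
import numpy as np

def lift(X):
    """Dictionary of observables: monomials up to degree 2 of a
    two-state system (a toy, hand-chosen example)."""
    x1, x2 = X                                # X has shape (2, n_snapshots)
    return np.vstack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])

def edmd(X, Xp):
    """Least-squares Koopman approximation on the lifted states,
    Psi(x[k+1]) ~ K Psi(x[k]), in the spirit of Williams et al. (2015)."""
    PsiX, PsiXp = lift(X), lift(Xp)
    return PsiXp @ np.linalg.pinv(PsiX)       # lifted linear operator K
```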
Figure 3.11 Turbulence is characterized by a large range of spatial and temporal scales, as
illustrated in this famous turbulence energy cascade. Resolving all scales for industrially
relevant flows has been notoriously challenging. See Pope (2000) for details.
Figure 3.12 The turbulence closure problem may be formulated as modeling the effect of
high-frequency structures on the energy-containing low-frequency structures.
Templeton 2015, Ling, Kurzawski & Templeton 2016, Parish & Duraisamy 2016,
Xiao et al. 2016, Singh et al. 2017, Wang, Wu & Xiao 2017). Ling and Templeton
(2015) compare support vector machines, Adaboost decision trees, and random forests
to classify and predict regions of high uncertainty in the Reynolds stress tensor. Wang,
Wu and Xiao (2017) went on to use random forests to build a supervised model for the
discrepancy in the Reynolds stress tensor. Xiao et al. (2016) leveraged sparse online
velocity measurements in a Bayesian framework to infer these discrepancies. In a
related line of work, Parish and Duraisamy (2016) develop the field inversion and
machine learning modeling framework that builds corrective models based on inverse
modeling. This framework was later used by Singh et al. (2017) to develop a neural
network enhanced correction to the Spalart–Allmaras RANS model, with excellent results.
Figure 3.13 Comparison of standard neural network architecture (a) with modified neural
network architecture for identifying Galilean invariant Reynolds stress models (b), reproduced
with permission from Ling, Kurzawski and Templeton (2016).
Fluid flow control is one of the grand challenge problems of the modern era, with
nearly limitless potential to enable advanced technologies. Indeed, working fluids are
central to many trillion dollar industries (transportation, health, energy, defense), and
even modest improvements to flow control could have transformative impact through
lift increase, drag reduction, mixing enhancement, and noise reduction. However, flow
control is generally a highly non-convex and high-dimensional optimization problem,
which has remained challenging despite many concerted efforts (Brunton & Noack
2015). Fortunately, machine learning may be considered as a growing body of data-
driven optimization procedures that are well suited to highly nonlinear and non-convex
problems. Thus, there are renewed efforts to solve traditionally intractable flow control
problems with emerging techniques in machine learning.
There is a considerable effort applying reinforcement learning (Sutton & Barto
2018) to fluid flow control problems (Beintema et al. 2020, Rabault & Kuhnle 2020),
as will be discussed more in Chapter 18. In particular, controlling the motion of
fish (Gazzola et al. 2014, Gazzola et al. 2016, Novati et al. 2017, Verma et al. 2018)
and of robotic gliders (Reddy et al. 2018) has experienced great strides with more
powerful reinforcement learning architectures. These techniques are also being used to
optimize the flight of uninhabited aerial vehicles (Kim et al. 2004, Tedrake et al. 2009)
and for path planning applications (Colabrese et al. 2017). It is believed that advances
will continue, as reinforcement learning with deep neural networks is an active area
of machine learning research, with considerable progress, for example in AlphaGo
(Silver et al. 2016).
Genetic algorithms (Holland 1975) and genetic programming (Koza 1992) have
also been widely applied to fluid flow control (Dracopoulos 1997, Fleming &
Purshouse 2002, Duriez et al. 2017, Noack 2019). Neural networks have also been
used for flow control, with decades of rich history (Phan et al. 1995, Lee et al. 1997).
Recently, deep learning has been used to improve model predictive control (MPC)
efforts (Kaiser et al. 2018, Bieker et al. 2019) with impressive performance gains.
Finally, machine learning approaches have also been used for aerodynamic and
hydrodynamic shape and motion optimization (Hansen et al. 2009, Strom et al. 2017).
In another important vein of research, machine learning and sparse optimiza-
tion (Brunton & Kutz 2019) are being leveraged for sensor and actuator place-
ment (Manohar et al. 2018). Sensors and actuators are the workhorses of active
flow control. Developments in sensing and actuation hardware will continue to
drive advances in flow control (Cattafesta III & Sheplak 2011), including smaller,
higher-bandwidth, cheaper, and more reliable devices that may be integrated directly
into existing hardware, such as wings or flight decks. Many competing factors
impact control design, and a chief consideration is the latency in making a control
decision, with larger latency imposing limitations on robust performance (Skogestad
& Postlethwaite 2005). Thus, as flow speeds increase and flow structures become
more complex, it is increasingly important to make fast control decisions based on
efficient low-order models, with sensors and actuators placed strategically to gather
information and exploit flow sensitivities. Optimal sensor and actuator placement
is one of the most challenging unsolved problems in flow control (Giannetti &
Luchini 2007, Chen & Rowley 2011). Nearly every downstream control decision is
affected by these sensor or actuator locations, although determining optimal locations
amounts to an intractable brute force search among the combinatorial possibilities.
Therefore, the placement of sensors and actuators is typically chosen according to
heuristics and intuition. Figure 3.14 shows schematically how sensors and actuators
are used for flow control. Recent work by Manohar et al. (2018) has demonstrated
efficient, near-optimal sensor placement based on sparse optimization. This is a
promising area of development for flow control.
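A minimal sketch of this idea, selecting near-optimal point sensors by a pivoted QR factorization of a POD basis in the spirit of Manohar et al. (2018), is given below; the mode matrix U is a hypothetical input:

```python
import numpy as np
from scipy.linalg import qr

def qr_sensors(U, p):
    """Greedy sensor placement via column-pivoted QR.

    U : (n_points, r) matrix of POD modes (hypothetical input).
    p : number of sensors; returns the selected grid-point indices.
    """
    _, _, piv = qr(U.T, pivoting=True)        # pivots rank the grid points
    return piv[:p]                            # p most informative locations
```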
Applying machine learning methods to model dynamical systems from physics, such
as fluid dynamics, poses a number of unique challenges and opportunities. In physical
Figure 3.14 Schematic illustrating optimal sensor and actuator design for closed-loop
feedback fluid flow control. Learning sparse sensor and actuator locations is now possible
using convex optimization, and coherent structures in the fluid facilitate sparse sensing and
sparse actuation. The offline learning represents a slow optimization, enabling online
feedback (red).
4 Continuous and Discrete Linear
Time-Invariant Systems
M. A. Mendez
This chapter reviews the fundamentals of continuous and discrete linear time-invariant
(LTI) systems with single-input single-output (SISO). We start from the general
notions of signals and systems, the signal representation problem, and the related
orthogonal bases in discrete and continuous forms. We then move to the key properties
of LTI systems and discuss their eigenfunctions, the input–output relations in the
time and frequency domains, the conformal mapping linking the continuous and the
discrete formulations, and the modeling via differential and difference equations.
Finally, we close with two important applications: (linear) models for time-series
analysis and forecasting and (linear) digital filters for multi-resolution analysis. This
chapter contains seven exercises, the solutions of which are provided on the book's
webpage.1
A signal is any function that conveys information about a specific variable (e.g., veloc-
ity or pressure) of interest in our analysis. A signal can be a function of one or multiple
variables; it can be continuous or discrete; and can have infinite or finite duration or
extension (that is, be non-null only within a range of its domain). Signals are produced
by systems, usually in response to other signals or due to the interaction between
different interconnected subsystems. In the most general setting, a system is any entity
that is capable of manipulating an input signal and producing an output signal.
At such a level of abstraction, a vast range of problems in applied science fall
within the framework of this chapter. For a fluid dynamicist, the flow past an airfoil
is a system in which the inputs are the flow parameters (e.g., free-stream velocity and
turbulence) and control parameters (e.g., the angle of attack), and the outputs are the
drag and lift components of the aerodynamic force exchanged with the flow.
Any measurement chain is a system in which the input is the quantity to be
measured, and the output is the quantity that is measured. A hot-wire anemometer,
for example, is a complex system that measures the velocity of a flow by measuring
the heat lost from a wire that is heated by an electrical current. Any signal processing
1 www.datadrivenfluidmechanics.com/download/book/chapter4.zip
technique for denoising, smoothing, and filtering can be seen as a system that takes in
input the raw signal and outputs its processed version (e.g., with enhanced details or
reduced noise).
Regardless of the number of subsystems composing a system, and whether the
system is a physical system, a digital replica of it, or an algorithm in a computer
program, the relations between input and output are governed by a mathematical
model. The derivation of such models is instrumental for applications encompassing
simulation, prediction, or control. Models can be phrased with various degrees of
sophistication, depending on the purposes for which these are developed. They usually
take the form of partial differential equations (PDEs), ordinary differential equations
(ODEs) or difference equations (DEs), or simple algebraic relations. Different models
might have different ranges of validity (hence different degrees of generalization) and
might involve different levels of complexity in their validation.
Models can be derived from two different routes, hinging on data and experimen-
tation in various ways. The first route is that of fundamental principles, based on the
division of a system into subsystems for which empirical observations have allowed
the derivation of well-established and validated relations. In the example of the hot-wire
anemometer, the subsystems operate according to Newton’s cooling law for forced
convection, the resistive heating governed by Joule’s law, the thermoelectric laws
relating the wire resistance to its temperature, and Ohm's and Kirchhoff's laws
governing the electric circuits that are designed to indirectly measure the heat losses.
Arguably, none of these laws were formulated with a hot-wire anemometer in mind.
Yet, their range of validity is wide enough to also accommodate such an application:
these laws generalize well.
In the example of the flow past an airfoil, the system is governed by Navier–Stokes
equations, which incorporate other laws such as constitutive relations for the shear
stresses (e.g., Newton’s law for a Newtonian fluid) and the heat fluxes (e.g., Fourier’s
Law for conduction). These closure relations led to the notion of fluid properties
such as dynamic viscosity and thermal conductivity and were also derived in simple
experiments that did not target any specific application.
Because of their remarkable level of generalization, we tend to see these laws
as simple mathematical representations of the “laws of nature.” Whether nature is
susceptible to mathematical treatment is a question with deep philosophical aspects.
As engineers, we accept the pragmatic view of relying on models and laws if these are
validated and useful. This is the foundation of our scientific and technical background.
The second route is that of system identification, or the data-driven approach, based on
the inference of a suitable model from a (usually large) set of input–output data. The
model might be constrained to a certain parametric form (e.g., with a given order in
the differential equations) or can be completely inferred from data. In the first case,
we face a regression problem of identifying the parameters such that the model fits
the data. In the second case, an algorithm proposes possible models (e.g., in Genetic
Programming, see Chapter 14) or uses such a complex parametrization that an analytic
form is not particularly interesting (e.g., in Artificial Neural Networks, see Chapter 3).
This paradigm is certainly not new. An excellent example of system identification
is the method proposed by the Swedish physicist Ångström to measure the thermal
diffusivity of a material.
Most of the material presented in this chapter can be found in classic textbooks
on signal and systems (Ljung & Glad 1994, Oppenheim et al. 1996, Hsu 2013),
signal processing (Williamson 1999, Hayes 2011, Ingle & Proakis 2011), orthogonal
transforms (Wang 2009), system identification (Ljung 1999, Oppenheim 2015), or
control theory (Skogestad 2005, Ogata 2009). We begin this chapter by
introducing the relevant notation.
In their most general form, continuous and discrete LTI systems admit multiple inputs
and respond with multiple outputs (MIMO systems) or have a single input and respond
with a single output (SISO systems)4 . This classification and the relevant notation are
further illustrated in the block diagram in Figure 4.1.
In a continuous SISO system, inputs and outputs are denoted, respectively, as con-
tinuous functions u(t), y(t) ∈ R with t ∈ R. Following the signal processing literature,
for discrete systems, these are denoted using an index notation, as u[k], y[k] ∈ R with
k ∈ Z. In this chapter, discrete and continuous signals are assumed to be linked by a
sampling process, hence the time domain is discretized as t → tk = k∆t with an index
k ∈ Z, sampling period ∆t = 1/ fs , and (constant) sampling frequency fs . Therefore,
the notation u(tk) is equivalent to the index notation u[k], but the latter makes no
reference to the sampling process or to the time axis.
In a MIMO system, both the inputs and the outputs are vectors u(t), u[k] ∈ Rn I and
y(t), y[k] ∈ RnO , with nI and nO the number of inputs and outputs. MIMO systems are
better treated in state-space representation presented in Chapter 10, hence this chapter
only focuses on SISO systems.
Signals and systems can contain a deterministic and a stochastic part. While the
focus is mostly on the deterministic part, the treatment of stochastic signals is briefly
recalled in Section 4.8.
For reasons that will become clear in Section 4.4, it is convenient to represent a signal
in a way that allows decoupling the contribution of every time instance. In other
words, we seek to define a signal with respect to a very localized basis that allows for
sampling the signal at a given time. With such a (unitary) basis, the sampling process
can be done by direct comparison: for example, we say that a continuous signal u(t)
has u(3) = 2 because at time t = 3 this signal equals two times the element of the basis
that is unitary (in a sense to be defined) at t = 3 and zero elsewhere. Mathematically,
this “comparison” process is a correlation, the signal processing equivalent of the
inner product. This notion is more easily introduced for discrete signals, considered
in Section 4.3.2. Continuous signals are treated in 4.3.1.
Note that two notations are introduced: δk is a vector of the same size as u, which
is zero except at l = k, where it is equal to one; δ[l − k] is a sequence of numbers
collecting the same information. In this sequence, k is the index spanning the position
of the impulse, while the index l spans the time (shift) domain. Infinite duration signals
are vectors of infinite length. For two such signals a[k], b[k] or vectors a, b, the inner
product for the “comparison procedure” is
$$\langle a, b \rangle = b^\dagger a = \sum_{k=0}^{n_t - 1} a[k]\,\overline{b[k]}\,, \tag{4.3}$$
5 Note that we will here use a “Python-like” indexing, that is, starting from k = 0 rather than k = 1.
where $b^\dagger = \overline{b}^{\,T}$ is the Hermitian transpose, with the superscript $T$ denoting transposition
and the overline denoting conjugation. With such a basis, the value of the signal at a
specific location u[k] = uk can be written as
specific location u[k] = uk can be written as
$$u_k = u[k] = \langle u, \delta_k \rangle_l \qquad \text{or} \qquad u[k] = \sum_{l=-\infty}^{\infty} u[l]\,\delta[k-l]\,. \tag{4.4}$$
The operation on the left is a correlation: for a given location of the impulse k, the
inner product is performed over the index l spanning the entire length of the signal and
the result is a scalar – the signal’s value at the index k. The operation on the right is a
discrete convolution and the result is a signal: the entire set of shifts will be spanned.
We shall return to the algebra of this operation in Chapter 8.
Note the flipping of the indices δ[l − k] in (4.2) to δ[k − l] in (4.4). In the first case,
the location of the impulse k is fixed and l spans the vector entries; in the second case,
within the summation, the time domain k is fixed and l spans the possible locations of
the delta functions6 .
The inner product in a vector space is the fundamental operation that allows for a
rigorous definition of intuitive geometrical notions such as the length of a vector and
the angle between two vectors. The length (l 2 norm) of a vector ||a|| and the cosine of
the angle β between two vectors a and b of equal size are defined, respectively, as
$$\|a\| = \sqrt{\langle a, a \rangle} = \sqrt{a^\dagger a} \qquad \text{and} \qquad \cos(\beta) = \frac{\langle a, b \rangle}{\|a\|\,\|b\|}\,. \tag{4.5}$$
In signal processing, the first quantity is the root of the signal’s energy7 , defined as
E{y} = ||y|| 2 while the second is the normalized correlation between two signals.
Signals with finite energy are square-summable. If cos β = 0, two vectors are orthogonal
and two signals are uncorrelated; if cos β = 1, two vectors are aligned and two signals
are perfectly correlated.
Notice that the projection of a vector a onto a vector b (see Figure 4.2) is
$$a_b = \|a\| \cos(\beta) = \frac{\langle a, b \rangle}{\|b\|}\,. \tag{4.6}$$
Figure 4.2 Projections and inner products.
Hence if b is a basis vector of unitary length, the inner product equals the projection.
When this occurs, as in most of the cases presented in what follows, the notions of
inner product and projection are used interchangeably.
The basis of shifted impulses has a very special property: it is orthonormal. This
means that the inner product of two basis elements (in this case the shifted delta
functions) gives zero unless the same basis element is considered, in which case we
recover its unitary norm (energy). We return to this property in Chapter 8.
6 This distinction is irrelevant for a symmetric function such as δ, but it is essential in the general case: if δ[k − l] is replaced by δ[l + k] in (4.4), the operation is called cross-correlation.
7 Note that the notion of energy is used in signal processing for the square of a signal independently of whether this is actually linked to physical energy.
Before moving to continuous signals, it is worth introducing another important signal
that is linked to delta functions, namely the Heaviside step function. This is defined as
$$u_S[k-l] = \begin{cases} 1 & \text{if } k \ge l \\ 0 & \text{if } k < l \end{cases} \qquad \text{that is} \qquad u_S[k-l] = \sum_{r=-\infty}^{k} \delta[r-l]\,. \tag{4.7}$$
The difference between two shifted step functions generates a box function $u_{B_{a,b}}$,
which is unitary in the range $k \in [a, b-1]$ and is zero outside. The difference between
two step functions shifted by a single step is a delta function, that is, δ[k − l] =
uS [k − l] − uS [k − l − 1]. In the discrete setting, this is equivalent to a differentiation.
Hence, the delta function is the derivative of the step function and the summation
in (4.7) shows that the step function is an integral of the delta functions. The reader
should close this subsection with an exercise on some distinctive features of discrete
signals.
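These relations between the impulse, the step, and the box function are easy to verify numerically. The following is a minimal NumPy sketch (the array names are our own, not taken from the book's scripts):

```python
import numpy as np

n = np.arange(-5, 6)                     # discrete time axis
delta = (n == 0).astype(float)           # unit impulse delta[k]
step = (n >= 0).astype(float)            # Heaviside step u_S[k]

# The step is the running sum (discrete integral) of the impulse, as in (4.7):
assert np.allclose(step, np.cumsum(delta))

# The impulse is the first difference (discrete derivative) of the step:
step_shifted = (n >= 1).astype(float)    # u_S[k-1]
assert np.allclose(delta, step - step_shifted)

# A box function, unitary on k in [a, b-1], as the difference of two steps:
a, b = -2, 3
box = (n >= a).astype(float) - (n >= b).astype(float)
assert np.allclose(box, ((n >= a) & (n < b)).astype(float))
```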
This function gets narrower and taller, as σ → 0, to the point at which it becomes
infinite at t = 0 and null everywhere else while still having unit area. This is the
definition of a continuous Dirac delta function, which in its shifted form is
$$\delta(t-\tau) = \begin{cases} 0 & \text{if } t \neq \tau \\ \infty & \text{if } t = \tau \end{cases} \qquad \text{and} \qquad \int_{-\infty}^{+\infty} \delta(t-\tau)\,dt = 1\,. \tag{4.10}$$
This function is not an ordinary one, as its integration poses several technical
difficulties. Without entering into details of measure and distribution theory (Richards
& Youn 1990), we shall accept this as a generalized function that serves well our
purpose of sampling a continuous signal. From the definition in (4.10), it is easy to
derive the sifting (sampling) property:
$$y(t) = \langle y(\tau), \delta(t-\tau) \rangle = \int_{-\infty}^{\infty} y(\tau)\,\delta(t-\tau)\,d\tau = \int_{-\infty}^{\infty} y(t)\,\delta(t-\tau)\,d\tau = y(t) \int_{-\infty}^{\infty} \delta(t-\tau)\,d\tau = y(t)\,. \tag{4.11}$$
The equivalence y(τ) = y(t) in the integral results from the product y(τ)δ(t − τ) being
null everywhere but at τ = t; then, in the last step, it is possible to move y(t) outside the
integral as it is independent of the integration variable τ.
As for the discrete case, it is interesting to introduce the unitary step function and
its link with the delta function as
$$u_S(t - t_0) = \begin{cases} 1 & \text{if } t > t_0 \\ 0 & \text{if } t < t_0 \end{cases} \qquad \text{that is} \qquad u_S(t-t_0) = \int_{-\infty}^{t} \delta(\tau - t_0)\,d\tau\,. \tag{4.12}$$
Notice that the step function is not defined at t = t0 . This definition is the continuous
analogue of (4.7). To show that the delta function is the derivative of the step function,
we must introduce the notion of generalized derivative. For a continuous signal
u(t), denoting by $u'$ and $u^{(n)}$ its first and nth derivatives, integration by parts using an
appropriate test function ξ(t) gives
$$\int_{-\infty}^{\infty} \xi(t)\, u^{(n)}(t)\,dt = (-1)^n \int_{-\infty}^{\infty} \xi^{(n)}(t)\, u(t)\,dt\,, \tag{4.13}$$
where the test function ξ(t) is assumed to be continuous and differentiable at least n
times and is such that ξ(t) → 0 for t → ±∞. The first derivative of u S is
$$\int_{-\infty}^{\infty} \xi(t)\, u_S'(t-t_0)\,dt = -\int_{-\infty}^{\infty} \xi'(t)\, u_S(t-t_0)\,dt = -\int_{t_0}^{\infty} \xi'(t)\,dt = \xi(t_0) - \xi(\infty) = \xi(t_0) = \int_{-\infty}^{\infty} \xi(t)\,\delta(t-t_0)\,dt\,. \tag{4.14}$$
The first equality results from direct application of (4.13); the second from $u_S(t-t_0)$
being zero for t < t0. After integration, the sifting property of the delta function (4.11)
is used, and the result follows by equating the last step with the first, relying on
the fact that ξ(t) is arbitrary.
Homogeneity:
$$\mathcal{S}_c\{a\, u(t)\} = a\,\mathcal{S}_c\{u(t)\} = a\, y(t) \quad \text{for all } a \in \mathbb{C}\,, \qquad \mathcal{S}_d\{a\, u[k]\} = a\,\mathcal{S}_d\{u[k]\} = a\, y[k] \quad \text{for all } a \in \mathbb{C}\,. \tag{4.15}$$
Superposition:
$$\mathcal{S}_c\left\{\int_{-\infty}^{\infty} u(t)\,dt\right\} = \int_{-\infty}^{\infty} \mathcal{S}_c\{u(t)\}\,dt = \int_{-\infty}^{\infty} y(t)\,dt\,, \qquad \mathcal{S}_d\left\{\sum_{n=1}^{N} u_n[k]\right\} = \sum_{n=1}^{N} \mathcal{S}_d\{u_n[k]\} = \sum_{n=1}^{N} y_n[k]\,, \tag{4.16}$$
where we considered a finite summation of N inputs for the discrete case and an
infinite summation of infinitesimally close inputs for the continuous one. Combining
these properties, we see that a linear combination of inputs results in the same linear
combination of outputs:
$$\mathcal{S}_c\left\{\int_{-\infty}^{\infty} a(\tau)\, u(t,\tau)\,d\tau\right\} = \int_{-\infty}^{\infty} a(\tau)\,\mathcal{S}_c\{u(t,\tau)\}\,d\tau = \int_{-\infty}^{\infty} a(\tau)\, y(t,\tau)\,d\tau\,, \qquad \mathcal{S}_d\left\{\sum_{n=1}^{N} a_n u_n[k]\right\} = \sum_{n=1}^{N} a_n \mathcal{S}_d\{u_n[k]\} = \sum_{n=1}^{N} a_n y_n[k]\,. \tag{4.17}$$
Defining $h(t) = \mathcal{S}_c\{\delta(t)\}$ as the impulse response of a continuous system, the response to any input is
$$y(t) = \mathcal{S}_c\{u(t)\} = \mathcal{S}_c\left\{\int_{-\infty}^{\infty} u(\tau)\,\delta(t-\tau)\,d\tau\right\} = \int_{-\infty}^{\infty} u(\tau)\,\mathcal{S}_c\{\delta(t-\tau)\}\,d\tau \;\;\rightarrow\;\; y(t) = \int_{-\infty}^{\infty} u(\tau)\, h(t-\tau)\,d\tau = \int_{-\infty}^{\infty} h(\tau)\, u(t-\tau)\,d\tau\,. \tag{4.19}$$
The last two integrals are equivalent forms of the convolution integral, hinging on
its commutative property. Similarly, defining $h[k] = \mathcal{S}_d\{\delta[k]\}$ as the impulse response
of a discrete system, the response to any input is
$$y[k] = \mathcal{S}_d\{u[k]\} = \mathcal{S}_d\left\{\sum_{l=-\infty}^{\infty} u[l]\,\delta[k-l]\right\} = \sum_{l=-\infty}^{\infty} u[l]\,\mathcal{S}_d\{\delta[k-l]\} \;\;\rightarrow\;\; y[k] = \sum_{l=-\infty}^{\infty} u[l]\, h[k-l] = \sum_{l=-\infty}^{\infty} h[l]\, u[k-l]\,. \tag{4.20}$$
In both cases, this result shows that an LTI system responds to a complex exponential
with the same input multiplied by a complex number (λd or λc). This number depends
solely on the complex frequencies z ∈ ℂ and s ∈ ℂ. Therefore, these special input
functions are eigenfunctions of the LTI operators, and their complex eigenvalues are
$$\lambda_c(s) = H(s) = \int_{-\infty}^{\infty} h(\tau)\, e^{-\tau s}\,d\tau\,, \tag{4.23a}$$
$$\lambda_d(z) = H(z) = \sum_{l=-\infty}^{\infty} h[l]\, z^{-l}\,. \tag{4.23b}$$
These are, respectively, the Laplace transform and the Z-transform of the impulse
response. These are the transfer functions of the LTI systems and link input and output
in the complex frequency domain. For time-varying or nonlinear systems, the notion
of transfer function is not useful. Finally, note that even for discrete systems, the
transfer function is a continuous function of z.
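The eigenfunction property is easy to check numerically: feeding a complex exponential through the convolution sum returns the same exponential scaled by the transfer function evaluated at that complex frequency. A minimal sketch (our own illustration, with an arbitrary FIR impulse response):

```python
import numpy as np

h = np.array([0.5, 0.3, 0.2])             # impulse response of a stable FIR system
z = 0.9 * np.exp(1j * 0.4)                # a complex frequency z = rho * e^{j theta}

k = np.arange(20)
u = z ** k                                # complex-exponential input u[k] = z^k

# Convolution sum y[k] = sum_l h[l] u[k-l] (u is taken as zero for k < 0)
y = np.convolve(u, h)[: len(k)]

# Transfer function H(z) = sum_l h[l] z^{-l} evaluated at this z, as in (4.23b)
H = np.sum(h * z ** -np.arange(len(h)))

# Once the start-up transient has passed, the output is the input scaled by H(z)
assert np.allclose(y[len(h):], H * u[len(h):])
```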
In most applications of interest, signals are assumed to be null at time t < 0. This is
important for the impulse response of causal systems, in which the impulse
response is h(t) = 0 for t < 0 and h[k] = 0 for k < 0. This means that no output
can be produced before the input, and hence the system is not anticipatory. Models
of physical systems and online data processing must be causal. On the other hand,
many data processing schemes operating off-line are not causal (e.g., zero-phase filters
described in Section 4.8).
As anticipated in the previous exercise, the convolution integral and summations in
(4.19)–(4.20) for causal systems become
$$y(t) = \mathcal{S}_c\{u(t)\} = \int_0^t h(\tau)\, u(t-\tau)\,d\tau = \int_0^t u(\tau)\, h(t-\tau)\,d\tau\,, \tag{4.25a}$$
$$y[k] = \mathcal{S}_d\{u[k]\} = \sum_{l=0}^{k} h[l]\, u[k-l] = \sum_{l=0}^{k} u[l]\, h[k-l]\,. \tag{4.25b}$$
The upper limit is replaced by t or k since signals and impulse responses are null
for τ > t or l > k, and the lower one is replaced by 0 since both are null for t < 0 or
l < 0. The continuous and discrete transfer functions in (4.23) become
$$\lambda_c(s) = H(s) = \int_{0^-}^{\infty} h(\tau)\, e^{-\tau s}\,d\tau \qquad \text{and} \qquad \lambda_d(z) = H(z) = \sum_{n=0}^{\infty} h[n]\, z^{-n}\,. \tag{4.26}$$
In the continuous case, t = 0 must be included in the integration; hence the lower
bound is tuned to accommodate for any peculiarity occurring at t = 0 (notably an
impulse). Nevertheless, to avoid the extra notational burden, the minus subscript in
the lower bound is dropped in what follows.
Finally, another important class of interest is that of stable systems. The stability
analysis of complex systems is a broad topic (see Chapters 10 and 13). Here, we limit
the focus to bounded-input/bounded-output (BIBO) stability. A system is BIBO stable
if its response to any bounded input is a bounded output. This requires that the impulse
response of continuous and discrete signals satisfies
$$\int_0^\infty |h(t)|\,dt < \infty \qquad \text{and} \qquad \sum_{k=0}^{\infty} |h[k]| < \infty\,. \tag{4.27}$$
In Section 4.5, we have seen that complex exponentials are eigenfunctions of LTI
systems. Great insight into a system's behavior can be obtained by projecting the
input–output relation onto the system's eigenfunctions. The projection of signals onto
complex exponentials leads to the Laplace transform in the continuous domain and
the Z transform in the discrete domain. This section is divided into four subsections.
We start with some definitions.
Given a continuous signal u(t), the Laplace transforms are
$$U(s) = \mathcal{L}\{u(t)\} = \int_{-\infty}^{+\infty} u(t)\, e^{-st}\,dt = \int_{0}^{+\infty} u(t)\, e^{-st}\,dt\,. \tag{4.28}$$
The first integral is the bilateral transform; the second is the unilateral transform.
As we here focus on causal signals (i.e., u(t < 0) = 0), these are identical. Neverthe-
less, these are different tools required for different purposes: the first is suitable for
infinite duration signals for which it can be linked to the Fourier Transform (Section
4.7); the second is developed for solving initial value problems, as it naturally handles
initial conditions.
These integrals converge, and hence the Laplace transforms exist, if u(t)e−st → 0
for t → ±∞ (only t → +∞ for the unilateral). This requires that the signal is of
exponential order, that is, grows more slowly than a multiple of some exponential:
|u(t)| ≤ Meαt . If this is the case, the range of values R{s} > α is the region of
convergence (ROC) of the transform.
The reader is referred to Beerends et al. (2003), Wang (2009), Hsu (2013) for a
review of all the properties of the Laplace transform; we here focus on the key oper-
ations enabled by this powerful tool and we omit formulation of the inverse Laplace
transform, as it requires notions of complex variables theory that are out of the scope of
this chapter. The Python script EX3.PY for solving Exercise 3 provides the commands
to compute both the transform and its inverse using the Python library SymPy9.
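As a hedged sketch of what such a computation looks like (we only assume SymPy's standard laplace_transform and inverse_laplace_transform functions, applied here to an arbitrary causal exponential):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

# Laplace transform of a causal exponential u(t) = exp(-2 t)
U, roc, cond = sp.laplace_transform(sp.exp(-2 * t), t, s)
print(U)        # 1/(s + 2); `roc` reports the convergence abscissa

# Inverting the transform recovers the signal (times the step function)
u_back = sp.inverse_laplace_transform(U, s, t)
print(u_back)   # exp(-2*t)*Heaviside(t)
```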
The key property of interest in this chapter is that of time derivation, which can be
easily demonstrated using integration by parts. The bilateral (L b ) and unilateral (L u )
transforms of a time derivative are
$$\mathcal{L}_b\{u'(t)\} = s\,U(s) \qquad \text{and} \qquad \mathcal{L}_u\{u'(t)\} = s\,U(s) - u(0)\,. \tag{4.29}$$
That is, differentiation in the time domain corresponds to multiplication by s in the
frequency domain; similarly, one can show that integration in the time domain corresponds
to division by s. Notice that no distinction between $\mathcal{L}_b$ and $\mathcal{L}_u$ is needed for a
system initially at rest; note, however, that the bilateral transform cannot handle initial conditions.
Finally, compare the inner product in (4.28) with (4.8), taking a(t) = u(t). Note
that the Laplace transform is a projection of the signal u(t) onto an exponential basis
$b_L(t, s) = e^{-(\sigma + j\omega)t}$. It is left as an exercise to show that the Laplace basis is not
orthogonal unless σ = 0. This is the basis of the continuous Fourier transform.
Z Transform.
Given a discrete and causal signal u[k], the Z transforms are
$$U(z) = \mathcal{Z}\{u[k]\} = \sum_{k=-\infty}^{+\infty} u[k]\, z^{-k} = \sum_{k=0}^{\infty} u[k]\, z^{-k}\,. \tag{4.30}$$
With $e^{-s t_k} = e^{-s k \Delta t} = z^{-k}$, where $z = e^{s \Delta t}$, the Laplace transform of this discrete signal is
$$\mathcal{L}\{u[k]\} = \int_{-\infty}^{\infty} \sum_{l=-\infty}^{\infty} u[l]\,\delta(t - l\Delta t)\, e^{-st}\,dt = \int_{-\infty}^{\infty} \sum_{l=-\infty}^{\infty} u[l]\,\delta(t - l\Delta t)\, z^{-l}\,dt = \sum_{l=-\infty}^{\infty} u[l]\, z^{-l} \int_{-\infty}^{\infty} \delta(t - l\Delta t)\,dt = \mathcal{Z}\{u[k]\}\,. \tag{4.33}$$
Figure 4.3 The conformal mapping linking the Laplace and the Z transforms.
$$Y(z) := \mathcal{Z}\{y[k]\} = \mathcal{Z}\left\{\sum_{l=0}^{k} h[l]\, u[k-l]\right\} = H(z)\, U(z)\,. \tag{4.34b}$$
Differential Equations.
The general form of the LCCDE of a continuous SISO LTI system with input u(t) and
output y(t) reads
$$\sum_{n=0}^{N_f} a_n\, y^{(n)}(t) = \sum_{n=0}^{N_b} b_n\, u^{(n)}(t)\,, \tag{4.35}$$
where $N_f \ge N_b$ and $N_f$ is the order of the system11. The coefficients $a_n$ are called feedback
coefficients; the coefficients $b_n$ are feedforward coefficients.
The LCCDE provides an implicit representation of a system since the input–
output relation can be revealed only by solving the equation. Introducing the Laplace
transform in a LCCDE is an operation similar to the Galerkin projection underpinning
reduced-order modeling (ROM, see Chapters 1 and 14). Recalling that the Laplace
transform is a projection onto the basis b L (t), (4.35) leads to:
$$\left\langle \sum_{n=0}^{N_f} a_n\, y^{(n)}(t),\ b_L(t) \right\rangle = \left\langle \sum_{n=0}^{N_b} b_n\, u^{(n)}(t),\ b_L(t) \right\rangle \;\rightarrow\; \sum_{n=0}^{N_f} a_n\, \mathcal{L}\{y^{(n)}(t)\} = \sum_{n=0}^{N_b} b_n\, \mathcal{L}\{u^{(n)}(t)\} \;\rightarrow\; Y(s) \sum_{n=0}^{N_f} a_n s^n = U(s) \sum_{n=0}^{N_b} b_n s^n\,, \tag{4.36}$$
and hence to the transfer function
$$H(s) = \frac{Y(s)}{U(s)} = \frac{\sum_{n=0}^{N_b} b_n s^n}{\sum_{n=0}^{N_f} a_n s^n} = \frac{b_{N_b} \prod_{n=1}^{N_b} (s - z_n)}{a_{N_f} \prod_{n=1}^{N_f} (s - p_n)}\,.$$
The transfer function of LTI systems is a polynomial rational function of s, with the
coefficients of the polynomials being the coefficients of the LCCDE. In the factorized
form, zn and pn are, respectively, the zeros and the poles of the system12. Note that
since the coefficients an, bn are real, the zeros and poles are either purely real or appear
in complex conjugate pairs. These coefficients have a straightforward connection with
the LCCDE, which can immediately be recovered from the transfer function. The
zeros zn are associated with inputs $e^{z_n t}$ for which the transfer function is null and which
thus produce no output; the poles pn correspond to resonances, inputs $e^{p_n t}$ for which
the transfer function is infinite, leading to the blowup of the system.
The poles are the eigenvalues of the matrix A advancing a linear system in its state-
space representation (see Chapters 10 and 12). In a stable system, poles are located in
regions of the s-plane that are “not accessible” by any input, that is, outside the ROC
of the transfer function. Defining the ROC of H(s) as R{s} > α, and observing that
the poles are by definition outside the ROC, stability is guaranteed if α = 0, that is, if
the ROC includes the imaginary axis. This is equivalent to imposing that all the poles
are located in the left half of the s-plane, that is, R{pn} < 0 for all n. This result
can also be derived from the BIBO stability condition in (4.27).
Difference Equations.
In the discrete case, the general form of the LCCDE associated with SISO LTI systems
with input u[k] and output y[k] reads
$$\sum_{n=0}^{N_f} a_n\, y[k-n] = \sum_{n=0}^{N_b} b_n\, u[k-n]\,, \quad \text{that is,} \quad y[k] = \sum_{n=0}^{N_b} b^*_n\, u[k-n] - \sum_{n=1}^{N_f} a^*_n\, y[k-n]\,. \tag{4.37}$$
12 A transfer function that has more zeros than poles (i.e., $N_b > N_f$) is said to be improper. In this case, $\lim_{s\to\infty} |H(s)| = +\infty$, which violates stability: this implies that at large frequencies, a finite input can produce an infinite output. Moreover, after the polynomial division, the transfer function brings polynomial terms in s. The inverse Laplace transforms of these are (generalized) derivatives of the delta function; hence the corresponding impulse response $h(t) = \mathcal{L}^{-1}\{H(s)\}$ violates causality.
The order of the system13 is max(Nb, Nf). The form on the right plays a fundamental
role in filter implementation, time-series analysis, and system identification and is
known as the recursive form of the difference equation. Note that the feedback and
feedforward coefficients in the recursive form are simply $a^*_n = a_n/a_0$ and $b^*_n = b_n/a_0$,
respectively; hence the coefficient $b^*_0$ is the direct feedthrough gain of the system.
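The recursive form translates directly into a few lines of code. The following is a minimal sketch (the function name and test system are our own; SciPy's lfilter implements the same recursion more efficiently):

```python
import numpy as np

def lccde_filter(b, a, u):
    """Recursive solution of a difference equation in the form (4.37):
    y[k] = sum_n (b[n]/a[0]) u[k-n] - sum_{n>=1} (a[n]/a[0]) y[k-n]."""
    b = np.asarray(b, float) / a[0]       # feedforward coefficients b*_n
    a = np.asarray(a, float) / a[0]       # feedback coefficients a*_n (a*_0 = 1)
    y = np.zeros(len(u))
    for k in range(len(u)):
        for n in range(len(b)):
            if k - n >= 0:
                y[k] += b[n] * u[k - n]
        for n in range(1, len(a)):
            if k - n >= 0:
                y[k] -= a[n] * y[k - n]
    return y

# Example: a first-order low-pass y[k] = 0.1 u[k] + 0.9 y[k-1]
u = np.ones(50)                           # step input
y = lccde_filter(b=[0.1], a=[1.0, -0.9], u=u)
print(y[-1])                              # approaches the DC gain 0.1/(1-0.9) = 1
```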
As for the continuous case, projecting (4.37) onto the Z basis b Z (t) via the Z
transform and using (4.31) yields the transfer function of a discrete system:
$$\left\langle \sum_{n=0}^{N_f} a_n\, y[k-n],\ b_Z[k] \right\rangle = \left\langle \sum_{n=0}^{N_b} b_n\, u[k-n],\ b_Z[k] \right\rangle \;\rightarrow\; \sum_{n=0}^{N_f} a_n\, \mathcal{Z}\{y[k-n]\} = \sum_{n=0}^{N_b} b_n\, \mathcal{Z}\{u[k-n]\} \;\rightarrow\; Y(z) \sum_{n=0}^{N_f} a_n z^{-n} = U(z) \sum_{n=0}^{N_b} b_n z^{-n}\,, \tag{4.38}$$
and hence
$$H(z) = \frac{Y(z)}{U(z)} = \frac{\sum_{n=0}^{N_b} b_n z^{-n}}{\sum_{n=0}^{N_f} a_n z^{-n}} = \frac{b_0}{a_0}\, \frac{\prod_{n=1}^{N_b} (1 - \zeta_n z^{-1})}{\prod_{n=1}^{N_f} (1 - \pi_n z^{-1})}\,,$$
where ζn and πn are, respectively, the zeros and poles of the discrete transfer function.
Observe that the factored form of the discrete transfer function is usually given in
terms of polynomials of z −1 rather than z.
The link between zeros and poles in the continuous and discrete domains is given by
the conformal mapping in Figure 4.3. In the absence of inputs, the poles control the
evolution of a linear system from its initial condition (i.e., the homogeneous solution
of the LCCDE). The Dynamic Mode Decomposition (DMD) introduced in Chapter 7
is a powerful tool to identify the poles πn of a system from data, and to build linear
reduced-order models by projecting the data onto the basis of eigenfunctions $\pi_n^k$.
Finally, in analogy with the continuous case, a discrete system is stable if its poles
are outside the ROC of the transfer function. Defining the ROC of H(z) as |z| ≥ α,
one sees that this occurs if α = 1: the ROC includes the unit circle and hence all the
poles have |πn| < 1. This can be derived from the BIBO stability condition in (4.27).
Consider the system in Exercise 2. Compute the transfer function and the
system output from the frequency domain, then identify the LCCDE governing
the input–output relation. Then, assuming that a discrete system is obtained by
sampling the continuous one, derive a recursive formula that mimics the
input–output link of the continuous system. Test your result for sampling
frequencies of fs = 3 Hz and fs = 10 Hz.
13 Note that in (4.37) the restriction Nb ≥ N f is not needed to enforce causality: by construction, the
output y[k] only depends on past information.
BIBO stability guarantees that the output produced by a stationary input is also
stationary. It is thus interesting to consider only the portion of the complex planes s and
z associated with infinite duration signals, that is, s = jω and z = e^{jθ}. These correspond
to harmonic eigenvalues of the LTI system and hence lead to a harmonic response. From
Laplace and Z transforms, we move to continuous and discrete Fourier transforms
in Section 4.7.1. A system that manipulates the harmonic content of a signal is a
filter; these are introduced in Section 4.7.2 along with their fundamental role in multi-
resolution decompositions.
$$U(j\omega) = \mathcal{F}\{u(t)\} = \int_{-\infty}^{+\infty} u(t)\, e^{-j\omega t}\,dt\,, \tag{4.39a}$$
$$U(j\theta) = \mathcal{F}_D\{u[k]\} = \sum_{k=-\infty}^{+\infty} u[k]\, e^{-j\theta k}\,. \tag{4.39b}$$
These are the continuous-time (CT) and the discrete-time Fourier transforms (DTFT).
Both are continuous functions, with the second being periodic with period 2π because
of the conformal mapping introduced in Figure 4.3. Comparing these to (4.28) and
(4.30) shows that the bilateral Laplace and Z transforms are the Fourier transforms
of u(t)e−σt and u[k]ρ−k . Without these exponentially decaying modulations, the
conditions for convergence are more stringent: signals must be absolutely integrable
and absolutely summable14 .
The main consequence is that infinite duration stationary signals do not generally
admit a Fourier transform. This explains why the manipulations of these signals by
an LTI system are better investigated in terms of some of their statistical properties,
such as autocorrelation or autocovariance, as illustrated in Section 4.8. A special
exception are periodic signals for which (4.39a) and (4.39b) lead to Fourier series,
and the problem of convergence becomes less stringent.
In stable continuous and discrete LTI systems satisfying (4.27), the impulse
response always admits a Fourier transform, which can be obtained by replacing s = jω
and z = e^{jθ} in the transfer function. This leads to frequency transfer functions, which
are complex functions of real numbers15 (ω or θ), customarily represented by plotting
log(|H(x)|) and arg(H(x)) versus log(x) in a Bode plot, with x = ω or x = θ.
14 This condition is sufficient but not necessary: some functions that are not absolutely integrable do admit a Fourier transform. Important examples are the constant function u(t) = 1 and the step function uS(t). Moreover, note that the Fourier transform can be obtained from the Laplace and Z transforms only for signals that are absolutely integrable or summable. For instance, the Laplace transform of $e^{\alpha t}$ with α > 0 has ROC R{s} > α, while the Fourier transform does not exist.
15 These are often called real frequencies as opposed to the complex frequencies s and z.
The modulus of the frequency transfer function is the amplitude response; its argument
is the phase response.
If the Fourier transform (or series) exists for both inputs and outputs, the properties
of the Laplace and Z transforms apply: the harmonic content of the output is Y(jω) =
H(jω)U(jω) in the continuous domain and Y(jθ) = H(jθ)U(jθ) in the discrete one.
Discrete signals of finite duration, $u[k] = u(t_k)$ with $k \in [0, n_t - 1]$, collected in a vector $\mathbf{u} \in \mathbb{R}^{n_t \times 1}$, are
usually extended to infinite duration signals assuming periodic boundary conditions.
The frequency domain is thus discretized into bins $\theta_n = n \Delta\theta$ with $n \in [0, n_f - 1]$
and $\Delta\theta = 2\pi/n_f$. The mapping to the continuous frequency domain, from Figure 4.3,
gives $f_n = \omega_n/(2\pi) = n f_s/n_f$, with $f_s = 1/\Delta t$ the sampling frequency. With both time
and frequency domains discretized, the Fourier pair is usually written as
$$U[n] = \frac{1}{\sqrt{n_t}} \sum_{k=0}^{n_t-1} u[k]\, e^{-2\pi j \frac{n}{n_f} k} \qquad \Longleftrightarrow \qquad u[k] = \frac{1}{\sqrt{n_t}} \sum_{n=0}^{n_f-1} U[n]\, e^{2\pi j \frac{n}{n_f} k}\,. \tag{4.40}$$
The equations in (4.40) are, respectively, the discrete Fourier transform (DFT) and
its inverse. Note that the normalization $1/\sqrt{n_t}$ is used for later convenience: we see
in Chapter 8 that (4.40) can be written as matrix multiplications with the columns
of the matrix being orthonormal vectors. Finally, if $n_f = n_t$ and $n_t$ is a power of 2,
this multiplication can be performed using the famous FFT (fast Fourier transform)
algorithm (see Loan 1992), reducing the computational cost from $n_t^2$ to $n_t \log_2(n_t)$.
An excellent review of the DFT is provided by Smith (2007b).
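The orthonormal convention of (4.40) differs from the unnormalized FFT of most libraries only by the $1/\sqrt{n_t}$ factor. A short NumPy sketch (our own illustration) verifying invertibility and energy preservation:

```python
import numpy as np

nt = 64
k = np.arange(nt)
u = np.sin(2 * np.pi * 5 * k / nt) + 0.5   # a 5-cycle sine plus a mean (bin n = 0)

# Orthonormal DFT as in (4.40): the 1/sqrt(nt) scaling makes the transform unitary
U = np.fft.fft(u) / np.sqrt(nt)
u_back = np.fft.ifft(U) * np.sqrt(nt)
assert np.allclose(u, u_back.real)

# A unitary transform preserves the signal energy (Parseval's theorem)
assert np.isclose(np.sum(np.abs(u) ** 2), np.sum(np.abs(U) ** 2))
```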
Consider the discrete system derived in Exercise 3, but now assume that the
static gain is unitary. Compute the frequency transfer function of this system
and show that this can be seen as a low-pass filter. Study how the frequency
response changes if the coefficients a1 or a2 are set to zero. Finally, derive
the system that should have the complementary transfer function and show its
amplitude response. Is this response also complementary?
In MRA, filters are used to decompose signals. While the DFT represents a signal as
a linear combination of harmonics, MRA represents it as a combination of frequency
bands called scales. A packet of similar frequencies can be assembled into bases called
wavelets, hence the connection to Chapter 5.
The MRA partitions the spectrum of a signal into M scales, each taking a portion of
the signal's content in the bands [0, f1], [f1, f2], ..., [fM−1, fs/2]. In Chapter 8, these will
be identified by a frequency splitting vector FV = [f1, f2, f3, ..., fM−1].
The MRA of a discrete signal can be written as
$$u[k] = \sum_{m=1}^{M} s_m[k] = \mathcal{F}^{-1}\left\{\sum_{m=1}^{M} H_m(f_n)\, U(f_n)\right\} \qquad \text{with} \qquad \sum_{m=1}^{M} |H_m(f_n)| = 1\,, \tag{4.41}$$
where sm is the portion of the signal in the scale m, within the frequency range
fn ∈ [ fm−1, fm ] and Hm ( fn ) is the transfer function of the filter that isolates that
portion. Therefore, |Hm ( fn )| ≈ 1 for fn ∈ [ fm−1, fm ] and |Hm ( fn )| ≈ 0 otherwise.
The assumption on the right enables a lossless decomposition.
The MRA requires the definition of one low-pass filter for the range [0, f1 ], one
high-pass filter for the range [ fM−1, fs /2], and M − 2 band-pass filters. Because these
are complementary, all these filters can be obtained from a set of low-pass filters, as
described at the end of this section. Therefore, to learn MRA, one should first learn
how to construct a low-pass filter with a given cutoff frequency fc .
We now focus on the two main families of filters and the most common design
methods. Let us consider a specific example, with fs = 2 kHz and fc = 200 Hz. In the
digital frequency domain, we map the sampling frequency to θs = 2π and the cutoff
to θc = π/5.
The transfer function of the ideal low-pass filter is
$$H_{id}(j\theta) = \begin{cases} e^{-j\alpha_d \theta} & \text{if } |\theta| \le \theta_c \\ 0 & \text{if } \theta_c < |\theta| \le \pi\,. \end{cases} \tag{4.42}$$
Figure 4.4 (a): Absolute value of the frequency response of an ideal (black continuous), an IIR
(red dash-dotted), and an FIR (blue dashed) low-pass filter. (b): Impulse responses associated with
the frequency responses on the left. In the case of the FIR, the response is shifted backward by αd.
Oppenheim & Schafer 2009). These filters have no zeros and N poles, with N the filter
order, equally spaced around the unit circle.
Once these poles are computed, the continuous frequency response function can
be readily obtained in its factor form and the last step consists in identifying the
associated recursive formula as in Exercise 4. However, note that mapping from s to z
is usually performed using the bilinear transform16 rather than the standard mapping
z = e∆ts that is used in Exercise 4, since this has the advantage of mapping fs /2 to
π and thus prevents aliasing. The nonlinearity in the bilinear transform results in a
wrapping of higher frequencies so the correct cutoff frequency should first be pre-
warped to account for the distortion in the frequency calculation17 .
Software packages such as SciPy in Python or MATLAB offer dedicated functions
(butter) to design a Butterworth filter with a given order and cutoff frequency
(see the Python script EX5.PY).
The red curves in Figure 4.4 show the amplitude response (a) and the impulse
response (b) of a Butterworth filter of order 11. The main advantage of these filters is their
ability to approximate the ideal filter well using a limited order, which requires
storing only a few coefficients in their recursive formulation. On the other hand, these filters
tend to become unstable as the order increases (and the poles approach the unit circle).
Moreover, their phase delay is generally not constant, and this potentially introduces
phase distortion. Finally, note that since the impulse response of these filters is infinite,
these cannot be implemented in the time domain via simple convolution, but via the
recursive solution of the filter’s LCCDE.
16 which reads
$$s = \frac{2}{\Delta t}\, \frac{1 - z^{-1}}{1 + z^{-1}} \qquad \longleftrightarrow \qquad z = \frac{1 + \Delta t\, s/2}{1 - \Delta t\, s/2}\,.$$
17 The pre-warp can be achieved using $f_c' = \frac{f_s}{\pi} \tan\left(\pi f_c / f_s\right)$. Therefore, if the desired cutoff frequency is fc = 200 Hz with a sampling frequency fs = 1000 Hz, the filter should target a cutoff frequency of $f_c' = 231.26$ Hz to compensate for the warping due to the bilinear transform.
Note that because of the linearity of the convolution, the impulse responses of complementary
high-pass ($h_H$) and low-pass ($h_L$) filters are linked by18 $\delta[n] = h_L[n] + h_H[n]$.
Finally, we close with the practical implementation of MRA in “off-line” conditions,
for which it is possible to relax the constraint of causality and use zero-phase filters.
These are usually implemented by operating on the signal twice (first on u[k] and then
on u[−k]) to cancel the phase delay of the operation. In SciPy and in
MATLAB this is performed using the function filtfilt.
If the phase delay is canceled, complementary filters can be computed by taking
differences of the frequency transfer functions (which become real functions). There-
fore, if the first scale with pass band [0, f1] is identified by the frequency transfer
function $H_1(f) = H_L(f, f_1)$, the second scale with pass band [f1, f2] is identified by a
filter with transfer function $H_2(f) = H_L(f, f_2) - H_L(f, f_1)$. The transfer function of
the general band-pass filter is $H_m(f) = H_L(f, f_m) - H_L(f, f_{m-1})$, while the last scale
is identified by the high-pass filter with $H_M(f) = 1 - H_L(f, f_{M-1})$.
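A minimal sketch of this construction using zero-phase Butterworth low-pass filters via filtfilt (the test signal and splitting vector are our own illustration); note that the telescoping differences make the reconstruction exact by construction, whatever low-pass filter is used:

```python
import numpy as np
from scipy import signal

fs = 1000.0
t = np.arange(0, 1, 1 / fs)
u = (np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 80 * t)
     + np.sin(2 * np.pi * 300 * t))

FV = [40.0, 150.0]                        # frequency splitting vector [f1, f2]

def lowpass(u, fc):
    """Zero-phase low-pass: filtfilt cancels the phase delay of the IIR filter."""
    b, a = signal.butter(5, fc, btype='low', fs=fs)
    return signal.filtfilt(b, a, u)

# Scales from differences of low-pass outputs: [0,f1], [f1,f2], [f2, fs/2]
s1 = lowpass(u, FV[0])
s2 = lowpass(u, FV[1]) - lowpass(u, FV[0])
s3 = u - lowpass(u, FV[1])

# The decomposition is lossless by construction: the scales sum back to u
assert np.allclose(u, s1 + s2 + s3)
```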
This set of cascaded filters is known as a filter bank and is at the heart of the pyramid
algorithm for computing the discrete wavelet transform (Strang 1996, Mallat 2009),
where it is combined with sub-sampling at each scale. The general architecture of this
decomposition is summarized in Figure 4.5. Observe that in the limit in which all the
frequency bands become unitary, the MRA becomes a DFT.
Figure 4.5 Pyramid-like algorithm to compute the MRA of a signal. Each band-pass scale is
computed as the difference of two low-pass filters and the terms in blue are preserved to form
the summation in (4.41). The graph on the right shows a pictorial representation of the
partitioning of the signals spectra.
18 Note that this is not the only method of obtaining a high-pass filter from a low-pass filter: another approach is to reverse the frequency response $H_L$, flipping it from left to right about the frequency fs/4 for f > 0 and from right to left about −fs/4 for f < 0 (see Smith 1997). The impulse response of the resulting high-pass filter is $h_H[n] = (-1)^n h_L[n]$. The two methods are equivalent if the cutoff frequency separating the transition bands is fs/4. This is the case encountered when performing MRA via dyadic wavelets, as discussed in Chapter 5.
LTI systems are the simplest model in time-series analysis and forecasting. In these
applications, treating signals and systems as fully deterministic is too optimistic, and
it is thus essential to consider stochastic signals: predictions have a certain probability
range (see Guidorzi 2003, Brockwell & Davis 2010). This section briefly reviews the
main features of stochastic signals and systems in Section 4.8.1. Section 4.8.2 reviews
the basic tools for forecasting, using classic linear regression. Only the discrete
domain is considered. More advanced techniques are discussed in Chapter 12.
The treatment of spectra requires some adaptation, as stochastic signals do not generally admit a Fourier
transform, and the focus must be placed on properties that are deterministic even in a
stochastic signal: the statistical properties.
Accordingly, the time-invariance in LTI systems is extended in terms of invariance
of the statistical properties. This is linked to the notion of stationarity. Stationarity
can be weak or strong. A stochastic signal v[k] is stationary in a strict sense (strong
stationarity) if its distributions remain invariant over time. Weak stationarity (or
stationarity in a wide sense) requires only that the time average µv and the autocorrelation
rvv[m] of the signal be time-invariant. These are defined as
$$\mu_v = E\{v[k]\} \qquad \text{and} \qquad r_{vv}[m] = E\{v[k]\, v[k+m]\}\,, \tag{4.46}$$
with E the expectation operator.
In the analysis of the LTI system’s response to stochastic signals, the link between
a specific input and the corresponding output is not particularly interesting. Instead,
we focus on the link between the statistical properties of the input and the output. In
particular, we consider how the properties in (4.46) are manipulated. Let yv [k] be the
response of the system to the stochastic signal uv [k]. The expected (time average of
the) output µy is
$$\mu_y = E\left\{\sum_{l=-\infty}^{\infty} h[l]\, u_v[k-l]\right\} = \sum_{l=-\infty}^{\infty} h[l]\, E\{u_v[k-l]\} = \mu_v \sum_{l=-\infty}^{\infty} h[l]\,. \tag{4.47}$$
A similar derivation links the autocorrelations of input and output as $r_{yy}[k] = (r_{hh} * r_{uu})[k]$ (4.48), where
the sequence $r_{hh}[k]$ is the autocorrelation of the impulse response and the operation
is a convolution. In words: the autocorrelation of the output is the
convolution of the autocorrelation of the input with the autocorrelation of the impulse
response. This equation extends the convolution link in (4.21) to the autocorrelation
functions. These functions admit a Fourier transform, so the convolution theorem can
be used to see the link in the frequency domain:
$$R_{yy}(j\omega) = C_{hh}(j\omega)\, R_{uu}(j\omega)\,, \tag{4.49}$$
where $R_{yy}(j\omega)$, $C_{hh}(j\omega)$, and $R_{uu}(j\omega)$ are the Fourier transforms of $r_{yy}[k]$, $r_{hh}[k]$, and
$r_{uu}[k]$, respectively. These are the power spectral densities of y, h, and u. Hence we
see that an LTI system acts on the frequency content of the autocorrelation function of
a stochastic signal.
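The first of these statistical links, (4.47), is easy to verify by simulation. A minimal sketch (our own illustration, with an arbitrary impulse response and a white-noise input with nonzero mean):

```python
import numpy as np

rng = np.random.default_rng(0)
h = np.array([0.5, 0.3, 0.2])             # impulse response of an LTI system

mu_v = 2.0
v = mu_v + rng.standard_normal(200_000)   # stationary input with mean mu_v

y = np.convolve(v, h, mode='valid')       # system output

# (4.47): the output mean equals the input mean times the sum of h
print(y.mean(), mu_v * h.sum())           # both ~2.0
```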
This chapter reviewed the fundamentals of signals and systems and presented LTI
systems in the SISO case. We have seen that the input–output relation can be derived
from knowledge of the impulse response of a system via the convolution integral.
It was shown that complex exponentials are eigenfunctions of these systems and
that important transforms can be derived by projecting input and output signals onto
these eigenfunctions. In the eigenspace of the LTI systems, convolutions become
multiplications.
Spectral analysis is the cornerstone of some of the most celebrated turbulent flow the-
ories. When focusing on the identification and modeling of coherent flow patterns, on
the other hand, the need of localizing the spectral content in time or space often arises.
Time–frequency analysis encompasses an arsenal of techniques aimed at localizing the
spectral content of a signal in time (although for velocity fields we could equally refer
to space and spatial frequencies). This chapter gives an overview of techniques for
time–frequency analysis. We start the journey with the windowed Fourier transform
(WFT) in continuous and discrete form, and introduce the concepts of shifting and
stretching to localize a variety of scales. The bounds imposed by the uncertainty
principle will clearly arise. Then, we adapt to those bounds introducing the continuous
wavelet transform (CWT) and its link to multi-resolution analysis (MRA). The
fundamentals of the discrete wavelet transform (DWT) and its intimate relation with
filter banks will be outlined. Finally, two applications are presented: a time–frequency
analysis of hot-wire data in a turbulent pipe flow using the WFT and the CWT, and filtering
and compression of velocity fields using a filter based on the DWT. MATLAB codes
for practicing with the examples are available on the book's webpage.1
5.1 Introduction
The Fourier transform (FT) of a function f(t) is an integral transform representing its
projection onto harmonic functions. From the inner product introduced in Chapter 4,
the FT is defined as follows:
$$F(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-j\omega t}\,dt\,. \tag{5.1}$$
The FT is a powerful and efficient technique to solve a wide variety of problems,
ranging from data analysis and filtering to image compression and communications
(Smith 2007b). Nonetheless, the FT has severe limitations when applied to
non-stationary signals. Indeed, the FT describes the signal’s frequency content, but
does not provide time localization, i.e., it does not determine the time in which a
given frequency occurs in the signal. This information is lost because of the infinite
integration bounds in (5.1) and the infinite support of the harmonic basis e−jωt .
1 www.datadrivenfluidmechanics.com/download/book/chapter5.zip
Consider, for example, the signal x(t) represented in Figure 5.1, with its (discrete)
FT normalized with respect to its maximum and represented as a function of the
frequency ν = ω/2π. The signal exhibits two different frequencies in well-localized
time intervals. Even though a clear signature of the two main frequencies is present in
the spectrum, no information is retained on the time localization of those frequencies.
Although this example might appear extreme at first glance, there is a wide range
of applications where the localization in time is relevant, for example, timescale
modification, sinusoidal modeling of musical compositions, cross-synthesis, and so
on. In fluid mechanics, the localization of scales in space-time has strong potential
implications in the identification and modeling of the flow dynamics (Farge 1992,
Schneider & Vasilyev 2010). For such cases, different implementations are sought,
possibly maintaining the appeal of computationally efficient algorithms such as the
fast Fourier transform (FFT, Cooley & Tukey 1965).
Chapter 4 introduced the limits of the FT and the MRA. This chapter focuses
on techniques for time–frequency analysis of signals, including a discussion of the
capabilities and limits of the WFT, and the wavelet-based multi-resolution methods.
The relation between the DWT and filter banks is also briefly discussed. The objective
is to introduce the fundamental concepts of time–frequency analysis, making no
pretence of being exhaustive. For a more comprehensive mathematical background, the
reader is referred to several excellent literature contributions on the topic (Strang 1989,
Daubechies 1990, Daubechies 1992, Cohen 1995, Strang & Nguyen 1996, Torrence
& Compo 1998, Van den Berg 2004, Mallat 2009, Kaiser 2010, Chui 1992).
Figure 5.1 Example of non-stationary signal (top) and corresponding normalized Fourier
spectrum (bottom).
where the overbar indicates the complex conjugate. The window function g(t) is often
selected to be a Gaussian function:
$$g(t) = \frac{1}{\alpha \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t}{\alpha}\right)^2}\,. \tag{5.3}$$
The parameter α can be used to stretch the window along the time axis. The window
can be translated over the time sequence with the shifting parameter τ to obtain the
temporal description.
The Gabor transform shares the same linearity properties as the FT and can be
inverted with the relation
$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} G(\tau, \omega)\, g(t-\tau)\, e^{j\omega\tau}\,d\omega\, d\tau\,. \tag{5.4}$$
$$x(t) = \begin{cases} \sin(2\pi\nu_1 t) & \text{if } 1 \le t \le 3 \\ 0.5\,\sin(2\pi\nu_2 t) & \text{if } 4 \le t \le 5 \\ 0 & \text{otherwise,} \end{cases} \tag{5.5}$$
with ν1 = 3 and ν2 = 10 being the frequencies of the two portions of the signal.
Figure 5.2 Example of application of the Gabor transform on the signal in (5.5). In both cases
α = 0.5, while τ = 2.0, 4.5 for the left and the right column, respectively. From top to bottom:
raw signal with superposed sliding window of the Gabor transform in thick black line;
premultiplied signal; Fourier spectrum of the signal premultiplied by the Gaussian kernel at
τ = 2.0, 4.5.
This is the same signal analyzed with FT in Figure 5.1. A Gaussian kernel with
α = 0.5 is here chosen. The selected kernel is expected to have sufficient width to
capture the low-frequency oscillations and to be reasonably well localized to identify
the high-frequency oscillations occurring on a shorter time segment. The Gabor kernel
is then centered at τ = 2.0 and τ = 4.5, i.e., at the center of the regions interested
by the two sinusoidal oscillations. Observing the spectrum of the Gabor transform
coefficients, the two frequencies are well localized both in the frequency and time
domain, with a single sharp peak in the spectrum at ν = 3 and ν = 10 at the two
selected time instants, respectively.
Figure 5.3 Graphical explanation of the different descriptions of the signal in the time domain,
in the frequency domain, and with the spectrogram obtained by the windowed Fourier transform.
n ∈ ℕ, with τ0 being a fundamental time unit. With the sampling frequency being νs,
the time grid is t = k∆t, with ∆t = 1/νs and k = 0, ..., N − 1. The corresponding
discrete Gabor transform is thus given by
$$G_{m,n} = \sum_{k=0}^{N-1} f_k\, g(k\Delta t - n\tau_0)\, e^{-j 2\pi m \nu_0 k \Delta t}\,, \tag{5.6}$$
Figure 5.4 Spectrogram of the signal described by (5.5) for two different values of the scaling
parameter α = 0.1 (left) and α = 10.0 (right).
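A compact sketch of the discrete Gabor transform applied to the signal in (5.5) is given below. The book's supplementary codes are in MATLAB; this Python version, with our own function name and time/frequency grids, is only illustrative:

```python
import numpy as np

def gabor_spectrogram(x, fs, alpha, taus, freqs):
    """Windowed Fourier transform with a Gaussian window of width alpha,
    evaluated on a grid of shifts `taus` [s] and frequencies `freqs` [Hz]."""
    t = np.arange(len(x)) / fs
    G = np.zeros((len(freqs), len(taus)), dtype=complex)
    for j, tau in enumerate(taus):
        g = np.exp(-0.5 * ((t - tau) / alpha) ** 2) / (alpha * np.sqrt(2 * np.pi))
        for i, nu in enumerate(freqs):
            G[i, j] = np.sum(x * g * np.exp(-2j * np.pi * nu * t)) / fs
    return np.abs(G)

# Signal of (5.5): 3 Hz in [1, 3] s, 10 Hz (half amplitude) in [4, 5] s
fs = 256.0
t = np.arange(0, 2 * np.pi, 1 / fs)
x = (np.where((t >= 1) & (t <= 3), np.sin(2 * np.pi * 3 * t), 0.0)
     + np.where((t >= 4) & (t <= 5), 0.5 * np.sin(2 * np.pi * 10 * t), 0.0))

freqs = np.arange(0.0, 20.0, 0.5)
S = gabor_spectrogram(x, fs, alpha=0.5, taus=np.array([2.0, 4.5]), freqs=freqs)
print(freqs[S.argmax(axis=0)])            # peaks near 3 Hz and 10 Hz
```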
Consider the signal x(t), t ∈ [0, 2π] described by (5.5) and sampled with N =
2 048 points.
1. Compute the spectrum of the signal using the DFT.
2. Compute the spectrogram of the signal using the WFT with Gaussian kernel
(as in (5.3)). Explore the effect of the standard deviation α of the kernel on
the time and frequency resolution.
Solution
The solution is provided in the form of MATLAB code in the supplementary
material. The lattice for time and frequency is set as τ0 = ∆t, ν0 = νs /N. The
reader can modify the script accordingly to generate Figures 5.1 and 5.4.
5.3.1 Fundamentals
As discussed in Section 5.2, the main shortcoming of the WFT is the compromise to
be sought between time and frequency resolution, i.e., the capability of localizing
features in the time and frequency domains. The resolution can be maximized by
choosing windows with the minimal area of the corresponding Heisenberg box
in the time–frequency plane. However, if our analysis is restricted to one single
window choice, then only one scale is selected (within the uncertainty of time–
frequency localization from the Heisenberg principle). To achieve a well-resolved
description in a set of scales, a process that uses a library of windows spanning
the time–frequency domain should be conceived. “Broad” windows, with excel-
lent frequency localization but poor temporal resolution, can be used to describe
the low-frequency part of the signal. For such scales, the time localization is
less relevant since low-frequency signals are by definition less localized in time.
Figure 5.5 Haar mother wavelet (a), with examples of scaled (b), shifted (c), and
scaled–shifted (d) versions.
“Narrow” windows, on the other hand, can deliver excellent time localization for the
high-frequency scales. This multi-resolution approach is the cornerstone of wavelet
theory.
The main ideas behind the wavelet concept are the scaling and shifting processes,
already introduced in the WFT. In wavelet theory, the starting point is a function
referred to as “mother wavelet” ψ(t), which is shifted along the signal to provide time
localization, and scaled to capture scales of different size. The family of wavelets is
thus generated as follows:
1 t−b
ψa,b (t) = √ ψ . (5.10)
a a
The variables a and b are, respectively, the scaling and shifting parameters. The
√
term 1/ a is introduced to obtain functions with unitary norm. The wavelet is then
shifted across the time domain and progressively scaled to create a collection of time–
frequency descriptions of the signal, i.e., a multi-resolution analysis.
An illustrative example of this process is shown in Figure 5.5. The earliest example
of a wavelet is the Haar wavelet, introduced by Alfred Haar (1911) as the
Haar sequence (the term wavelet appeared only several decades later).
Figure 5.6 Haar (top row) and Mexican-hat (bottom row) wavelets: time representation (left
column) and corresponding frequency spectra (right column).
It is defined as follows:
$$\psi(t) = \begin{cases} 1 & \text{if } 0 \le t < 1/2 \\ -1 & \text{if } 1/2 \le t < 1 \\ 0 & \text{otherwise.} \end{cases} \tag{5.11}$$
Figure 5.5 includes the mother wavelet ψ1,0, a scaled Haar wavelet ψ2,0 (i.e., with
scaling parameter a = 2, thus enlarging the support of the wavelet and, consequently,
the corresponding scale), a shifted Haar wavelet ψ1,1 (i.e., centered at t = 3/2, since
b = 1), and a scaled-shifted Haar wavelet ψ2,−1.
The Haar wavelet has the disadvantage of being discontinuous (although this is not
an issue for the discrete formulation); nonetheless, this turns out to be an advantage for
the description of signals with sharp changes. This feature is particularly appreciated,
for instance, in edge detection for image processing.
The Haar wavelet is highly localized, thus giving good time localization but poor
frequency resolution. This is a direct consequence of the uncertainty principle outlined
previously: a compact support in the time-domain results in a broadband frequency
spectrum (see Figure 5.6(b)).
Depending on the application and on the desired properties of time–frequency
resolution, there is a vast variety of mother wavelets already available in the literature
and in software toolboxes. A classic example is the Mexican-hat wavelet:
" 2#
2 t−b 2
− (t −b)
ψa,b (t) = √ 1− e a 2 . (5.12)
3aπ 1/4 a
The Mexican-hat wavelet and its FT are shown in Figure 5.6(c,d). The Gaussian
kernel, which has the optimal time–frequency bandwidth product in (5.7), enables the
best compromise for the localization in the time and frequency domains. From the
comparison, it is clear that while the spectrum of the Haar wavelet decays as $\nu^{-1}$,
the frequency spectrum of the Mexican-hat wavelet is much sharper.
Notice that, similarly to the Gabor transform (5.2), the CWT depends on two
parameters, the scale a and the shift b. Additionally, as in the WFT, a wide variety
of CWTs can be defined by selecting the proper wavelet ψ for the desired application.
The main constraint for the selection of the wavelet is the admissibility condition, i.e.,
∫ +∞
| ψ̂(ω)| 2
Cψ = dω < ∞, (5.14)
−∞ |ω|
The existence of the inverse CWT clearly relies on the admissibility of the wavelet,
i.e., Cψ < ∞.
It is important to underline here that we shifted from the time–frequency repre-
sentation of the Gabor transform to the timescale representation of the CWT. Here
two important remarks are needed. First, the relation between scales and frequency
might not be immediate, considering that often wavelets have irregular shape (see
for instance the Daubechies wavelets, Daubechies 1992). A simple method to extract
such relation has been proposed by Meyers et al. (1993). It is based on convolution
of the wavelet with a cosine wave, and searching for the scale a that maximizes
the correlation. This is a useful exercise for wavelets where a dominant frequency
is not immediately identifiable.
Figure 5.7 Graphical explanation of the difference between the time/frequency description of
the windowed Fourier transform (a) and the timescale description of the wavelet analysis (b).
where $\langle \cdot, \cdot \rangle$ is the inner product in $L^2$ and $\delta_{i,j}$ is the Kronecker delta
$$\delta_{i,j} = \begin{cases} 1 & i = j\,, \\ 0 & \text{otherwise.} \end{cases} \tag{5.19}$$
In the case of decomposition with orthogonal wavelets, a given function f(t) can be
uniquely defined by its discrete wavelet transform.
• $V_m \subset V_{m+1}, \ \forall m \in \mathbb{Z}$
• $\bigcup_{m=-\infty}^{+\infty} V_m$ spans $L^2(\mathbb{R})$
• $\bigcap_{m=-\infty}^{+\infty} V_m = \{0\}$, i.e., the only intersection between the subspaces is the null function
• if $f(x) \in V_m$, then $f(2x) \in V_{m+1}, \ \forall m \in \mathbb{Z}$
• there exists a function φ in V0, called a scaling function or father wavelet, such that φ(x − n) with n ∈ ℤ is an orthonormal basis of V0 with respect to the inner product in $L^2(\mathbb{R})$.
$$W_0 = \{ f \in V_1 : \langle f, g \rangle = 0 \ \ \forall g \in V_0 \}\,. \tag{5.24}$$
One simple solution of the previous equation is to set $a_n = (-1)^n c_{1-n}$. We can now
define the corresponding mother wavelet
$$\psi(x) = \sum_{n \in \mathbb{Z}} (-1)^n c_{1-n}\, \phi(2x - n)\,, \tag{5.27}$$
which generates the family of orthogonal wavelets corresponding to the MRA defined
earlier. Interestingly enough, if we interpret the scaling function as a low-pass filter,
multiplying its impulse response by $(-1)^n$ leads us to define its complementary high-
pass filter.
As discussed in Chapter 4, this is true because the dyadic construction in (5.22)
gives to these complementary filters equal portions of each scale’s spectra.
Figure 5.8 Conceptual sketch with Venn diagram of a three-level MRA. The spaces V0, V1 ,
and V2 of the MRA are indicated with the corresponding symbol located on their boundary.
The respective orthogonal complements W0 and W1 are indicated with the symbol located
within the corresponding domain.
with $I_A$ being the indicator function, equal to 1 for all elements belonging to A and 0
elsewhere, and n ∈ ℤ.
It can be easily shown that the functions $\phi_{0,n}$ are orthogonal, and that they span the
entire V0 with the set of integer translations $\phi_{0,n} = \phi(x - n)$. This function is the Haar
scaling function.
If we now consider the piecewise approximation at level V1, this is carried out by
functions that are piecewise constant over intervals of length 1/2. In this space, an
orthonormal basis is constituted by the functions $\phi_{1,n}(x) = \sqrt{2}\, I_{[n/2,(n+1)/2)}(x)$
(with the factor $\sqrt{2}$ accounting for the shorter support of the functions). Intuitively,
these functions can be obtained by scaling and shifting the function $\phi_{0,n}$, thus obtaining
$$\phi_{1,n}(x) = \sqrt{2}\, \phi(2x - n)\,. \tag{5.29}$$
This is the same result as (5.23) outlined earlier. From (5.23), for the Haar system
we obtain
$$c_n = \begin{cases} 1/\sqrt{2} & \text{if } n = 0, 1\,, \\ 0 & \text{otherwise.} \end{cases} \tag{5.30}$$
From (5.27), we can simply derive the corresponding mother wavelet, which is the
discrete form of the Haar wavelet ψ reported in (5.11). It is straightforward to show
that φ and ψ are orthogonal, as expected since φ ∈ V0 and ψ ∈ W0, which is
the orthogonal complement of V0.
Figure 5.9 Sketch of the computation process of the fast wavelet transform using filter banks.
This operation is essentially a change of basis. The first term on the RHS is referred
to as approximation (it is indeed the best approximation of f (t) at the resolution level
of V0 ), and the second term is the detail, i.e., the part of f that is missing to obtain f1 .
As a general result, extending this procedure to the subsequent levels,
$$f(t) = \sum_{n \in \mathbb{Z}} \alpha_{0,n}\, \phi_{0,n}(t) + \sum_{m \in \mathbb{N}} \sum_{n \in \mathbb{Z}} \beta_{m,n}\, \psi_{m,n}(t)\,, \tag{5.32}$$
i.e., each function f (t) can be expressed as the sum of the approximation at the
resolution level V0 and a sequence of details at the levels m ≥ 0.
This process is normally implemented as a two-channel multi-rate filter bank
(Mallat 1989), which convolves the signal with a low-pass filter g[n] and a high-
pass filter h[n]. The high-pass filter retains the details at the scale m, while the low-
pass filter contains the approximation. Both descriptions are subsampled by a factor
of 2; then the low-pass filtered signal is again passed through the high- and low-pass
filters to obtain the approximation and details at the subsequent scale. At the end of
the process, the DWT delivers an approximation at the level 0 and a set of details at
different scales. A sketch of this process is shown in Figure 5.9.
In order to establish the connection between DWT and filter banks, consider a
signal x[n] of N samples, and the simplest two-channel filter bank, composed of
a low-pass filter that performs a moving average of the values of the signal, and
a high-pass filter that computes the differences. The low-pass filter has coefficients
g[0] = g[1] = 1/2, while the high-pass filter has coefficients h[0] = −h[1] = 1/2. The
two filters separate the frequency components of the signal x into two bands, low and
high frequencies. In matrix form:
$$Gx = \frac{1}{2} \begin{bmatrix} 1 & 1 & & & \\ & 1 & 1 & & \\ & & 1 & 1 & \\ & & & \ddots & \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \end{bmatrix}, \tag{5.33}$$
$$Hx = \frac{1}{2} \begin{bmatrix} 1 & -1 & & & \\ & 1 & -1 & & \\ & & 1 & -1 & \\ & & & \ddots & \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \end{bmatrix}. \tag{5.34}$$
$$(\downarrow 2)\,G = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 & & & \\ & & 1 & 1 & \\ & & & & \ddots \end{bmatrix}, \tag{5.35}$$
$$(\downarrow 2)\,H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 & & & \\ & & 1 & -1 & \\ & & & & \ddots \end{bmatrix}. \tag{5.36}$$
This operation can be applied recursively, as outlined in Figure 5.9, i.e., the
low-pass filtered part is passed again through the low- and high-pass filter, and
downsampled, and so on. But how does this relate to wavelets and MRA? The answer
is simple: the MRA is performing exactly the same operation, filtering (through φ)
and downsampling. The reader should now start to glimpse the relation with the
Haar wavelet and the simple two-channel filter bank proposed here. From (5.23),
considering the coefficients in (5.30),
$$\phi(t) = \phi(2t) + \phi(2t-1)\,, \tag{5.37}$$
and, in general, for discrete signals and for the levels j and j − 1,
$$\phi_{j-1,n} = \frac{1}{\sqrt{2}} \left[\phi_{j,2n} + \phi_{j,2n+1}\right]. \tag{5.38}$$
For the signal x of N samples, this would result in N/2 coefficients, with the same
output as applying the process of filtering and downsampling (it can easily be checked
that the operation can be written in matrix form exactly as in (5.35)).
Similarly, from (5.27),
$$\psi_{j-1,n} = \frac{1}{\sqrt{2}} \left[\phi_{j,2n} - \phi_{j,2n+1}\right], \tag{5.39}$$
which is equivalent to (5.36).
This is Mallat's celebrated pyramid algorithm, which is based on decomposing a function at a certain resolution level as the sum of approximation and detail at a coarser resolution. One of the main advantages of this algorithm is the efficiency of its implementation. The fast wavelet transform requires a number of operations that scales linearly with the number of samples of the signal, thus surpassing in efficiency the FFT (whose complexity scales as N log₂(N)).
The inverse process (referred to as synthesis or reconstruction) is based on upsampling the signal, filling the gaps with zeros, and on convolution with the same filters.
with $x_W^{(1)} = [\alpha^{(1)}, \beta^{(1)}]^T$ (where, for the signal of this example, $\alpha^{(1)} = [5\sqrt{2}, 7\sqrt{2}, 2\sqrt{2}, 4\sqrt{2}]$ and $\beta^{(1)} = [-\sqrt{2}, 2\sqrt{2}, -\sqrt{2}, 0]$) and
\[
W^{(1)} = \frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & -1
\end{bmatrix}. \tag{5.42}
\]
Notice that the columns of $W^{(1)}$ are orthonormal and that, by construction, $W^{(1)} = [\,(\downarrow 2)G^T \;\; (\downarrow 2)H^T\,]$.
It is also worth evaluating the energy of the signal and how it is shared among the coefficients:
\[
E = \sum_{n=1}^{8} x_n^2 = 200, \qquad E_{\alpha^{(1)}} = 188, \qquad E_{\beta^{(1)}} = 12,
\]
i.e., 94% of the energy is contained in the approximation coefficients. This
is also a consequence of the signal having a nonzero mean. This means that
we can preserve 94% of the information of the signal by retaining only half
of the wavelet coefficients. This is the cornerstone of signal compression
using DWT.
2. To achieve a 3-level DWT, we have to apply recursively the same procedure
of the previous point to the approximation at each level. This leads to
\[
\alpha^{(2)} = [12, 6]; \qquad \beta^{(2)} = [-2, -2], \tag{5.43}
\]
\[
\alpha^{(3)} = [9\sqrt{2}]; \qquad \beta^{(3)} = [3\sqrt{2}]. \tag{5.44}
\]
The set of coefficients and the corresponding DWT matrix are as follows:
\[
x_W^{(3)} = [\alpha^{(3)}, \beta^{(3)}, \beta^{(2)}, \beta^{(1)}] = [9\sqrt{2},\; 3\sqrt{2},\; -2,\; -2,\; -\sqrt{2},\; 2\sqrt{2},\; -\sqrt{2},\; 0], \tag{5.45}
\]
\[
W^{(3)} =
\begin{bmatrix}
\tfrac{1}{\sqrt{8}} & \tfrac{1}{\sqrt{8}} & \tfrac12 & 0 & \tfrac{1}{\sqrt{2}} & 0 & 0 & 0 \\
\tfrac{1}{\sqrt{8}} & \tfrac{1}{\sqrt{8}} & \tfrac12 & 0 & -\tfrac{1}{\sqrt{2}} & 0 & 0 & 0 \\
\tfrac{1}{\sqrt{8}} & \tfrac{1}{\sqrt{8}} & -\tfrac12 & 0 & 0 & \tfrac{1}{\sqrt{2}} & 0 & 0 \\
\tfrac{1}{\sqrt{8}} & \tfrac{1}{\sqrt{8}} & -\tfrac12 & 0 & 0 & -\tfrac{1}{\sqrt{2}} & 0 & 0 \\
\tfrac{1}{\sqrt{8}} & -\tfrac{1}{\sqrt{8}} & 0 & \tfrac12 & 0 & 0 & \tfrac{1}{\sqrt{2}} & 0 \\
\tfrac{1}{\sqrt{8}} & -\tfrac{1}{\sqrt{8}} & 0 & \tfrac12 & 0 & 0 & -\tfrac{1}{\sqrt{2}} & 0 \\
\tfrac{1}{\sqrt{8}} & -\tfrac{1}{\sqrt{8}} & 0 & -\tfrac12 & 0 & 0 & 0 & \tfrac{1}{\sqrt{2}} \\
\tfrac{1}{\sqrt{8}} & -\tfrac{1}{\sqrt{8}} & 0 & -\tfrac12 & 0 & 0 & 0 & -\tfrac{1}{\sqrt{2}}
\end{bmatrix}. \tag{5.46}
\]
Again, we notice that 81% of the energy is in the approximation and 90% in the level 3 (i.e., in only 2 coefficients). Similarly, we retain 94% of the energy by retaining only the 3 coefficients with magnitude larger than or equal to 2√2.
3. The inverse transform can be carried out simply by inverting the previous
matrices. Nonetheless, it is instructive to carry out the process according to
the filter bank implementation, i.e., recursive upsampling and convolution.
For the 3-level decomposition, we first upsample including zeros, and then
use the filters.
\[
\alpha^{(3)}_{\uparrow 2} = [9\sqrt{2}, 0], \qquad \beta^{(3)}_{\uparrow 2} = [3\sqrt{2}, 0], \tag{5.47}
\]
\[
\hat{\alpha}^{(3)}_{\uparrow 2} = [9, 9], \qquad \hat{\beta}^{(3)}_{\uparrow 2} = [3, -3], \tag{5.48}
\]
\[
\alpha^{(2)} = \hat{\alpha}^{(3)}_{\uparrow 2} + \hat{\beta}^{(3)}_{\uparrow 2} = [12, 6]. \tag{5.49}
\]
At the next level,
\[
\alpha^{(2)}_{\uparrow 2} = [12, 0, 6, 0], \qquad \beta^{(2)}_{\uparrow 2} = [-2, 0, -2, 0], \tag{5.50}
\]
\[
\hat{\alpha}^{(2)}_{\uparrow 2} = [6\sqrt{2}, 6\sqrt{2}, 3\sqrt{2}, 3\sqrt{2}], \qquad \hat{\beta}^{(2)}_{\uparrow 2} = [-\sqrt{2}, \sqrt{2}, -\sqrt{2}, \sqrt{2}], \tag{5.51}
\]
\[
\alpha^{(1)} = \hat{\alpha}^{(2)}_{\uparrow 2} + \hat{\beta}^{(2)}_{\uparrow 2} = [5\sqrt{2}, 7\sqrt{2}, 2\sqrt{2}, 4\sqrt{2}]. \tag{5.52}
\]
Finally, for the last level,
\[
\alpha^{(1)}_{\uparrow 2} = [5\sqrt{2}, 0, 7\sqrt{2}, 0, 2\sqrt{2}, 0, 4\sqrt{2}, 0], \qquad \beta^{(1)}_{\uparrow 2} = [-\sqrt{2}, 0, 2\sqrt{2}, 0, -\sqrt{2}, 0, 0, 0], \tag{5.53}
\]
\[
\hat{\alpha}^{(1)}_{\uparrow 2} = [5, 5, 7, 7, 2, 2, 4, 4], \qquad \hat{\beta}^{(1)}_{\uparrow 2} = [-1, 1, 2, -2, -1, 1, 0, 0], \tag{5.54}
\]
\[
x = \hat{\alpha}^{(1)}_{\uparrow 2} + \hat{\beta}^{(1)}_{\uparrow 2} = [4, 6, 9, 5, 1, 3, 4, 4]. \tag{5.55}
\]
The interesting exercise of carrying out the inverse DWT after thresholding the wavelet coefficients is left to the reader.
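As a numerical cross-check of this worked example, the following plain-NumPy sketch repeats the three-level Haar analysis and the synthesis of (5.47)–(5.55); the printed values match α⁽³⁾, β⁽³⁾, β⁽²⁾, β⁽¹⁾ and the recovered signal:

```python
import numpy as np

x = np.array([4.0, 6.0, 9.0, 5.0, 1.0, 3.0, 4.0, 4.0])

details, approx = [], x.copy()
for _ in range(3):                     # 3-level Haar analysis
    details.append((approx[0::2] - approx[1::2]) / np.sqrt(2.0))
    approx = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)

print(approx)       # [12.7279...] = 9*sqrt(2)            -> alpha(3)
print(details[2])   # [ 4.2426...] = 3*sqrt(2)            -> beta(3)
print(details[1])   # [-2, -2]                            -> beta(2)
print(details[0])   # [-sqrt(2), 2*sqrt(2), -sqrt(2), 0]  -> beta(1)

for detail in reversed(details):       # synthesis, cf. (5.47)-(5.55)
    nxt = np.empty(2 * approx.size)
    nxt[0::2] = (approx + detail) / np.sqrt(2.0)
    nxt[1::2] = (approx - detail) / np.sqrt(2.0)
    approx = nxt
print(approx)       # recovers [4, 6, 9, 5, 1, 3, 4, 4]
```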
Time–frequency analysis, and particularly MRA with wavelets, has found numerous applications in fluid mechanics, especially in turbulence modeling and simulation (see e.g., Meneveau 1991, Farge 1992, Schneider & Vasilyev 2010). In this section, simple examples of application of the concepts outlined earlier are proposed, aimed at direct application to data analysis rather than modeling.
Figure 5.10 Velocity fluctuations for Example 3 (top) and corresponding Fourier spectrum (bottom), normalized by the maximum value.
Figure 5.11 Spectrogram for Example 3, with α = 0.1 s (left) and α = 0.4 s (right). The
spectrograms have been normalized with their corresponding maximum value.
Figure 5.12 Continuous wavelet transform for Example 3. The wavelet coefficients are
reported in absolute value and normalized with respect to the maximum.
The quantitative analysis, with computation of the spectrogram and the scalogram, is
presented in the following example.
Solution
1. The spectrogram, computed with a Gabor kernel with α = 0.1 s and α = 0.4 s, is illustrated in Figure 5.11. Once again, it is evident that in the case of the narrow window (α = 0.1 s) good time localization is achieved at the expense of the frequency resolution. On the other hand, with α = 0.4 s, it is possible to identify some low-frequency features (for example, a peak of spectral energy is observed at t ≈ 2 s and ν ≈ 2 Hz, corresponding to an intense fluctuation that is clearly identifiable in the raw signal, with a period of approximately 0.5 s). This feature was evidently distorted for α = 0.1 s, which is not capable of identifying low-frequency features due to the poor frequency resolution: indeed, the width of the kernel is 0.2 s if measured at ±α, thus all frequencies below ≈ 5 Hz are not detectable. This qualitative consideration should of course be quantitatively supported by computing the impulse response of the selected Gabor kernel.
2. The corresponding scalogram from the CWT is shown in Figure 5.12. This can easily be computed with the command cwt in MATLAB, and equivalents are implemented in libraries for the vast majority of programming languages. In MATLAB, the Morse wavelet is set by default. At low frequencies, it can be observed that there is good frequency resolution but relatively poor time localization. For low frequencies, indeed, wavelets with large support are used. Moving along the spectrum of scales, we clearly see that the temporal localization progressively improves, at the expense of frequency resolution. This is in line with the conceptual sketch of the wavelet transform reported in Figure 5.7. Since the signal is of finite length, a cone of influence bounds the region where edge effects become relevant, i.e., a region where the scaled wavelet partially extends outside of the time domain of the raw signal.
Interestingly enough, a visual comparison of Figures 5.11 and 5.12 exemplifies the multi-resolution capabilities of the CWT. In the low-frequency range, the CWT scalogram and the spectrogram with α = 0.4 s share significant similarities; for frequencies in the range 5–20 Hz, there is more agreement with the spectrogram with α = 0.1 s. This comparison highlights the potential of the CWT over the WFT: using a scaled version of the kernel function at different frequencies, it is possible to carry out a tunable MRA of the signal, depending on the scale to be observed.
The solution is provided in the supplementary material in the form of MATLAB code; a Python sketch of both computations is given after this solution.
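The sketch below uses SciPy for the Gaussian-window spectrogram and PyWavelets for the scalogram, with a random placeholder signal and an assumed sampling rate of 1 kHz standing in for the data of Example 3:

```python
import numpy as np
import pywt
from scipy import signal

fs = 1000.0                              # assumed sampling rate [Hz]
t = np.arange(0.0, 4.0, 1.0 / fs)        # assumed 4 s record, cf. Figure 5.10
u = np.random.randn(t.size)              # placeholder for the velocity signal

# Spectrogram with a Gaussian (Gabor) window: alpha is the window
# half-width in seconds, std the same quantity in samples.
for alpha in (0.1, 0.4):
    std = alpha * fs
    nper = int(6 * std)
    f, tau, Sxx = signal.spectrogram(u, fs=fs, window=('gaussian', std),
                                     nperseg=nper, noverlap=nper - 50)
    # Sxx / Sxx.max() is the normalized spectrogram of Figure 5.11

# Scalogram via the CWT with a Morlet wavelet, cf. Figure 5.12
freqs_hz = np.linspace(1.0, 100.0, 200)            # frequencies of interest
scales = pywt.central_frequency('morl') * fs / freqs_hz
coefs, freqs = pywt.cwt(u, scales, 'morl', sampling_period=1.0 / fs)
# abs(coefs) / abs(coefs).max() is the normalized scalogram
```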
Figure 5.13 Representation of the two-level DWT coefficients with the Haar wavelet. The coefficients are arranged in a 500 × 1000 matrix, i.e., in the same format as the input data.
the set of wavelet coefficients; nonetheless, the coefficients of the signal typically decrease rapidly in magnitude along the spectrum of scales, thus paving the way for data compression and filtering by retaining only the most important ones. One of the advantages of using wavelets for this purpose is their capability to preserve edges, which is one of the most desired features in image compression.
In this example, DWT is used to analyze a flow field and synthesize it after a
thresholding procedure to remove the noise.
The test case is the flow in the wake of three cylinders with diameter D, arranged with their axes on the vertices of an equilateral triangle with side length equal to 3D/2.
The downstream side of the triangle is orthogonal to the freestream flow, and it is
located at x = 0, centered on the y-axis. This configuration is known as fluidic pinball
(Deng et al. 2020), and is a very interesting test case for flow control applications
(Raibaudo et al. 2020).
An instantaneous flow field from DNS data at Re = 130 (referred to as a chaotic
regime (Deng et al. 2020)) is considered as a reference signal for the following
example. For simplicity, only the streamwise component of the velocity field is
analyzed.
Consider the instantaneous flow field provided in the file pinball 000001.mat,
available in the supplementary material. The file contains the matrices U and
V, being respectively the streamwise and crosswise velocity fields on a 500 ×
1000 points grid (obtained after interpolation of the data from the original
DNS grid). The grid is stored in the file GridStruc.mat in the form of matrices
X and Y.
1. Superpose on the streamwise velocity field an additive Gaussian noise, with standard deviation equal to 0.05U∞ (with U∞ being the freestream velocity).

Figure 5.14 Comparison of the reference field of a streamwise velocity component, a field contaminated with noise, and fields filtered via DWT and FFT.
Solution
1. The wavelet coefficients can be easily computed using standard packages
for wavelet calculations, available for a wide variety of programming lan-
guages. The solution presented here uses the MATLAB functions wavedec2,
appcoef2, and detcoef2, which compute, respectively, the full set of coeffi-
cients, the approximation, and the detail coefficients. The approximation
and detail coefficients have been scaled to 8-bit level representation, and
arranged in the same image for visualization purposes. The coefficients of
the 2-level decomposition of the perturbed streamwise velocity field are
reported in Figure 5.13. The image is divided into quadrants: the top-right and bottom-left are the coefficients representative of the relevant horizontal and vertical edges, respectively, at the first level; the bottom-right includes the coefficients of the relevant scales on the diagonal direction at level 1; and the top-left quadrant contains, arranged in the same fashion, the approximation and detail coefficients of the second level.
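The steps above translate directly to Python; the following is a hedged sketch with PyWavelets standing in for wavedec2 and its companions (the placeholder field and the threshold value are assumptions, not the book's choices):

```python
import numpy as np
import pywt

# Placeholder for the 500 x 1000 noisy streamwise velocity field
# (the book loads pinball_000001.mat instead).
U_noisy = np.random.randn(500, 1000)

# 2-level DWT: coeffs = [cA2, (cH2, cV2, cD2), (cH1, cV1, cD1)]
coeffs = pywt.wavedec2(U_noisy, 'haar', level=2)

# Assumed threshold: a fraction of the largest level-1 diagonal detail.
thr = 0.1 * np.abs(coeffs[-1][-1]).max()

# Soft-threshold every detail sub-band; keep the approximation intact.
den = [coeffs[0]] + [tuple(pywt.threshold(d, thr, mode='soft')
                           for d in level) for level in coeffs[1:]]
U_filtered = pywt.waverec2(den, 'haar')
```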
5.5 Conclusions
In this chapter, the fundamentals of time–frequency analysis have been reviewed. The
simple, yet powerful, concepts of scaling and shifting of windows to localize in time
the frequency content of the signal have been introduced in the framework of the WFT.
The consequences of Heisenberg’s uncertainty principle on the localization precision
of features in the time and frequency domains have been outlined.
The MRA with wavelets allows storing approximations of the signal at differ-
ent scales, thus providing a tunable time–frequency resolution. In this chapter, an
overview of the basic concepts behind the CWT and the DWT has been outlined,
together with the description of a procedure to determine a suitable family of
orthogonal wavelets. Some hints for the implementation of the DWT using two-
channel multirate filter banks have also been included.
The multi-resolution capabilities of wavelets are exploited in data compression
(e.g., the JPEG2000 algorithm is based on wavelets), denoising, speech recognition,
finance, and several areas of physics (Van den Berg 2004). In general, all signals
arising from multiscale processes lend themselves to effective use of wavelets. Turbulence
is clearly part of this paradigm. In the first age of wavelet use in turbulence, the
main applications were mostly (but not exclusively) targeted to turbulence model-
ing, efficient methods for computational fluid mechanics, and coherent structures
segmentation. In the coming years, it is foreseeable that the interest in multiscale
time–frequency methods will be further fostered by the current trend toward modal
decompositions blending energy optimality and spectral analysis (see e.g., Towne et al.
2018, Mendez, Balabane & Buchlin 2019, Floryan & Graham 2021).
6 The Proper Orthogonal
Decomposition
S. Dawson
The proper orthogonal decomposition (POD) is one of the most ubiquitous data
analysis and modeling techniques in fluid mechanics. Since many of the properties of
the POD are inherited from the singular value decomposition (SVD), we start with a
discussion of the SVD and describe those of its properties that are particularly useful
for understanding the POD. Our discussion of the POD starts by characterizing the
POD as a decomposition that is specific to a given data set before discussing how the
same concept arises when considering dynamical systems that are continuous in space
and time. We will describe several variants of the POD that have emerged over the
last half a century and how they are related, such as spectral and space-only POD. As
well as giving a broad overview, we will discuss some of the technical details often
omitted and/or taken for granted when POD is applied in practice, such as how to
incorporate nonstandard inner product weights. We finish by using a simple example
to demonstrate properties and methods of implementation for the POD.
6.1 Introduction
The POD is one of the most widely used techniques for data analysis in fluid
mechanics. Over the past 50 years, its usage in the community has grown and evolved
alongside developments in experimental measurement techniques, the introduction
and rapid development of computational methods for simulating fluid flows, theoreti-
cal developments in dynamical systems, and the ability to store and process increasing
quantities of data feasibly. We now discuss the broad motivation for POD and related
techniques. Suppose we have data that is a function of both space and time y(x, t).
A logical first step toward understanding the salient features of the data, and perhaps
the dynamics of the underlying system responsible for its generation, is to perform a
separation of variables:
\[
y(x, t) = \sum_{j=1}^{m} \phi_j(x)\, a_j(t). \tag{6.1}
\]
That is, we decompose the data into a sum of spatial modes φ j (x), and their time-
varying coefficients (or equivalently, temporal modes) a j (t). There are many choices
for such a decomposition. For example, one might choose to perform a Fourier
transform in space or time, thus obtaining a Fourier basis. Rather than using a
predefined basis, POD chooses a decomposition based on the data itself, though we
will see later that in certain cases such data-driven modes can actually converge to
Fourier modes. Note that we will also see that this decomposition can be further
generalized to allow for time-dependency in the spatial structures.
Since many of the properties of POD are inherited from the SVD, we start with a
discussion of the SVD and its properties more generally, before discussing how it can
be utilized to define and compute POD of a given data set.
6.2 The Singular Value Decomposition

In short, POD can be viewed as the result of taking the SVD of a suitably arranged data
matrix. Because of this, and as many of the properties of POD are inherited directly
from those of the SVD, we start with a discussion of the SVD and its properties more
generally, before discussing how it can be utilized to define and compute POD upon a
given data set. This section might seem like a lot of rather dry mathematics, but it will
all end up being useful later in this chapter.
That is, the adjoint is what you get if you move an operator to the other side of an
inner product. As an aside, note that (6.3) can be very useful in its own right: if one
of m or n is much smaller than the other, then it can be substantially easier to evaluate
this inner product in the lower-dimensional space. Note also that this definition of an
adjoint reduces to the conjugate transpose matrix in the case where the inner products are the standard Euclidean inner products.
6.2.2 Properties of the SVD

Optimality
The best rank-r approximation to a matrix X (in a least-squares sense) is obtained by
taking the first r components of the SVD of X. That is, if Xr is a rank-r approximation
of X, we have
\[
\operatorname*{argmin}_{X_r} \| X - X_r \|_F = \Phi_r \Sigma_r \Psi_r^*, \tag{6.5}
\]
where Φr and Ψr are the first r columns of Φ and Ψ in the SVD of X, and Σr is
a diagonal matrix consisting of the first r singular values of X. Here the subscript F
denotes the Frobenius norm, given by
\[
\| X \|_F = \sqrt{\sum_{j=1}^{n} \sum_{k=1}^{m} X_{jk}^2} = \sqrt{\sum_{j=1}^{\min(n,m)} \sigma_j^2}, \tag{6.6}
\]
which is the same as the Euclidean norm if we were to squeeze the entries of the
matrix X into a single long vector. As will be seen later, this property is responsible
for the “optimality” of the POD. We also observe from the right equality in (6.6) a direct connection between the Frobenius norm and the singular values of a
matrix. Intuitively, this connection exists because the singular values contain all of
the information about the size of the components in the matrix X, with the Φ and
Ψ matrices containing orthonormal vectors. Looking again at the quantity that is
minimized on the left-hand side of (6.5), this connection between the Frobenius norm
and the SVD also leads to
\[
\min_{X_r} \| X - X_r \|_F = \sqrt{\sum_{j=r+1}^{\min(n,m)} \sigma_j^2}. \tag{6.7}
\]
As an additional aside, note that we could also consider the L2 operator norm of a
matrix, defined by
\[
\| X \|_2 = \max_{\|\psi\|_2 = 1} \| X \psi \|_2 = \sigma_1, \tag{6.8}
\]
with the second equality arising because the unit vector ψ that maximizes k Xψk2 is
the first right singular vector of X, ψ 1 . Moreover, note that (6.5) also holds when the
Frobenius norm is replaced by the operator L2 norm. In this case, the equivalent of
(6.7) is
\[
\min_{X_r} \| X - X_r \|_2 = \sigma_{r+1}. \tag{6.9}
\]
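The optimality statements (6.7) and (6.9) are easy to verify numerically; a minimal NumPy check on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 40))

Phi, s, PsiH = np.linalg.svd(X, full_matrices=False)
r = 5
Xr = (Phi[:, :r] * s[:r]) @ PsiH[:r, :]      # best rank-r approximation (6.5)

# (6.7): the Frobenius error equals the root-sum-square of the
# discarded singular values
print(np.isclose(np.linalg.norm(X - Xr, 'fro'),
                 np.sqrt(np.sum(s[r:]**2))))          # True

# (6.9): the operator 2-norm error equals sigma_{r+1}
print(np.isclose(np.linalg.norm(X - Xr, 2), s[r]))    # True
```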
\[
X^* \Phi = \Psi \Sigma \Phi^* \Phi = \Psi \Sigma.
\]
Relationship to eigendecompositions
The left and right singular vectors are eigenvectors of the matrices X X ∗ and X ∗ X
respectively, with
\[
X X^* \phi_j = \sigma_j^2\, \phi_j, \tag{6.12}
\]
\[
X^* X \psi_j = \sigma_j^2\, \psi_j. \tag{6.13}
\]
These relationships readily follow from (6.10) and (6.11). Note also that the correlation matrices X X∗ and X∗X are both Hermitian, since (X X∗)∗ = X X∗ and (X∗X)∗ = X∗X.
\[
X Q = \Phi \Sigma (Q^* \Psi)^*. \tag{6.15}
\]
Dyadic expansion
The SVD can be expanded as a sum of rank-1 matrices via
\[
X = \Phi \Sigma \Psi^* = \sum_{j=1}^{q} \sigma_j\, \phi_j \psi_j^*, \tag{6.16}
\]
Weighted inner products

Suppose now that the inner products on the column and row spaces of X carry weight matrices, i.e.,
\[
\langle y_i, y_j \rangle_n = y_j^\dagger W_n\, y_i, \qquad \langle z_i, z_j \rangle_m = z_j^\dagger W_m\, z_i, \tag{6.17}
\]
where as before z_j and y_j are rows and columns of the matrix, respectively. In order for these to satisfy the definition of an inner product, the weight matrices W_m and W_n must be positive definite. Note that the adjoint of X with these inner products is given by
\[
X^* = W_m^{-1} X^\dagger W_n, \tag{6.18}
\]
since
\[
\langle y, Xz \rangle_n = z^\dagger X^\dagger W_n\, y = z^\dagger W_m\, W_m^{-1} X^\dagger W_n\, y = \left\langle (W_m^{-1} X^\dagger W_n)\, y,\; z \right\rangle_m =: \langle X^* y, z \rangle_m.
\]
Wn and Wm are often positive diagonal matrices, where the diagonal entries contain
integration weights. Henceforth, we assume that this is the case, which makes the
definition of their square roots W_m^{1/2} and W_n^{1/2} unambiguous. The SVD of X with these inner products can be computed as follows: form the weighted matrix X_w = W_n^{1/2} X W_m^{-1/2}, compute its standard SVD, X_w = Φ_w Σ Ψ_w^†, and recover the singular vectors as Φ = W_n^{-1/2} Φ_w and Ψ = W_m^{-1/2} Ψ_w.
With this computation, it is easy to verify that the identified singular vectors are orthonormal with respect to their respective inner products (Φ†W_nΦ = I, Ψ†W_mΨ = I), and that they satisfy the eigenvalue problems discussed in Section 6.2.2 for the general inner products and adjoints:
\[
X_w X_w^\dagger\, (\phi_w)_j = \sigma_j^2\, (\phi_w)_j,
\]
\[
(W_n^{1/2} X W_m^{-1/2})(W_m^{-1/2} X^\dagger W_n^{1/2})(W_n^{1/2} \phi_j) = \sigma_j^2\, W_n^{1/2} \phi_j,
\]
\[
X (W_m^{-1} X^\dagger W_n)\, \phi_j = \sigma_j^2\, \phi_j,
\]
\[
X X^* \phi_j = \sigma_j^2\, \phi_j,
\]
and similarly
\[
X_w^\dagger X_w\, (\psi_w)_j = \sigma_j^2\, (\psi_w)_j,
\]
\[
(W_m^{-1/2} X^\dagger W_n^{1/2})(W_n^{1/2} X W_m^{-1/2})(W_m^{1/2} \psi_j) = \sigma_j^2\, W_m^{1/2} \psi_j,
\]
\[
(W_m^{-1} X^\dagger W_n)\, X \psi_j = \sigma_j^2\, \psi_j,
\]
\[
X^* X \psi_j = \sigma_j^2\, \psi_j,
\]
where (φ_w)_j and (ψ_w)_j are the jth columns of Φ_w and Ψ_w, respectively.
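A minimal NumPy sketch of this weighted SVD (assuming, as above, positive diagonal weight matrices), together with a check of the weighted orthonormality Φ†WnΦ = I:

```python
import numpy as np

def weighted_svd(X, Wn, Wm):
    """SVD of X under weighted inner products, following the recipe above:
    transform with the square-root weights, take a standard SVD, and map
    the singular vectors back. Wn and Wm are positive diagonal matrices."""
    wn = np.sqrt(np.diag(Wn))                    # diagonal of Wn^{1/2}
    wm = np.sqrt(np.diag(Wm))
    Xw = wn[:, None] * X / wm[None, :]           # Wn^{1/2} X Wm^{-1/2}
    Phi_w, s, Psi_wH = np.linalg.svd(Xw, full_matrices=False)
    Phi = Phi_w / wn[:, None]                    # Wn^{-1/2} Phi_w
    Psi = Psi_wH.conj().T / wm[:, None]          # Wm^{-1/2} Psi_w
    return Phi, s, Psi

n, m = 50, 20
X = np.random.randn(n, m)
Wn = np.diag(np.random.rand(n) + 0.5)            # e.g. quadrature weights
Wm = np.diag(np.full(m, 1.0 / m))
Phi, s, Psi = weighted_svd(X, Wn, Wm)
print(np.allclose(Phi.conj().T @ Wn @ Phi, np.eye(m)))   # True
```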
Suppose that we measure data that is a function of both space and time, and assemble
all data into a matrix, such that each column of the matrix represents a “snapshot” of
all data that is measured at a given instance in time, while each row consists of all
of the measurements of a given quantity across all times. For example, if we measure
two components of the velocity of a fluid (u and v) at spatial locations x_1, x_2, …, x_{n_x}, and at times t_1, t_2, …, t_m, then the data matrix becomes the n × m matrix
\[
X = \begin{bmatrix}
u(x_1, t_1) & u(x_1, t_2) & \cdots & u(x_1, t_m) \\
u(x_2, t_1) & u(x_2, t_2) & \cdots & u(x_2, t_m) \\
\vdots & \vdots & \ddots & \vdots \\
u(x_{n_x}, t_1) & u(x_{n_x}, t_2) & \cdots & u(x_{n_x}, t_m) \\
v(x_1, t_1) & v(x_1, t_2) & \cdots & v(x_1, t_m) \\
v(x_2, t_1) & v(x_2, t_2) & \cdots & v(x_2, t_m) \\
\vdots & \vdots & \ddots & \vdots \\
v(x_{n_x}, t_1) & v(x_{n_x}, t_2) & \cdots & v(x_{n_x}, t_m)
\end{bmatrix}, \tag{6.19}
\]
where n = 2nx , since there are two measurements for each spatial location. Note
that we are not making any assumptions concerning the spatial or temporal resolution
of our data, or in what quantity or quantities are being measured. The only implicit
assumption that we are making in forming X is that we measure the same quantities
for each snapshot. We next subtract from each column of X a base condition y0 . Most
typically, this is the mean of all of the columns in X, though it need not always be.
If X represents the transient response of a system to a given input, it would make sense
to consider the steady or average state as t → ∞ instead of the mean of the early-time
data. For notational convenience, we will refer to this base state that is subtracted from
the data as the “mean” in any case. Denoting the kth column of X by yk (x), we now
form the mean-subtracted matrix
\[
Y = \left[\, y_1' \;\; y_2' \;\; \cdots \;\; y_m' \,\right], \tag{6.20}
\]
where y_j′ = y_j − y_0 is the jth mean-subtracted snapshot. Most simply, the POD can be
found from taking the SVD of Y . In particular, if we have
Y = ΦΣΨ∗, (6.21)
then POD modes are given by the columns φ j (x) of Φ, which evolve in time via
the corresponding coefficients a j (t) = σj ψ j (t)∗ (and for the case of real data with
a standard inner product, ψ j (t)∗ = ψ j (t)T ). That is, with reference to (6.1), a POD
expansion of the data is given by the dyadic expansion of the SVD as described in
Section 6.2.2:
\[
y_k(x) = \sum_{j=1}^{q} \phi_j(x)\, \sigma_j\, \psi_j^*(t_k). \tag{6.22}
\]
Since we have arranged our data such that spatial location depends only on the row
number, and temporal location depends only on the column, the left and right singular
vectors depend only on space and time, respectively, allowing this direct use of the
SVD to obtain this separation of variables in our data. The optimality property of the
SVD described in Section 6.2.2 means that a truncation of this sum to r terms gives
the closest rank-r approximation to the full data set.
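In code, this direct route to the POD is only a few lines; the following NumPy sketch (with placeholder data standing in for real snapshots) subtracts the column mean, takes the SVD, and forms a truncated reconstruction as in (6.22):

```python
import numpy as np

# Placeholder n x m data matrix (columns are snapshots)
n, m = 2000, 200
X = np.random.randn(n, m)

y0 = X.mean(axis=1, keepdims=True)       # base state: mean of the columns
Y = X - y0                               # mean-subtracted matrix, cf. (6.20)

Phi, sig, PsiH = np.linalg.svd(Y, full_matrices=False)
A = sig[:, None] * PsiH                  # coefficients a_j(t) = sigma_j psi_j(t)*

r = 10
Y_r = Phi[:, :r] @ A[:r, :]              # optimal rank-r approximation, cf. (6.22)
```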
When assembling the data matrix in (6.19), we chose to place the measurements of
the v-component of velocity below all of the measurements of u, though we could have
also arranged the data in other ways, for example, with alternating rows containing the
u and v components of velocity. From the permutation properties of the SVD discussed
in Section 6.2.2, changing the arrangement does not affect the decomposition, aside
from permuting the entries in the POD modes to the appropriate locations. Similarly,
we can rearrange the columns of the data matrix without affecting the POD modes
at all. Indeed, this fact can be used to demonstrate that POD does not require time-
resolved data, given that shuffling the order of time-resolved snapshots can give a
non-time-resolved sequence of snapshots.
From Section 6.2.2, the right singular vectors of Y are eigenvectors of the matrix Y∗Y. This m × m matrix can be interpreted as the correlation matrix between all snapshots:
\[
Y^* Y = \begin{bmatrix}
\langle y_1', y_1' \rangle_x & \langle y_2', y_1' \rangle_x & \cdots & \langle y_m', y_1' \rangle_x \\
\langle y_1', y_2' \rangle_x & \langle y_2', y_2' \rangle_x & \cdots & \langle y_m', y_2' \rangle_x \\
\vdots & \vdots & \ddots & \vdots \\
\langle y_1', y_m' \rangle_x & \langle y_2', y_m' \rangle_x & \cdots & \langle y_m', y_m' \rangle_x
\end{bmatrix}, \tag{6.23}
\]
where ⟨·, ·⟩_x denotes a spatial inner product. Similarly, if we let z_j′(t) be a row vector that is the jth row of Y, we have
\[
Y Y^* = \begin{bmatrix}
\langle z_1', z_1' \rangle_t & \langle z_1', z_2' \rangle_t & \cdots & \langle z_1', z_n' \rangle_t \\
\langle z_2', z_1' \rangle_t & \langle z_2', z_2' \rangle_t & \cdots & \langle z_2', z_n' \rangle_t \\
\vdots & \vdots & \ddots & \vdots \\
\langle z_n', z_1' \rangle_t & \langle z_n', z_2' \rangle_t & \cdots & \langle z_n', z_n' \rangle_t
\end{bmatrix}, \tag{6.24}
\]
where ⟨·, ·⟩_t is an inner product in the time dimension. Again from Section 6.2.2, the POD modes φ_j are eigenvectors of the matrix YY∗. This n × n matrix can be interpreted as the time-correlation matrix between pairwise rows in Y.
Note that the two matrices YY ∗ and Y ∗ Y are generally of different sizes, with
YY ∗ being n × n and Y ∗ Y being m × m. Since the SVD of Y is related to the
eigendecompositions of these square matrices, it might be easiest to compute and work
with the smaller of these two matrices. For example, if one had very large snapshots but comparatively few of them, then m ≪ n, and so it can be more tractable to compute the (full or partial) eigendecomposition of Y∗Y to obtain the POD coefficients a_j(t).
The POD modes can then be computed using φ j = σj−1Y a j , following the properties
of the SVD in Section 6.2.2. This is precisely the “method of snapshots” for computing
POD introduced in Sirovich (1987). Conversely, if n ≪ m, then one can instead start by computing an eigendecomposition of YY∗.
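A sketch of the method of snapshots, again on placeholder data, with a consistency check against the direct SVD (here the spatial modes are recovered from the temporal eigenvectors ψ_j):

```python
import numpy as np

n, m = 5000, 100                     # n >> m: large snapshots, few of them
Y = np.random.randn(n, m)            # placeholder mean-subtracted data

# Eigendecomposition of the small m x m snapshot-correlation matrix
lam, Psi = np.linalg.eigh(Y.conj().T @ Y)
order = np.argsort(lam)[::-1]        # sort eigenvalues descending
lam, Psi = lam[order], Psi[:, order]
sigma = np.sqrt(np.clip(lam, 0.0, None))

Phi = (Y @ Psi) / sigma              # spatial modes from temporal eigenvectors

# Consistency check against the direct SVD of Y
_, s_svd, _ = np.linalg.svd(Y, full_matrices=False)
print(np.allclose(sigma, s_svd))     # True
```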
Up until this point, we have viewed the POD as being something that is computed
from a given data set. In reality, however, if our data comes from measurements of
a given system, then the POD should ideally be independent of the exact data that
we have collected and used for analysis. That is to say, we can alternatively think of
POD as being a property of a system, which we hope to accurately compute from data.
Looking back at the inner products of discrete data that feature in (6.23) and (6.24),
these can be viewed as discrete approximations to the integrals
\[
\left\langle y_j', y_k' \right\rangle_x = \int_{\Omega_x} y'(x, t_j)\, y'(x, t_k)\, dx, \tag{6.25}
\]
\[
\left\langle z_j', z_k' \right\rangle_t = \int_{\Omega_t} z'(x_j, t)\, z'(x_k, t)\, dt, \tag{6.26}
\]
where the first equation is the spatial integral over the appropriate number of spatial
dimensions, and Ωx and Ωt define the spatial and temporal domains of interest.
Note that a desire to correctly approximate these integrals can lead to the need to include integration weights for discrete data as described in Section 6.2.2.³

³ To be more general, we also could have included an additional weight function in these integrals, but we omit this for simplicity.

Note that, in this continuous setting, the object of interest is the two-point spatial correlation
\[
C(x, x') = E\left[\, y'(x, t)\, y'(x', t)\, \right], \tag{6.27}
\]
where E is the expectation, taken over time and/or ensembles. The POD in a
continuous setting amounts to finding modes φ j (x) that are eigenfunctions of the
integral equation
\[
\int_{\Omega_x} C(x, x')\, \phi_j(x')\, dx' = \sigma_j^2\, \phi_j(x). \tag{6.28}
\]
As is perhaps indicated by the notation used, the eigenvalues σj2 are real and positive,
and this equation is the continuous equivalent to (6.12). Note that it can be shown that
many of the same properties of the POD discussed in the discrete, finite-dimensional
context, such as orthogonality and optimality, also extend to equivalent concepts in
the continuous setting. In cases where the system is spatially homogenous (i.e., is
invariant under translations), C(x, x 0) = C(x − x 0) is only a function of x − x 0, and it
can readily be shown from (6.28) that the POD modes become Fourier modes. To see
this, if we change the variable of integration to r = x − x 0, (6.28) can be expressed as
φ j (x + r)
∫
C(r) dr = σj2 . (6.29)
Ωr φ j (x)
In the space–time setting, we instead consider the two-point space–time correlation
\[
C(x, x', t, t') = E\left[\, y'(x, t)\, y'(x', t')\, \right], \tag{6.30}
\]
where now the expectation can be thought of as an ensemble average across different realizations of a stochastic field for all pairs of space-time coordinates (x, t) and (x′, t′). An equivalent to (6.28) in this space-time setting is then given by
\[
\int_{\Omega_x, \Omega_t} C(x, x', t, t')\, \tilde{\phi}_j(x', t')\, dx'\, dt' = \sigma_j^2\, \tilde{\phi}_j(x, t). \tag{6.31}
\]
Note in particular that the eigenfunctions of this equation are now functions of
both space and time, which we distinguish from the space-only modes that we have
dealt with so far using a tilde. Most typically, systems modeled using this approach are stationary in time, meaning that C(x, x′, t, t′) = C(x, x′, t − t′) is only a function of the difference in time τ = t − t′. Similar to the case of spatial homogeneity, this means that the temporal dependence of these modes becomes that of Fourier modes. This allows for a temporal Fourier transform of (6.31), giving
\[
\int_{\Omega_x} S(x, x', f)\, \hat{\phi}_j(x', f)\, dx' = \sigma_j(f)^2\, \hat{\phi}_j(x, f), \tag{6.32}
\]
where the cross-spectral density tensor S(x, x 0, f ) is obtained from the temporal
Fourier transforms:
\[
S(x, x', f) = \int_{-\infty}^{\infty} C(x, x', \tau)\, \exp(-i 2\pi f \tau)\, d\tau. \tag{6.33}
\]
This space-time or spectral POD was, in fact, the original formulation of POD as
described in Lumley (1967) and Lumley (1970). However, the “space-only” POD
has been much more commonly used since then, likely due to two main reasons:
it is easy to formulate and compute from data, and it is readily amenable for the
construction of projection-based reduced-order models (e.g., Holmes et al. 2012), as
will be discussed in Section 6.5. Indeed, the difference between the original and most
commonly used formulations has perhaps been underappreciated in the community (including by the author). Recently, the work of Towne et al. (2018) has brought the “spectral” version of POD back to prominence, through an exposition of the method and its close connections to other modeling approaches, and through the development of practical algorithms for its implementation (Schmidt & Towne 2019).
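As an illustration of the idea (and not of the algorithm of Schmidt & Towne (2019)), a strongly simplified Welch-type SPOD estimate can be sketched as follows; windowing, block weights, and spatial integration weights are deliberately omitted:

```python
import numpy as np

def spod(Y, nfft=128, novlp=64):
    """Strongly simplified spectral POD: split the snapshot series into
    overlapping blocks, Fourier transform each block in time, then at
    every frequency take an SVD across blocks (a Welch-type estimate of
    (6.32)). Y is n x m with columns equispaced in time."""
    n, m = Y.shape
    starts = range(0, m - nfft + 1, nfft - novlp)
    blocks = np.stack([np.fft.fft(Y[:, s:s + nfft], axis=1) for s in starts])
    modes, energies = [], []
    for k in range(nfft):                 # loop over frequency bins
        Qk = blocks[:, :, k].T            # n x n_blocks data at frequency k
        U, sv, _ = np.linalg.svd(Qk, full_matrices=False)
        modes.append(U)                   # SPOD modes phi_hat_j(x, f_k)
        energies.append(sv**2)            # modal energies at f_k
    return modes, energies
```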
Perhaps the most common usage of the POD involves using a truncated basis of POD
modes to obtain a subspace that differential equations describing the physics of a
system may be projected onto. For example, if we consider the incompressible Navier–
Stokes momentum equation
\[
\frac{\partial u}{\partial t} = -(u \cdot \nabla)u + Re^{-1} \Delta u - \nabla p, \tag{6.34}
\]
then taking the spatial inner product of both sides of this equation with a POD mode
φ j (x) gives
\[
\left\langle \frac{\partial u}{\partial t}, \phi_j \right\rangle_x = -\left\langle (u \cdot \nabla)u, \phi_j \right\rangle_x + Re^{-1} \left\langle \Delta u, \phi_j \right\rangle_x - \left\langle \nabla p, \phi_j \right\rangle_x. \tag{6.35}
\]
Approximating the full velocity field with a set of POD modes (as in (6.1)), we obtain
\[
\dot{a} = L\, a + Q(a, a) + f, \tag{6.36}
\]
There are many modifications and extensions of the basic POD definition and
algorithm that can improve on its suitability, accuracy, and efficiency in various
applications. If being used for reduced-order modeling as described in Section 6.5,
a subspace based on POD modes is optimal from an energetic sense, but is not
necessarily the best choice for retaining all features of dynamic importance. For the
case of linear systems, one can seek to define the subspace and direction of projection
such that they best preserve the system dynamics (quantified by the observability and
controllability Gramians). This is known as balanced truncation (Moore 1981), and it
was shown by Rowley (2005) that these subspaces can be identified from POD using
inner products that are weighted by the observability and controllability Gramians (see
also the similar method developed in Willcox and Peraire (2002)). Further discussion
of these concepts is given in Chapter 10.
Several other variants of POD account for other desired qualities in spatio-
temporal decompositions, such as finding a balance between energetic optimality and
frequency localization (Sieber et al. 2016), and seeking scale separation (Mendez
et al. 2019). Additional variants aim to reparameterize the spatio-temporal domain
with translations, rotations, and/or rescalings to make certain data/systems (often
those dominated by convection) more amenable to low-dimensional approximation
(Rowley & Marsden 2000, Rowley et al. 2003, Mojgani & Balajewicz 2017, Mowlavi
& Sapsis 2018, Reiss et al. 2018, Black et al. 2019, Mendible et al. 2020).
Aside from these variants, computation of POD can require overcoming several
practical challenges involving the acquisition and processing of data of sufficient
quantity and quality to ensure converged results. For further discussion of such
practical considerations for experimental data in particular, see Chapter 9.
6.7 Examples
This section will present two simple examples to show how the POD can be applied,
and its outputs interpreted. Both of these examples come with accompanying Python
code, available on the book’s website.⁴ The example considered here is similar to an

⁴ www.datadrivenfluidmechanics.com/download/book/chapter6.zip
Figure 6.1 Surface plot of data from (6.38).
Figure 6.2 Singular values versus mode index for the data from (6.38).
example considered in Tu et al. (2014). We consider data that is sampled from the function
\[
y(x, t) = c_1 \exp\!\left[ -\frac{(x - x_1)^2}{2 s_1^2} \right] \sin(2\pi f_1 t) + c_2 \exp\!\left[ -\frac{(x - x_2)^2}{2 s_2^2} \right] \sin(2\pi f_2 t), \tag{6.38}
\]
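A minimal sketch that generates data from (6.38) and computes its POD; all parameter values below are illustrative assumptions, since the values used in the accompanying script are not reproduced here:

```python
import numpy as np

# Assumed parameters: two Gaussian bumps oscillating at f1 and f2.
x = np.linspace(-2.0, 2.0, 400)
t = np.linspace(0.0, 4.0, 800)
c1, c2 = 1.0, 1.0
x1, x2, s1, s2 = -0.5, 0.5, 0.6, 0.6
f1, f2 = 1.3, 4.1

X, T = np.meshgrid(x, t, indexing='ij')          # rows: space, columns: time
Y = (c1 * np.exp(-(X - x1)**2 / (2 * s1**2)) * np.sin(2 * np.pi * f1 * T)
     + c2 * np.exp(-(X - x2)**2 / (2 * s2**2)) * np.sin(2 * np.pi * f2 * T))

Phi, s, PsiH = np.linalg.svd(Y - Y.mean(axis=1, keepdims=True),
                             full_matrices=False)
print(s[:4] / s[0])   # only two singular values are significant
```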
Figure 6.3 POD mode shapes computed from data from (6.38).
Figure 6.4 POD mode coefficients computed from data from (6.38).
However, the leading two POD modes do not cleanly partition the two component
functions, and instead each mode is a mix of the two. This is due to the fact that the
SVD is “greedy” in the sense that the first set of singular vectors captures as much
energy as possible, which in this case means taking part of each of the two terms in
(6.38).
In Figure 6.5, we show the results of applying spectral POD to the same data set,
which we see is able to distinguish between the two spatial functions corresponding to
different temporal frequencies, both of which show up as peaks in the corresponding
spectrum, shown in Figure 6.6. Note that here a simple Fourier transform in time
would also suffice, since the data is entirely deterministic and free of noise. Here we
could also separate out the frequency content using dynamic mode decomposition on
appropriately arranged data.
The next example that we consider involves data from a fluid simulation, consid-
ering two-dimensional flow over a circular cylinder at a Reynolds number of 60.
Figure 6.5 Spectral POD mode shapes computed from data from (6.38).
Figure 6.6 Spectral POD spectrum for the data from (6.38).
Figure 6.7 POD mode shapes (streamwise velocity component u) for Modes 1 and 2 of the cylinder flow.
This phase difference between both the streamwise-oscillating structures in space and
coefficients in time is approximately π/2. This can be reasoned from the fact that two
purely oscillatory signals at the same frequency will only be orthogonal if they are
exactly a quarter-cycle out of phase. These mode shapes, which can be thought of
as standing waves, allow for the representation of structures that convect downstream
over time.
Lower-energy modes (not shown) include modes that oscillate at harmonic frequen-
cies of these leading modes, as well as modes that capture the distortion of the base
flow as the system moves between the unstable equilibrium and limit cycle of vortex
shedding (as discussed in Chapter 1). Note that had we considered data for a different
section of the transient (or on the limit cycle), these leading two POD modes would
be slightly different, though would share the same qualitative features. In other words,
on this transient trajectory, the system is not statistically stationary, so the POD that
we obtain is a function of the data set that we use.
Figure 6.8 POD mode coefficients for Modes 1 and 2 of the cylinder flow.
7 The Dynamic Mode
Decomposition: From Koopman
Theory to Applications
P. J. Schmid
7.1 Introduction
precise meaning of “coherence” has to be defined anew, and often has to be tailored to
the specific circumstances of the data and/or flow configuration.
The data-centric approach carries both advantages and disadvantages. It is simpler
to justify, particularly in an experimental setting, since the gathered data – by
definition – is observable; model-based computational approaches, on the other hand,
can suffer from robustness issues. Still, even in data-driven analysis, we can encounter
hidden (and sometimes high-dimensional) variables that are essential from a physical
point of view, but are difficult to measure or absent in our data. Data are often
noisy, which can pose challenges to the subsequent analysis (see also Chapter 13).
The uncertainty in the data has to be taken into account and balanced against further approximations in the decomposition. In particular, the uncertainty in the
output variables has to be determined and accounted for in the final interpretation
of the results.
Many common techniques in describing processes, based on the data they produce,
are based on spectral analysis (in the most general sense of the word). This involves a
transformation into another basis, during which some of the independent variables
(time, space) will be replaced by their dual equivalents (frequency, wavenumber).
Ideally, this transformation introduces a sparse, localized, and hierarchical description
of the physical process (contained in the data sequence) in the new basis – and allows
a more concentrated representation of a potentially complex process.
Among various decompositions, modal decompositions (in a general sense) are
most common (Lumley 1970, Berkooz et al. 1993, Holmes et al. 2012, Taira
et al. 2017). They are the data-driven equivalent of a separation-of-variables approach:
we factorize the data into purely spatial components (the modes), purely temporal
components (the dynamics), and the associated amplitudes (the spectrum); see also
Chapter 8 for additional material on linear decompositions. This procedure is predi-
cated on a linear assumption: we can recover the original signals by a superposition
of (some of) the identified mode–dynamics–amplitude triplets. For this hierarchical
breakup and reassembly, a great many decompositions are available; for nonlinear
systems, or a nonlinear analysis, the options are far more limited.
A promising way forward seeks to postulate and design a coordinate transforma-
tion, under which the nonlinear dynamics reduces to a linear one. This approach
has been studied from a mathematical point of view for many decades, but as an
exact transformation has found only limited applicability to a few nonlinear partial
differential equations within a model-based setting.
Koopman analysis falls within this latter category, as it proposes a nonlinear
coordinate transformation that embeds a nonlinear finite-dimensional dynamical
system into an equivalent linear system, albeit of infinite dimensions. The remainder
of this chapter will motivate data analysis from this point of view (Lasota & Mackey
1994, Mezić 2005, Budišić et al. 2012, Mezić 2013) and present a computational
framework for the decomposition of data sequences generated by a nonlinear system.
The link to Koopman theory is also developed in Chapter 10, and the reader is urged
to consult this material.
Figure 7.1 Schematic of the Koopman idea: the map F advances the states qn → qn+1 → qn+2 in state space, while the operator K advances the observables φn → φn+1 → φn+2 in observable space. We have used the abbreviation φn = φ(qn).
The choice of observables under which our nonlinear dynamical system is trans-
formed into a linear system is the key issue in Koopman analysis. In effect, we
are looking for a coordinate transformation or an embedding φ(q) that renders the
evolution in φ linear. Two examples shall illustrate the main ideas of finding these
Koopman embeddings (Tu et al. 2014, Brunton & Kutz 2019).
The first example (Tu et al. 2014) is given by the two-dimensional nonlinear map
\[
x_{n+1} = a x_n, \tag{7.3a}
\]
\[
y_{n+1} = b\,(y_n - x_n^2). \tag{7.3b}
\]
We choose the observable vector φ_n = (x_n, y_n, x_n²). Simple algebra confirms that a linear mapping K links the observable vector φ_n to its successive realization φ_{n+1} according to
\[
\underbrace{\begin{pmatrix} x_{n+1} \\ y_{n+1} \\ x_{n+1}^2 \end{pmatrix}}_{\varphi_{n+1}}
=
\underbrace{\begin{pmatrix} a & 0 & 0 \\ 0 & b & -b \\ 0 & 0 & a^2 \end{pmatrix}}_{K}
\underbrace{\begin{pmatrix} x_n \\ y_n \\ x_n^2 \end{pmatrix}}_{\varphi_n}. \tag{7.4}
\]
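This lifting is easy to verify numerically: iterating the nonlinear map (7.3) and the linear map (7.4) side by side gives identical trajectories of the observable vector (the parameter values below are arbitrary choices):

```python
import numpy as np

a, b = 0.9, 0.5                      # assumed (arbitrary) map parameters
K = np.array([[a,   0.0,  0.0 ],
              [0.0, b,   -b   ],
              [0.0, 0.0,  a**2]])    # the linear operator of (7.4)

x, y = 1.3, -0.7                     # arbitrary initial condition
phi = np.array([x, y, x**2])         # lifted observable vector

for _ in range(5):
    x, y = a * x, b * (y - x**2)             # nonlinear map (7.3)
    phi = K @ phi                            # linear evolution (7.4)
    assert np.allclose(phi, [x, y, x**2])    # trajectories coincide exactly
```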
where ⊗ denotes the Kronecker product, I stands for the n × n identity matrix, and the
matrix C represents the (discrete) temporal dynamics.
In what follows, we consider the original data sequence as observables. Juxtaposing the previous final expression with a general decomposition of a data matrix D, consisting of a temporal sequence of measurements/observables of our dynamic process arranged in columns, we seek
\[
D = A\, B\, C \qquad \text{(Data = Modes × Amplitudes × Dynamics)},
\]
from which we can identify the matrix A as containing the Koopman eigenvectors
Φ, the matrix B as containing the amplitudes, and the matrix C as containing the
temporal dynamics. The diagonality of B decouples the decomposition into rank-one
components. In addition, expression (7.6) establishes the matrix C as a Vandermonde
matrix, containing the eigenvalues of K, with diag({λ}) = Λ. We have
\[
C = \begin{pmatrix}
1 & \lambda_1 & \lambda_1^2 & \cdots & \lambda_1^{n-1} \\
1 & \lambda_2 & \lambda_2^2 & \cdots & \lambda_2^{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & \lambda_n & \lambda_n^2 & \cdots & \lambda_n^{n-1}
\end{pmatrix},
\]
where n denotes the number of snapshots (columns) in the data matrix D. In other
words, given a data matrix D ∈ Cm×n we have to find a decomposition into three
factors, a general matrix A ∈ Cm×n , a diagonal matrix B ∈ Cn×n, and a Vandermonde
matrix C ∈ Cn×n .
\[
D = D' S. \tag{7.10}
\]
We take advantage of the companion structure of S to deduce the matrix D′ and the last column of matrix D. The subdiagonal of ones links the (j + 1)th column of D′ to the jth column of D for j = 1, …, n − 1. This part of the companion matrix S is thus responsible for an index shift for the columns of the two matrices D and D′. In other words, the columns in D consist mostly of columns of D′. The last column of S (containing the coefficients a_j) states that the nth column of D is a linear combination of all the columns of D′. In the end, the columns of D are the backshifted columns of D′, with the final column a linear combination of all columns of D′.
\[
D_1^{n+1} \;\rightarrow\; \left\{ D_1^{n},\; D_2^{n+1} \right\}, \tag{7.11}
\]
where we have introduced the notation D_i^j to represent a sequence of measurements from snapshot i to snapshot j, that is, D_i^j = {d_i, d_{i+1}, …, d_j}, with d denoting the measurement (observable) vector. Equation (7.10) then suggests to express the last column of D_2^{n+1}, that is, the observable vector d_{n+1}, as a linear combination of all columns of D_1^n. We accomplish this by formulating a least-squares problem and, using a QR-decomposition of D_1^n = QR, we arrive at
\[
S = R^{-1} Q^H D_2^{n+1}, \tag{7.12}
\]
that is, the companion matrix S. The superscript H denotes the Hermitian (conjugate
transpose) operation. We then have the relation
\[
K\, D_1^n \approx D_1^n\, S. \tag{7.13}
\]
This latter expression states that the action of the Koopman operator K on the data
set of observables D1n, pushing each snapshot forward over one time step ∆t, can
be expressed on the basis of all gathered snapshots using the companion matrix S
(Rowley et al. 2009). While K scales with the (high) dimensionality of the observable vectors, the matrix S scales with the number of gathered snapshots. Equation (7.13)
is reminiscent of an Arnoldi iteration, where the action of a large matrix on an
orthonormal set of vectors (from a Krylov subspace sequence) is expressed on the
same basis. As a consequence, any spectral information we wish to gather about
the operator K can be gathered, in an approximate manner, from the operator S. For
example, a subset of eigenvalues of K can be approximated by the eigenvalues of S.
Determining S via a least-squares approximation, as proposed earlier, can quickly
become ill-conditioned. For long data sequences, or data sequences that are charac-
terized by a noticeable amount of noise or uncertainty, the matrix D1n can become
rank-deficient. This is particularly true when data redundancy or near data redundancy
arises. For example, sampling a wake flow past a bluff body will eventually fail to
produce new data; instead, the repeated shedding of coherent structures will cause the
matrix to quickly decline in rank, once a few full shedding cycles have been sampled.
In this case, the matrix R from the QR-decomposition of D1n is no longer invertible.
For this reason, a more robust algorithm must be implemented.
While keeping with the general premise of the above least-squares problem, we
switch to solving it for the rank-deficient case, using a singular-value decomposition (Schmid 2010). We then have
\[
\tilde{S} = U^H D_2^{n+1} V \Sigma^+ \quad \text{with} \quad U \Sigma V^H = D_1^n, \qquad
\Sigma^+ = \operatorname{diag}\!\left( \begin{cases} 1/\sigma_j & \text{for } \sigma_j \ge \epsilon, \\ 0 & \text{for } \sigma_j < \epsilon, \end{cases} \right) \tag{7.14}
\]
which applies a Moore–Penrose pseudo-inverse to the least-squares problem. We have introduced a threshold ε in the pseudo-inverse to signify a cutoff value for considering the processed data matrix as rank-deficient. In expression (7.14) for S̃ we have also projected the dynamics on the singular vectors contained in U, which are equivalent to the POD-modes of D_1^n.

We can then state the simple algorithm for computing the DMD from a sequence of data snapshots, and with it the approximate spectral properties of the Koopman operator K:

1  given a snapshot sequence {d_1, d_2, …, d_{n+1}} sampled equispaced in time with Δt
4  UΣV^H ← D_1^n
5  S̃ ← U^H D_2^{n+1} V Σ^+
6  [X, Λ] ← eig(S̃)
7  μ_j ← log(Λ_jj)/Δt
8  Φ_j ← U X(:, j)

Figure 7.2 SVD-based dynamic mode decomposition algorithm. The matrix Φ represents the matrix A in our general decomposition D = ABC. The eigenvalues Λ can be recast into the Vandermonde matrix C of our general decomposition.
This algorithm accommodates data sequences with a potential degree of redundancy
or near-redundancy, leading to a rank deficiency when processing the data matrix in
step 4 of the algorithm (see Figure 7.2). When forming the pseudo-inverse in step 5, a
threshold value has to be specified and principal vectors of the SVD associated with
singular values below this threshold can be discarded. The logarithmic mapping in
step 7 is commonly applied in hydrodynamic applications to transform from a discrete
to a continuous time variable. The above algorithm is a rather robust technique for
extracting dynamic modes (and approximate Koopman eigenvectors) directly from
a sequence of observable data. It should be mentioned that an additional SVD of
D2n+1 will produce exact eigenvectors of the high-dimensionial operator for the case
of D1n and D2n+1 not spanning the same vector space (Tu et al. 2014). Extensions and
generalizations of the above core algorithm (regarding robustness, accuracy, efficiency
and applicability) will be discussed in Section 7.4.
Once the dynamic structure and their time dependence (contained in the Koopman
eigenvalues λ j ) have been identified, we are still left with the computation of the
amplitudes, that is, the diagonal matrix B of our original decomposition. These
amplitudes will provide information about the importance and dominance of the
identified structures, as they are present in the processed data sequence. Ideally,
structures and their dynamics that are key components in the overall data sequence
should be identified by a larger amplitude, while smaller amplitudes should be
attached to less dominant processes or flow features. For data from high-performance
simulations, this desirable relation between amplitude and relevance can often be
realized; for experimental data and data with a significant amount of noise, it cannot
be guaranteed. In particular, when the processed data sequence is marred by outliers
or other inaccuracies that are fairly localized in time, we may encounter intermittent
structures that are characterized by a large decay rate. This large decay rate accounts
for the fact that the associated flow feature disappears quickly over the time horizon of
the sampled data. If the intermittent structure is of sufficient size, a simple amplitude
algorithm (matching the identified dynamic modes to the original data sequence)
would assign a sizeable amplitude to it, thus giving it importance among the extracted
modes and processes. For a robust and physically meaningful assignment of ampli-
tudes to identified dynamic processes, a more subtle balance between the reproduction
of the original data sequence and the number of participating modes must be struck.
Mathematically and computationally, the recovery of the amplitudes is a nontrivial
step. It is best formulated as a mixed-norm optimization problem. The first component
of the optimization is responsible for enforcing data fidelity of the recovered process.
In other words, we choose the amplitudes of the modes and their dynamics to
best reproduce the original data sequence. This is accomplished by minimizing the
Frobenius norm of the modal expansion and the data sequence (Jovanović et al. 2014).
We denote the vector of amplitudes by b, that is, the diagonal elements of the matrix
B above. We have
\[
\| D_1^n - \Phi\, \mathrm{diag}(b)\, C \|_F \;\rightarrow\; \min, \tag{7.15}
\]
with Φ as the matrix containing the computed dynamic modes Φ j and C as the
Vandermonde matrix containing powers of the identified eigenvalues λ j = Λ j j . Using
a SVD of D1n, the definition of the matrix Φ, and some algebra, the optimization
problem above can be reformulated into the quadratic form
\[
J(b) = b^H P b - q^H b - b^H q + s, \tag{7.16}
\]
with P = (Φ^H Φ) ∘ (C C^H)^∗, and with q and s assembled from the computed modes, the Vandermonde matrix, and the SVD factors of D_1^n (Jovanović et al. 2014), where ∗ denotes the complex conjugate operation and ∘ stands for the elementwise (Hadamard) product.
(Hadamard) product. The amplitude vector b∗ that optimizes the match between the
dynamic mode expansion and the original data sequence is simply given as the solution
to this quadratic optimization problem, namely,
\[
b_* = P^{-1} q. \tag{7.18}
\]
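For illustration, the same minimization (7.15) can also be solved directly as one stacked least-squares problem, avoiding the explicit assembly of P and q; a NumPy sketch:

```python
import numpy as np

def dmd_amplitudes(D1, Phi, lam):
    """Amplitudes b minimizing ||D1 - Phi diag(b) C||_F, cf. (7.15), with C
    the Vandermonde matrix of the eigenvalues lam. Solving the stacked
    least-squares system below is equivalent to b = P^{-1} q in (7.18)."""
    m = D1.shape[1]
    C = np.vander(lam, N=m, increasing=True)     # C[j, k] = lam_j ** k
    # Column k of the model: Phi @ (b * C[:, k]) = (Phi * C[:, k]) @ b
    M = np.vstack([Phi * C[:, k][None, :] for k in range(m)])
    rhs = D1.T.reshape(-1)                       # columns of D1, stacked
    b, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return b
```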
This amplitude distribution only minimizes the reconstruction error of the DMD,
taking all available modes – whether physically relevant or linked to outliers in the data
– into account. For a more robust and physically meaningful reconstruction, we have
to restrict the number of modes being considered. We will do this by juxtaposing the
sparsity of the amplitude vector and the achieved data reconstruction error (Jovanović
et al. 2014). These two objectives are in conflict with each other: a very sparse
amplitude vector consisting of only a few nonzero amplitudes will not produce a small
data reconstruction error, while a minimal reconstruction error will not be achieved by
only a few amplitudes. A compromise has to be found that produces an acceptable
reconstruction error with as sparse an amplitude vector as can be managed.
The sparsity of a vector is commonly measured by its `0 -norm or cardinality. The
`0 -norm of a vector simply counts the number of its nonzero components, and is
strictly speaking not a norm (but a quasi-norm). Optimizations based on `0 -norms
are combinatorial in nature and difficult to implement. To this end, we substitute the
`1 -norm of the amplitude vector (i.e., the sum of the absolute values of its components)
as a proxy for its sparsity, and augment the cost functional J(b) to read
\[
J(b) + \gamma\, \| b \|_1 \;\rightarrow\; \min, \tag{7.19}
\]
with a user-specified parameter γ that quantifies the trade-off between the reconstruc-
tion error J and the sparsity of the amplitude vector kbk1 . This regularization step
is reminiscent of Lasso regression, where a least-squares fit is combined with an
L1 -penalty (Tibshirani 1996). Minimizing the above cost functional yields a lower-
dimensional representation of the full data matrix of the form D ≈ Φ B C (see the schematic below and Figure 7.3).
In the augmented cost functional, γ is a positive parameter that quantifies our focus
on the sparsity of the amplitude vector b. Larger values of γ produce sparser solutions,
at the expense of a larger data reconstruction error (see Figure 7.3).
As the value of γ increases, the amplitude vector b becomes progressively
sparser. There is no straightforward link between the number of nonzero amplitudes
(cardinality of b) and the required value of γ to reach this outcome. In practice, one has
to experiment with different values of γ to determine the value for a desired number
of nonzero amplitudes. We notice, however, that the optimization procedure involves only matrices that scale with the number of snapshots; multiple optimizations with different values of γ should therefore be computationally feasible.
After we have reached the desired number of amplitudes, while keeping the
reconstruction error minimal, we lock the sparsity structure of the amplitude vector
and solve, once again, an optimization problem, where we optimize the reconstruction
D ≈ Φ · B · C (Data ≈ Modes × Amplitudes × Dynamics)
Figure 7.3 Cost functional J(b, γ) versus the user-specified parameter γ. Low values of γ result in lower reconstruction error, but fuller amplitude vectors; larger values of γ lead to sparser amplitude vectors at the expense of larger reconstruction errors.
error subject to enforcing the identified sparsity structure explicitly. This results in the
convex optimization problem (Jovanović et al. 2014)
\[
J(b) \;\rightarrow\; \min, \tag{7.20a}
\]
\[
\text{subject to } E^H b = 0, \tag{7.20b}
\]
where the columns of E consist of unit vectors that identify the zero entries of the final amplitude vector b, found from the previous mixed-norm optimization.
The mixed-norm optimization problem can be solved by common numerical
techniques, such as ADMM (alternating direction method of multipliers) (Boyd
et al. 2011), split Bregman iteration (Goldstein & Osher 2009), or IRLS (iteratively
reweighted least-squares) (Daubechies et al. 2010).
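As a lightweight stand-in for those solvers, a simple proximal-gradient (ISTA) iteration applied to (7.19) already illustrates the mechanics of the γ-sweep; a sketch assuming P is Hermitian positive definite:

```python
import numpy as np

def sparse_amplitudes(P, q, gamma, n_iter=2000):
    """Proximal-gradient (ISTA) iteration for min_b J(b) + gamma * ||b||_1,
    with the quadratic J(b) = b^H P b - q^H b - b^H q + s from (7.16).
    A simple stand-in for the ADMM / split-Bregman solvers cited above."""
    L = 2.0 * np.linalg.norm(P, 2)        # Lipschitz constant of grad J
    b = np.linalg.solve(P, q)             # start from the dense optimum (7.18)
    for _ in range(n_iter):
        z = b - 2.0 * (P @ b - q) / L     # gradient step on J
        mag = np.maximum(np.abs(z), 1e-300)
        b = z * np.maximum(1.0 - (gamma / L) / mag, 0.0)  # soft threshold
    return b
```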
In the supplemental material, a three-step MATLAB code will analyze a snapshot sequence of PIV data (flow through a cylinder bundle) and extract relevant flow processes via a sparsity-promoting DMD.
7.5 Applications
A few selected applications shall demonstrate the DMD on experimental and numeri-
cal data sets. The examples are meant to illustrate the algorithm and showcase different
manners of utilizing the flexibility of the methodology in analyzing fluid behavior.
Figure 7.4 Tomographic TR-PIV of a water jet. (a) Representative snapshot (visualized in a two-dimensional slice through the center plane). (b) The DMD spectrum (superimposed on the unit disk). (c) Amplitude distribution versus identified frequencies. (a) and (b) from Schmid, Violato & Scarano (2012), reprinted by permission from Springer Nature.
Figure 7.5 Tomographic TR-PIV of a water jet. Visualization of the two dominant dynamic modes (DM2 and DM3), (a) visualized by velocity vectors in a two-dimensional slice through the center plane, and (b) visualized by iso-contours of the Q-criterion.
Koopman operator propagating this observed flow state will contain information about
the interplay between the two processes encapsulated in the chosen data. Eigenvectors
of the Koopman operator that have significant activity in both components may point
toward structures that are “synced” within our data stream and hence may have a
physical link, as they emerge with an identical temporal dynamics.
This processing of composite data is a rather compelling tool to analyze data from
multi-physics processes. While it cannot establish a cause–effect relationship between
the processed fields, it can identify structures (and their frequencies) that are “in
resonance” or arise conjointly. For example, fluid-structure problems, shock-boundary
layer interaction, or the link between coherent structures and their wall-shear stress
footprint can be treated using this composite multivariable DMD technique.
In our case, a DMD analysis of the gathered data sequence from axisymmetric
simulations of the compressible jet reveals a spectrum of eigenvalues (not shown)
with two characteristic, but broad, peaks in frequency. The first, lower-frequency
peaks are dominated by modes with mostly hydrodynamic (vorticity) components,
which concentrate in the center of the jet or on the developing instabilities of the
downstream outer shear layer (see Figure 7.6); the same structures also underscore the
importance of the collapse of the potential core, which appears on the centerline at
an axial distance of x ≈ 12 in the simulations. Little to no acoustic activity is visible in the freestream.
Sampling from the higher-frequency peak, a different picture emerges (see
Figure 7.7). While the hydrodynamic component still concentrates on the center of
h d i 0 0 9 8 088962 4 0 2 bli h d li b C b id i i
∇×u
∇·u
Figure 7.6 Composite analysis of a compressible axisymmetric jet at Ma = 0.9. Two dynamic
modes (left and right) from the lower-frequency peak, with the top corresponding to the
hydrodynamic vorticity component and the bottom visualizing the acoustic dilatation
component.
the jet, the sound component of the dynamic modes is clearly visible and captures the
acoustic activity in the freestream. While the first mode (on the left) is dominated by
a nearly omnidirectional radiation of sound, the second mode (on the right) displays a
preferred directionality toward the upstream region. Especially these latter structures
are often responsible for feedback loops that magnify and sustain acoustic activity
and cause concentrated peaks in pressure sound spectrograms.
This example demonstrates the versatility of a Koopman perspective on dynamical
systems, as well as the flexibility of the DMD to explore physical processes from
a data-driven viewpoint. A traditional Poincaré approach based on a state-space
(rather than observable-space) formulation would be difficult to carry out with the
same ease.
h d i 0 0 9 8 088962 4 0 2 bli h d li b C b id i i
∇×u
∇·u
Figure 7.7 Composite analysis of a compressible axisymmetric jet at Ma = 0.9. Two dynamic
modes (left and right) from the higher-frequency peak, with the top corresponding to the
hydrodynamic vorticity component and the bottom visualizing the acoustic dilatation
component. Significant acoustic activity is observed in both cases.
temporal dynamics and spatial structure may be appropriate for many applications,
there are just as many configurations where an alternative dynamics-structure division
seems more fitting. An example is given by a jet in cross-flow.
A jet in cross-flow (or transverse jet) is generated when a flow exiting an orifice on
a wall is subjected to a second flow parallel to the wall. Situations like this arise, for
example, in smoke stacks or every time a fluid is injected into a cross-stream, such as
in fuel injectors. The corresponding flow is rather complex and develops a great many
instabilities that interact in a nontrivial manner. Among the main instabilities is the
roll-up of the vortex sheet, which initially axisymmetric at the orifice quickly deforms
into two counterrotating vortex pairs that follow a bent trajectory. Along this bent,
rolled up vortex sheet instabilities develop quickly: initiated a short distance from the
orifice, vortex ring-like structures amplify before they break down into smaller and
less coherent fluid elements.
The description of the principal instability for a jet in cross-flow clearly shows
that the evolution of the instability is more aptly expressed as a spatial evolution
along the bent vortex sheet of the base flow. This realization immediately sways the
analysis of this physical process toward a data-driven (Koopman) approach, rather
h d i 0 0 9 8 088962 4 0 2 bli h d li b C b id i i
Schmid: The Dynamic Mode Decomposition 149
than a traditional state-space description that would require the formulation of the
governing equations for the spatial evolution of disturbances along curved paths.
In our data-driven approach, we simply have to sample the flow fields in planes
orthogonal to a chosen evolution direction and stack the data matrix accordingly.
To this end, we determine a streamline emanating from the center of the orifice and
following the base flow. This streamline will serve as our spatial evolution direction.
We then partition the streamline into n = 110 equispaced (in arclength) intervals
between the location of instability onset and the location of ultimate instability
breakdown. For each point along the streamline, we define a plane normal to its
tangent, and interpolate the three-dimensional velocity field into this plane. These
planes (when reshaped into vectors) will form the columns of our data matrix D.
Figure 7.8 illustrates the aforementioned procedure.
Once we have established the evolution direction of our observables and formed the
data matrix, a standard DMD/Koopman analysis can be performed. It is worth noting
that the “spatial” direction, that is, the row direction of the data matrix, contains the
coordinates in the cross-planes (normal to the base-flow streamline tangent) as well as
the time coordinate. In order to reduce the size of the problem, we Fourier transform
in time and focus on the Strouhal number (nondimensional frequency) associated with
the Kelvin–Helmholtz instability of the vortex sheet.
Evaluating the dynamic modes from the analysis of our data matrix, we capture the
spatial breakdown of the vortex sheet. The identified instabilities are localized on the
base-flow counterrotating vortex sheet and describe vortical structures of increasing
complexity and scale; see Figure 7.9.
h d i 0 0 9 8 088962 4 0 2 bli h d li b C b id i i
150 Data-Driven Fluid Mechanics Chapter 7
The dynamic modes consist of vortical structures that align with the counterrotating
vortex sheet of the base flow. Higher-order modes display more vortical elements that
progressively concentrate on the flanks of the base-flow vortex sheet. This is consistent
with the findings from direct numerical simulations as well as physical experiments.
The amplitude distribution of the modes (not shown) shows a broad peak, which
suggests that a superposition of a few dominant modes may best capture the pertinent
dynamics of the spatial instability and may provide a reduced-order description of the
Kelvin–Helmholtz-type breakdown of the base-flow’s counterrotating vortex sheet.
This final example demonstrates the capability of Koopman/DMD analysis to iden-
tify coherent fluid motion directly from data snapshots in situations where the relevant
evolution direction is not aligned with a coordinate direction, but is instead given by
a curved base-flow streamline. With a projection and rearrangement of the data fields,
the dynamic characteristics of complex instabilities can be readily computed.
h d i 0 0 9 8 088962 4 0 2 bli h d li b C b id i i
Schmid: The Dynamic Mode Decomposition 151
Since its introduction, the DMD and the underlying Koopman analysis have gained in
popularity as a tool for quantitative flow analysis. Both fields have spawned a great
deal of improvements, extensions, and generalizations – both from an algorithmic and
conceptual point of view. Only a few of them shall be mentioned here.
The choice of observables is a key issue in Koopman analysis. In the above
exposition, we use the raw snapshots, assuming that their evolution contains the
necessary information to embed the dynamics in a linear fashion. However, extended
DMD (Williams et al. 2015) aims at explicitly including observables that are higher
powers of the original snapshots. This follows the original idea of supplementing
the observables by selected nonlinearities. Alas, extended DMD quickly becomes
computationally expensive; a reduction in dimensionality using a kernel trick is
possible (Williams et al. 2015), but only provides temporary relief.
It has quickly been noted that, while equispaced snapshots are advantageous in
the motivation of the DMD algorithm, they are not required for a more general
formulation. In fact, only the snapshots in equal column position in D1n and D2n+1 have
to be separated by a constant ∆t. Within either snapshot sequence, non-equispaced
data can be accommodated.
A SVD is a core element of the DMD algorithm, which allows for a great deal
of improvements from an algorithmic standpoint. An incremental formulation of the
SVD allows the design of an efficient DMD algorithm for streaming data (Hemati
et al. 2017), where new data snapshots get incorporated into an already existing
decomposition by a perturbative adjustment of the modes and spectrum. This type
of algorithm permits the analysis of experimental data in real time. In addition, recent
developments in randomized algorithms for the SVD can also be incorporated into the
overall DMD algorithm and lead to higher efficiencies, while sacrificing little in terms
of accuracy.
Using delay-embedding techniques (related to the Ruelle-Takens methodology),
the predictive horizon of a reduced description of a dynamical system via DMD can
be substantially enhanced. In essence, the data sequence is transformed into a phase-
space description and decomposed there. The data matrix is block-Hankelized and
processed by the standard DMD algorithm (Arbabi & Mezić 2017, Brunton et al.
2017).
Finally, the recent popularity of machine learning and artificial neural network
techniques has also influenced data decompositions and model reduction methods.
By proposing a neural net in the form of an auto-encoder, a nonlinear Koopman
embedding can be learned directly from the data (Takeishi et al. 2017, Lusch
et al. 2018, Mardt et al. 2018, Otto & Rowley 2019, Yeung et al. 2017), by imposing
linearity of the mapping over one time step between the final step of the encoder
portion and the first layer of the decoder part. More advances in this direction are
expected over the coming years.
h d i 0 0 9 8 088962 4 0 2 bli h d li b C b id i i
152 Data-Driven Fluid Mechanics Chapter 7
h d i 0 0 9 8 088962 4 0 2 bli h d li b C b id i i
8 Generalized and Multiscale
Modal Analysis
M. A. Mendez
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
154 Data-Driven Fluid Mechanics Chapter 8
We thus look for the smallest basis, that is, the smallest subspace that can still handle
all the information we need. Yet, the definition of “most important” strongly depends
on the application and the experience of the user. In some settings, it is essential to
identify (and retain) information contained within a specific range of frequencies. In
others, it might be necessary to focus on the information that is localized within a
particular location in space or time; in still others, it might be essential to extract some
“coherency” or “energy” or “variance” contribution. More generally, one might be
interested in a combination of the previous, and the tools described in this chapter
give the reader full control over the full spectrum of options.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 155
at hand and is defined by the size of the data and the basis construction criteria. This
is the case of the DFT (see Chapter 4) or the discrete wavelet transforms (DWT, see
Chapter 5). In the first, structures are harmonics with a frequency that is an integer
multiple of a fundamental one; in the second, structures are obtained by scaling and
shifting a template basis (mother and father wavelets).
In data-driven decompositions, the basis is tailored to the data. The most classic
examples are the POD (see Chapter 6) and the DMD (see Chapter 7). Both POD and
DMD have many variants from which we can identify two categories of data-driven
decomposition: those arising from the POD are “energy-based;” those arising from
the DMD are “frequency-based.”
The POD basis is obtained from the eigenvalue decomposition of the temporal or
the spatial correlation matrices. This is dictated by a constrained optimization problem
that maximizes the energy (i.e., variance) along its basis, constrained to be orthogonal,
such that the error produced by an approximation of rank r̃ < R is the least possible.
Variants of the POD can be constructed from different choices of the inner product
or in the use of different averaging procedures in the computation of the correlations.
Examples of the first variants are proposed by Maurel et al. (2001), Rowley et al.
(2004), Lumley and Poje (1997), where multiple quantities are involved in the inner
product. Examples of the second variants are proposed by Citriniti and George (2000)
and Towne et al. (2018), where the correlation matrix is computed in the frequency
domain using time averaging over short windows, following the popular Welch’s
periodogram method (Welch 1967).
Within the frequency-based decompositions, the DMD basis is constructed
assuming that a linear dynamical system can well approximate the data. The DMD
is thus essentially a system identification procedure that aims at extracting the
eigenvalues of the linear system that best fit the data. Variants of the DMD propose
different methods to compute these eigenvalues. Examples are the sparsity promoting
DMD (Jovanović et al. 2014), the optimized DMD (Chen et al. 2012), or the
randomized DMD (Erichson et al. 2019), while higher-order formulations have been
proposed by Le Clainche and Vega (2017). Although the DMD represents the most
popular formalism in fluid mechanics for such linear system identification process,
analogous formulations (with slightly different algorithms) were introduced in the
late 1980s in climatology under the names of principal oscillation patterns (POP,
see Hasselmann 1988, von Storch & Xu 1990) or linear inverse modeling (LIM, see
Penland & Magorian 1993, Penland 1996).
Both “energy-based” and “frequency-based” methods have limits, illustrated in
some of the proposed exercises of this chapter. These limits motivate the need for
hybrid decompositions that mix the constraints of energy optimality and spectral
purity. Examples of such methods for stationary data sets are the spectral POD
proposed by Sieber et al. (2016), the multiresolution DMD (Kutz et al. 2016), the
recursive DMD (Noack et al. 2016), or the Cronos–Koopman analysis (Cammilleri
et al. 2013). A hybrid method that does not hinge on the stationary assumption is the
mPOD (see Mendez et al. 2018, Mendez et al. 2019).
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
156 Data-Driven Fluid Mechanics Chapter 8
All the decompositions mentioned thus far have a common underlying architecture.
This chapter presents this architecture and the formulation of the mPOD. This
decomposition offers the most general formalism, unifying the energy-based and the
frequency-based approaches.
All the modal decomposition introduced in Section 8.1 can be written as a special kind
of matrix factorization. This view allows for defining a general algorithm for modal
decomposition that is presented in Section 8.3. First, Section 8.2.1 briefly reviews
the notation followed throughout the chapter while Sections 8.2.2 and 8.2.3 put this
factorization in a more general context of 2D transforms. Section 8.3 briefly discusses
the link between discrete and continuous domain, which is essential to render all
decompositions statistically convergent to grid-independent results.
3 Recall that here we use a Python-like indexing. Hence the first entry is 0 and not 1.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 157
in Section 8.2.5. The space domain is sampled over a grid (xi , yi ) ∈ Rn x ×ny , with
xi = i∆x, i ∈ [0, nx − 1] and y j = j∆x, y ∈ [0, ny − 1].
In a vector quantity, for example, a velocity field U[u(xi ), v(xi )], we consider that
the reshaping stacks all the components one below the other producing a state vector
of size ns = nC nx ny , where nC = 2 is the number of velocity components. Therefore,
the snapshot of the data at a time tk is a vector dk [i] ∈ Rns ×1 , while the temporal
evolution of the data at a location i is a vector di [k] ∈ Rnt ×1 . We then have D[i, k =
c] = dc [k] ∈ Rns ×1 and D[i = c, k] = dc [l] ∈ R1×nt .
nu
Õ
u[k] = u = cr br ⇐⇒ u = B uB , (8.1)
r=1
where B = [b1, b2, . . . bn B ] ∈ Cnu ×nb is the basis matrix having all the elements
of the basis along its columns, and uB = [c1, c1, . . . , cnb ]T is the set of coefficients
in the linear combination, that is, the representation of the vector in the new basis.
Computing the transform of a vector with respect to a basis B means solving a linear
system of algebraic equations. Such a system can have no solution, one solution, or
infinite solutions depending on nb and nu .
If nb < nu , as it is the case in model-order reduction, the system is overdetermined
and there is no solution.5 In this case, we look for the approximated solution ũB that
is obtained by projecting u onto the column space of B. This is provided by the well-
known least-squares approximation, which gives6
u = B uB =⇒ ũB = (B† B)−1 B† u =⇒ ũ = B(B† B)−1 B† u = PB u . (8.2)
The least-squares solution minimizes ||u − B uB ||2 = ||e||2 ; the minimization
imposes that the error vector e is orthogonal to the column space of B. In the machine
learning terminology, the matrix PB = B(B† B)−1 B† is an autoencoder that maps a
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
158 Data-Driven Fluid Mechanics Chapter 8
signal in Rnu ×1 to Rnb ×1 (this is an encoding) and then back to Rnu ×1 (this is the
decoding).
Because the underlying linear system has no solution, the linear encoding does
not generally admit an inverse if nb < nu : it is not possible to retrieve u from ũB –
that is, the autoencoding loses information. A special case occurs if nb = nu . Under
the assumption that the columns are linearly independent– that is, B−1 exists – there is
only one solution. It is easy to show that7 this yields PB = I: in this case ũ = u, and the
tilde is removed because the autoencoding is lossless. This is fundamental in filtering
applications for which the basis matrix is usually square: a signal is first projected
onto a certain basis (e.g., Fourier or wavelets), manipulated, and then projected back.
The last possibility, that is nb > nu , results in an underdetermined system. In this
case, there are infinite solutions. Among these, it is common practice to consider the
one that leads to the least energy in the projected domain, that is, such that min(||uB ||).
This approach, which also yields a reversible projection, is known as the least norm
solution and reads
u = B uB ⇐⇒ uB = B† (BB† )−1 u . (8.3)
It is now interesting to apply these notions to decompositions (projections) in space
and time domains. Considering first the projection in the space domain, let the signal
in (8.2) be a column of the data set matrix, that is, u := dk [i] ∈ Rns ×1 . Let Φ =
[φ1, φ2, . . . , φ nφ ] ∈ Cns ×nφ denote the spatial basis. Because the matrix multiplication
acts independently on the columns of D, (8.1) is
dk = Φ dφ ⇒ d̃kφ = (Φ† Φ)−1 Φ† dk ; D = ΦDφ ⇒ D̃φ = (Φ† Φ)−1 Φ† D. (8.4)
Here the transformed vector is d̃kφ while the matrix D̃φ collects all the trans-
formed vectors, that is, the coefficients of the linear combinations of basis elements
{φ1, φ2, . . . , φ R } that represents a given snapshot dk .
The same reasoning holds for transforms in the time domain. In this case, let
the signal be a row of the data set matrix, that is, u := dTi [k] ∈ Rnt ×1 . Defining
the temporal basis matrix as Ψ = [ψ1, ψ2, . . . , ψnψ ] ∈ Cnt ×nψ and handling the
transpositions with care,8 the analogous of (8.4) in the time domain reads
dTi = Ψ dTψ ⇒ d̃Tiψ = (Ψ† Ψ)−1 Ψ† dTi ; D = Dψ ΨT ⇒ D̃ψ = DΨ(Ψ† Ψ)−1 . (8.5)
Here the transformed vectors are d̃iψ and the matrix D̃ψ collects in its columns all
the transformed vectors, that is, the coefficients of the linear combinations of basis
elements {ψ1, ψ2, . . . , ψR } that represent the data evolution at a location di [k].
Note that approximations can be obtained along space and time domains as
D̃φ = Φ(Φ† Φ)−1 Φ† D and D̃ψ = DΨ(Ψ† Ψ)−1 ΨT . (8.6)
7
−1
Use the distributive property of the inversion to show that PB = B(B† B)−1 B† = BB−1 B† B = I .
8 More generally, in a matrix multiplication AB, the matrix A is acting on the columns of B. The same
action along the rows of B is obtained by ABT .
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 159
These approximations are exact (the projections are reversible) if the bases are
complete, that is, the matrices Φ and Ψ are square9 (nφ = ns and nψ = nt ). In these
cases, autoencoding is loseless: D̃φ = D and D̃ψ = D.
It is left as an exercise to see how (8.4)–(8.6) simplify if the bases in the space and
time are also orthonormal, that is, the inner products yield Φ† Φ = I and Ψ† Ψ = I
regardless of the number of basis elements10 nφ and nψ . On the other hand, from the
fact that (8.6) is exact only for complete bases, one can see that ΦΦ† = I only if
nφ = ns and ΨΨ† = I only if nψ = nt .
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
160 Data-Driven Fluid Mechanics Chapter 8
of basis elements (i.e., its “modes”) is ns × nt . While most of the data have a sparse
representation in a 2D Fourier or wavelet bases,12 this representation is still inefficient.
The inefficiency stems from the lack of a “variable separation”: the entry Dφψ [m, n]
measures the correlation with a basis matrix constructed from the mth basis element of
the column space and the nth basis element of the row space. Even in the ideal case of
orthonormal bases on both columns and rows (i.e., space and time), the decomposition
in (8.7) requires a double summation:
s −1 n
nÕ Õ t −1
In modal analysis, we seek a transformation that renders Dφψ diagonal and that has
no more than R = min(ns , nt ) modes. We seek separation of variables and hence enter
data-driven modal analysis from Section 8.2.4.
Each of the terms σr φr ψrT produces a matrix of unitary rank, which we call mode.
Note that the rank of a (full rank) rectangular matrix is rank(D) = min(ns , nt ), but
in general the number of relevant modal contributions is not associated to the rank.13
We obtain approximations of the data set by zeroing some of the entries along the
diagonal of Σ. As in Section 8.2.3 we use tildes to denote approximations. Moreover,
since infinite decompositions could be obtained by dividing the diagonal entries σr =
Σ[r, r] by the length of the corresponding basis matrices, we here assume that ||φr || =
||ψr || = 1 for all r ∈ [0, . . . , R − 1].
The reader has undoubtedly recognized that the factorization in (8.11) has the same
structure of the singular value decomposition (SVD) introduced in Chapter 6. The
SVD is a very special case of (8.11), but there are infinite other possibilities, depending
on the choice of the basis. Nevertheless, such a choice has an important constraint that
must be discussed here.
Assume that the spatial structures Φ are given. Projecting on the left gives
σ1 ψ1 [1] σ1 ψ1 [2] ... σ1 ψ1 [nt ]
σ2 ψ2 [1] σ2 ψ2 [2] ... σ2 ψ2 [nt ]
(Φ Φ) Φ D = Σ Ψ = Dφ =
† −1 † T
.. .. .. .. . (8.12)
. . . .
σ ψ [1] σR ψR [2] ... σR ψR [nt ]
R R
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 161
This is equivalent to (8.4): this is a transform along the columns of D. The transformed
snapshot, that is, the set of coefficients in the linear combination of φ’s, changes from
snapshot to snapshot – they evolve in time. The key difference with respect to a 2D
transform is that the evolution of the structure in the basis element φr only depends on
the corresponding temporal structure ψr . Moreover, notice that σr = ||σr ψr ||, since
||ψr || = 1. This explains why the amplitudes are real quantities by construction: they
represent the norm of the rth column Dφ , which we denote as Dφ [:, r] following a
P YTHON notation.
The same observation holds if the temporal structures Ψ are given. In this case,
projecting on the right gives
This is equivalent to (8.5): this is a transform along the rows of D. The set of
coefficients in the linear combination of ψ’s are spatially distributed according to
the corresponding spatial structures φ’s. As before, we see that the amplitudes of the
modes can also be computed as σr = ||σR φr || = ||Dψ [:, r]||.
Figure 8.1 The “space view” and the “time-view” of modal analysis. In the “space view,”
every snapshot dk [i] is a linear combination of basis elements φ’s. The coefficients of this
combination evolve in time according to the associated ψ’s. In the “time-view,” every temporal
evolution di [k] is a linear combination of basis elements ψ’s. The coefficients of this
combination are spatially distributed according to the associated φ’s.
The space-time symmetry is further elucidated in Figure 8.1, considering also the
√ √
cases in which Φ := I/ ns or Ψ := I/ nt . In the “space view” on the left, we
follow the data in time from a specific spatial basis. If this basis is the set of impulses
δk [i], the temporal evolutions along each element of the basis is given by the time
evolution di [k]. In the “time view” on the right, we analyze the spatial distribution
from a specific temporal basis. If this basis is the set of impulses δi [k], the spatial
distribution of each member of the basis (i.e., every instant) is given by the snapshot
dk [i] itself.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
162 Data-Driven Fluid Mechanics Chapter 8
With these views in mind, the reader should understand the most important
observations of this section: in the decomposition in (8.11) it is not possible to impose
both basis matrices Φ and Ψ– given one of the two, the other is univocally determined.
This observation also leads to the formulations of two general algorithms to compute
this factorization given one of the two bases. These are listed below:
Finally, note that these algorithms are the most general ones to complete a
decomposition from its spatial or its temporal structures. Such a level of generality
highlights the common structure but also leads to the least efficient approach: every
decomposition offers valuable shortcuts that we discuss in Section 8.3.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 163
1 1 1 1
||φr (x)|| 2 = ||φr || 2 = φr† φ and ||ψr (t)|| 2 = ||ψr || 2 = ψr† ψ , (8.15)
ns ns nt nt
where the norms are simply Euclidean norms ||a|| = a † a. For nonuniform meshes or
more sophisticated integration schemes, weighted inner products must be introduced
(see Chapter 6).
Consider now an approximation of the data using only one mode. This is a matrix
D̃[i, k] = σr φr [i]ψr [k] of unitary rank. The total energy associated to this mode,
assuming that this is the discrete version of a continuous space-time evolution, is
ns Õnt
1 1 Õ
∫ ∫
2
E{ D̃[x, t]} = D̃ (x, t)dΩdt ≈ D̃2 [i, k] . (8.16)
ΩT T Ω nt ns i=0 k=0
15 Recall: the Frobenious norm of a matrix A ∈ C n s ×n t can be written as | | A| | F = tr{ A† A} = tr{ AA† }.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
164 Data-Driven Fluid Mechanics Chapter 8
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 165
On the other hand, we know from the convolution theorem that the same can be
achieved in the frequency domain using an entry-by-entry multiplication of the Fourier
transform of the signals and the one of the impulse response. In terms of matrix
multiplications, these are, respectively,
u = ΨFu ,
b y = Ψ F y , and H
b b = ΨFh , (8.24)
b u → Ψ F y = H Ψ F u → y = Ψ F H Ψ F u hence Ch = Ψ F H Ψ F .
y = Hb (8.25)
The last equation on the right is obtained by direct comparison with (8.24) and is
extremely important: this is an eigenvalue decomposition. Each H[n] b entry of the
frequency response vector is an eigenvalue of Ch and the Fourier basis element ψ Fn
(corresponding to the frequency fn ) is the associated eigenvector.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
166 Data-Driven Fluid Mechanics Chapter 8
Of great interest is the case in which the impulse response is even, that is, h[k] =
h[−k]. In this case the circulant matrix Ch becomes also symmetric and its eigenvalues
are real: this is the case of zero-phase (noncausal) filters. In this special case, also
the eigenvectors can be taken as real harmonics: they could be either sinusoidals or
cosinusoidals, that is, the basis of the discrete sine transform (DST) or the discrete
cosine transform (DCT) (see Strang 2007). In a completely different context, this
eigenvalue decomposition also appears in the POD of a special class of signals.
Because of its importance, the reader is encouraged to pause and practice with the
following exercise.
Exercise 1: The DFT and the Diagonalization of Circulant Matrices
The subscript P is used to distinguish the POD. The notation on the right is based
on an outer product representation of the eigenvalue decomposition of symmetric
matrices; this will be useful in Section 8.4. The first key feature of the decomposition
is that the eigenvalues of K are linked to the POD amplitudes as Λ = Σ2P . Hence the
diagonalization in (8.26) also provides the POD amplitudes and the normalization
in line 4 of Algorithm 2 is not needed. Introducing Ψ P in this algorithm gives
the Sirovinch formulation16 described in Chapter 6. Observe that since the POD
amplitudes are the singular values of the data set matrix, it is possible to compute
the convergence in (8.19) without computing norms:
16 Actually a much less efficient version: the projection step in line 1 of the algorithm could simply be
D̃ = DΨ P .
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 167
v
u
2
σPr
t ÍR−1
||D − D̃(r̃)||2 r=r̃
E(r̃) = = . (8.27)
||D||2
ÍR−1 2
σPr
r=0
For the following discussion, two observations are of interest. The first is that
introducing the POD factorization (i.e., the SVD) in (8.26), we have
K = Ψ P Σ−1 T −1 T
P ΦP ΦP Σ P ΨP i.e., Λ = Σ−1
P ΦP ΦP Σ P .
T −1
(8.28)
The last step arises from a direct comparison with (8.26). Because Λ is diagonal, we
see that we must have ΦTP Φ P = I, that is, the spatial structures are also orthonormal.
We know this from Chapter 6, where it was shown that these are eigenvectors of
the spatial correlation matrix. The key observation is that the reverse must also be
true: every decomposition that has orthonormal temporal and spatial structure is a
POD. This brings us to the second observation on the uniqueness of the POD. In
a data set that leads to modes of equal energetic importance, the amplitudes of the
associated POD modes tend to be equal. This means repeated eigenvalues of K and
thus nonunique POD. In the extreme case of a purely random data set, it is easy to
see that the POD modes are all equal (see Mendez et al. 2017), and there are infinite
possible PODs. The impact of noise in a POD decomposition is further discussed in
Chapter 9.
Finally, observe that the POD is based on error minimization (or, equivalently,
amplitude maximization) and has no constraints on the frequency content in its
temporal structures Ψ P . However, a special case occurs in an ideally stationary
process. In such a process, the temporal correlations are invariant with respect to time
delays and solely depend on the time lag considered in the correlation. Hence the
correlation K[1, 4] = dT1 d4 is equal to K[4, 7] = dT4 d7 or K[11, 14] = dT11 d14 , for
example.
In other words, the temporal correlation matrix K becomes circulant, like the
matrix Ch in (8.23) for the convolution. Therefore, its eigenvectors are harmonics:
we conclude that the POD of an ideally stationary data set is either a DCT or a
DST.17 This property is the essence of the spectral POD proposed by Sieber et al.
(2016), which introduces an ingenious FIR filter along the diagonal of K to reach
a compromise between the energy optimality of the POD modes and the spectral
purity of Fourier modes. Depending on the strength of this filter, the SPOD offers
an important bridge between the two decompositions.
17 Depending on whether the temporal average has been removed. The mean is accounted for in a DCT but
not in a DST.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
168 Data-Driven Fluid Mechanics Chapter 8
s −1
nÕ s −1
nÕ
di [k + 1] = ar ψ Dr [i]e−pr tk = ar ψ Dr [i]λrk−1 . (8.29)
r=0 r=0
The subscript D is used to distinguish the DMD. In Chapter 4, we have seen that
the complex exponentials are eigenfunctions of linear dynamical systems; Chapter
10 describes their state-space representation. Observe that the DMD is a sort of Z-
transform in which the basis elements only include the poles of the systems (see
Chapter 4), that is, the eigenvalues of the propagating matrix A in the state-space
representation in Chapter 10. Such decomposition is natural for a homogeneous
linear system, while the extension of the DMD for the forced system is proposed by
Proctor et al. (2016). This extension makes the DMD an extremely powerful system
identification tool that can be combined with the linear control methods introduced in
Chapter 10.
Many algorithms have been developed to compute the eigenvalues (λr ’s) from the
data set (see Chapter 7). Once these are computed, the matrix containing the temporal
structures of the DMD can be constructed,
1
1 1 ... 1
λ λ2 λ3 ... λnt
1
2
= λ1 λ22 λ32 ... λn2 t ∈ Cnt ×nt ,
ΨD (8.30)
. .. .. ..
..
. . .
nt
λ λ2nt λ3nt ... λnntt
1
and the decomposition completed following Algorithm 2.
From a dynamical system perspective, we know from Chapter 4 that a system is
stable if all its poles are within the unit circle, hence if all the λ’s have modulus
|λ| ≤ 1. From an algebraic point of view, we note that Ψ D differs from the temporal
structures of the other decompositions in two important ways. First, it is generally not
orthonormal: the full projection must be considered in line 1 of Algorithm 2. Second,
its inverse might not exist: convergence is not guaranteed.
Different communities have different ways of dealing with this lack of convergence.
In the fluid dynamics community, more advanced DMD algorithms, such as the
sparsity promoting DMD (Jovanović et al. 2014) or the optimized DMD (Chen
et al. 2012), have been developed to enforce that all the λ’s have unitary modulus. This
is done by introducing an optimization problem in which the cost function is defined
on the error minimization of the DMD approximation. Consequently, vanishing or
diverging decompositions are penalized. Observe that having all the modes on the
unit circle does not necessarily imply that these are DFT modes: the DMD does not
impose any orthogonality condition, which would force the modes to have frequencies
that are multiples of a fundamental one. If orthogonality is enforced as a constraint to
the optimization, then the only degree of freedom distinguishing DMD and DFT is
in the choice of the fundamental tone: while this is T = nt ∆t for the DFT, the DMD
can choose different values and bypass problems like spectral leakage or windowing
(Harris 1978).
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 169
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
170 Data-Driven Fluid Mechanics Chapter 8
are cosine functions. The solution via EF expansion (see Mendez & J.M-
Buchlin 2016) is
∞
Õ
û( ŷ, tˆ) = φ En ( ŷ) σEn ψ En (tˆ), (8.34)
n=1
with
π
φ En = cos (2n − 1) ŷ ,
2
16 p̂ A
σEn = , (8.35)
(2n − 1)π 16W 4 + (2n − 1)4 π 4
p
2
W
ψ En = (−1) cos t − tan
n ˆ −1
.
[(2n − 1)π]
Note that the spatial structures are orthogonal but the temporal ones are not.
The amplitudes of the modes decay as ∝ 1/(2n − 1)3 when W → 0 and
as ∝ 1/(2n − 1) when W → ∞. Finally, note that all the amplitudes tend
to σEn → 0 as W → ∞: as the frequency of the perturbation increases,
the oscillations in the velocity profile are attenuated. Consider a case with
W = 10 and p̂a = 60. Several dimensionless velocity profiles during the
oscillations are shown in the left panel of the figure.
In this exercise, the reader should compare the DFT, the DMD, and the POD
with the eigenfunction solution. Assume that the space discretization consists
of ny = 2000 while the time discretization consists of nt = 200 with a
dimensionless sampling frequency of fˆs = 10.
First, construct the discrete data set from (8.34) to (8.35) by setting it in
terms of the canonical factorization in (8.11). Then, prepare a function that
implements Algorithm 2 described in Section 8.2.4 and use this algorithm to
compute the DFT, POD, and DMD. Finally, plot the amplitude decay of all the
decompositions and show the first three dominant structures in space and time
for each.
0.50
0.25
0.00
û
−0.25
−0.50
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 171
This section is composed of three parts. We first analyze in Section 8.4.1 the impact
of frequency constraints on the POD. In particular, we are interested in what happens
if we filter a data set before computing the POD. We will see that if a frequency is
removed from the data, it is also removed from the temporal structures of its POD.
We extend this result to a decomposition of the data via MRA (see Chapters 4 and 5).
The MRA uses a filter bank to break the data into scales, and we here see how to keep
the PODs of all scales mutually orthogonal. Then, the POD bases of each scale can be
assembled into a single orthonormal basis. That is the basis of the mPOD – the mPOD,
presented in Section 8.4.2. Finally, the mPOD algorithm is described in Section 8.4.3.
18 Which thus implies that the filter cannot be “ideal,” as discussed in Chapter 4.
19 √
To be normalized by 1/ ns to keep it of unitary length.
20 Note that the right multiplication by H is equivalent to the Hadamard product by H.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
172 Data-Driven Fluid Mechanics Chapter 8
where is the Hadamard product, that is, the entry-by-entry multiplication between
two matrices. Observe that D bH Ψ F is the inverse Fourier transform of DbH (i.e., the
frequency spectra of the filtered data) while D b = DΨ F is the Fourier transform of D
along its rows (i.e., in the time domain).
The representation on the left is a direct consequence of the link between convolu-
tion theorem and eigenvalue decomposition of circulant matrices introduced in (8.25).
The representation on the right opens to an intuitive and graphical representation of
the filtering process that is worth discussing briefly.
First, we introduce the cross-spectral density matrix K F . This collects the inner
product between the frequency spectra of the data evolution; it is the frequency domain
analogous of the temporal correlation matrix K . These matrices are linked:
KF = Db† Db = Ψ F D† D Ψ F = Ψ F K Ψ F ⇐⇒ K = Ψ F K F Ψ F . (8.38)
† t −1
nÕ
K F = Ψ F Ψ P Σ2P ΨTP Ψ F = Ψ
b P Σ2 Ψ
b P, that is, K F [i, j] = σP2 ψ
bP [i]ψ
bP [ j].
P
r=0
(8.39)
We thus see that the eigenvectors of K F are the conjugate of the Fourier transform
of the eigenvectors of K. The outer product notation on the right shows that the
diagonal of this matrix contains the sum of the power spectra of the temporal structures
of all the modes. This is the sum of positive real quantities.
Consider now the cross-spectral density matrix of the filtered data in (8.36) and use
the distributive property of the Hadamard product to get
†
b† D
KF = D b = D bH = D b H † H = K F H,
b† D (8.40)
bH D
H H
t −1
nÕ
K F H = K F H ⇐⇒ K F H [i, j] = σP2 H ψ
bP H [i]ψ
bP H [ j]. (8.41)
r=0
Since K F H ≈ 0 outside the band-pass range of the 2D filter, and since the
entries along its diagonals are a summation of positive quantities, we conclude that
frequencies removed from the data cannot be present in any of the POD modes.
Moreover, we can compute the POD modes of the filtered data from the filtered
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 173
Low-pass filtering
Band-pass filtering
High-pass filtering
Figure 8.2 Pictorial representation of the magnitudes of the frequency transfer functions of the
filters considered in this section. The figure on the left shows the magnitude of the 1D
frequency response H. b On the right, the figure shows how 2D filters H = H † H are
constructed from the extended response H. The red region represents the passband portion,
that is, within which |H| ≈ 1 while in the gray area it is |H| ≈ 0.
cross-spectral matrix K F H or, more conveniently, from the temporal correlation matrix
K H . The link between these matrices can be written as
K H = D†H D H = Ψ F K F H Ψ F = Ψ F K bH = Pπ K F H ,
bH Ψ F ⇐⇒ K (8.42)
where the definition of 2D Fourier transform in (8.8) is introduced and Pπ = Ψ F Ψ F
is the permutation matrix obtained by multiplying the Fourier matrix twice21 :
1 0 ... 0 0
0 0 0 1
..
Pπ = Ψ F Ψ F = Ψ F Ψ F = 0 0 1 . . (8.43)
. . . ..
.. .. .. .
0 1
0 ... 0
Finally, to answer the last question from this section, use the multiplication by the
diagonal matrix H in (8.39) instead of the Hadamard product:
K F H = DHb † DH = H†Db† DH = H † K F H . (8.44)
b
21 A funny question arose after reading this: what happens if the Fourier transform is performed four
times?
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
174 Data-Driven Fluid Mechanics Chapter 8
F1
1.0 F2
0.5
û(ŷ = 0, t)
0.0
−0.5
−1.0
Time evolution of the velocity profile at the center of the channel when each
of the two forcing terms F1 and F2 is active.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 175
consider three scales; these are identified by three filters with band-pass bandwidths
∆ f1 = [0, f1 ], ∆ f2 = [ f1, f2 ] and ∆ f3 = [ f2, fs /2]. We lump this information in a
frequency splitting vector FV = [ f1, f2, f3 ]. With three scales, the temporal correlation
matrix has nine partitions:
3 Õ
3
† Õ
K = D † D = D1 + D2 + D3 D1 + D2 + D3 = Di† D j . (8.45)
i=1 j=1
Following an MRA formulation, we assume that these scales are isolated by filters
with complementary frequency response, that is,
Õ
Di = D Ψ F H i Ψ F with H i ≈ 1 and H i H j ≈ 0. (8.46)
This ensures a lossless decomposition of the data. We now use (8.39) and (8.45) in
(8.44) to analyze which portion of the cross-spectral density K F is taken by each of
the contributions in (8.44). Figure 8.3 gives a pictorial view of the partitioning, which
can be constructed following the graphical representation in Exercise 2. In wavelet
terminology, the term H †1 H 1 is the approximation term at the largest scale. The other
“pure” terms H †2 H 2 and H †3 H 3 are diagonal details of the scales 2 and 3. The terms
H †i H j with i > j are horizontal details while those with j < i are vertical details.
Figure 8.3 Repartition of the cross-spectral density matrix into the nine contributions
identified by three scales. The origin is marked with a red circle. The “mixed terms” H †i H j
with i , j are colored in light gray, while the “pure terms” H †i H i are colored following the
legend on the left, where the 1D transfer functions are also shown.
We have seen in Section 8.4.1 that removing a frequency from the data set removes
it from the temporal structures of its POD. There is thus no frequency overlapping22
between the eigenvectors of the contributions K1 = D1† D1 , K2 = D2† D2 , K3 = D3† D3 ,
which we denote as Km = Ψ Hm Λ Hm ΨTHm . However, the same is not necessarily true
for the correlations of the “mixed terms” Di† D j with i , j: the filters H †i H j with
i , j leave the diagonals of K F unaltered and thus the spectral constraints are much
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
176 Data-Driven Fluid Mechanics Chapter 8
weaker: it is possible that eigenvectors from D2† D3 , for example, have frequencies that
are already present among the eigenvectors of D2† D2 or D3† D3 .
We opt for a drastic approach: remove all the “mixed terms” Di† D j with i , j and
consider the correlation as the sum of M “pure terms”,
M
Õ M
Õ M
Õ
†
Dm = b H †m H m Ψ F = Ψ Hm Λ Hm ΨTHm , (8.47)
K≈ Dm ΨF K
m=1 m=1 m=1
We thus conclude that a complete and orthonormal basis for Rnt can be constructed
from the eigenvectors of the various scales. That is the mPOD basis.
We close this section by highlighting how the mPOD connects POD and DFT or
DMD. In case of no frequency partitioning, the mPOD is a POD: the temporal struc-
tures can span the entire frequency range and are derived, as described in Section 8.2.4
and Chapter 6, under the constraint of optimal approximation for any given number
of modes. Introducing the frequency partitioning and the approximation in (8.46), we
identify modes that are optimal only within the frequency bandwidth of each scale.
The optimality of the full basis is lost as modes from different scales are not allowed
to share the same frequency bandwidth. As we introduce finer and finer partitioning,
each mode is limited within a narrower frequency bandwidth. At the limit for nm = 1
and M = nt , every mode is allowed to have only one frequency. The approximation
in (8.46) forces the spectra of the temporal correlation matrix to be diagonal, that is,
the correlation matrix is approximated as a Toeplitz circulant matrix. Accordingly, the
mPOD tends toward the DFT or DMD depending on the boundary conditions used
in the filtering of K. If periodicity is assumed, the DFT is produced. If periodicity
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 177
is not assumed, the decomposition selects harmonic modes whose frequency is not
necessarily a multiple of the observation time, and a DMD is recovered.
The first step computes the temporal correlation matrix, as in the POD. The second
step prepares the filter bank, according to the introduced frequency vector FV . The
third step computes the MRA of the temporal correlation matrix; the fourth step
computes the eigenvectors of each.23
At this stage, we can proceed with the preparation of a single basis. First, in
step 5, all the temporal structures and associated eigenvalues are collected into a single
matrix Ψ0 and a single vector of eigenvalues λ 0 . These eigenvalues give an estimation
of the relative importance of each term. This is sorted in descending order and the
information is used to permute the columns of Ψ0 using an appropriate permutation
matrix PΛ .
If the filters were ideal, the resulting temporal basis Ψ1 = Ψ0 PΛ would be com-
pleted. However, to compensate for the nonideal frequency response of the filters, a
reduced QR factorization is used in step 7 to enforce the orthonormality of the mPOD
basis. The result of the orthogonalization procedure is the matrix of the temporal
structure of the mPOD. The final step is the projection to compute the spatial structures
that is common to every decomposition and that can be computed using24 Algorithm 2.
Note that special attention should be given to the filtering process in the third
step. If this is done as in Exercise 4, it is implicitly assumed that the matrix K is
23 Can we by-pass this step? Can we compute all the Ψ m ’s from one single diagonalization (of a properly
filtered matrix)? These were two brilliant questions by Bo B. Watz, development engineer at Dantec
Dynamics, who attended the lecture series. The answer is that it is possible, in principle, to compute the
basis from one clever diagonalization. However, the filtering is challenging: that was the starting point
of new exciting developments toward the fast mPOD.
24 Observe that a simplified version could be used since the temporal structure is orthonormal!
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
178 Data-Driven Fluid Mechanics Chapter 8
periodic. When such an assumption is incorrect, edge effects could appear after the
filtering process. The cure for these effects is described in classical textbooks on image
processing (see Gonzalez & Woods 2017) and include reflections or extrapolations. To
limit the scope of this chapter, we do not address these methods here.25 The reader
should nevertheless be able to construct the mPOD algorithm using all the codes
developed for the previous exercises.
Combine the codes from the previous exercises to build your own function for
computing the mPOD, following Algorithm 3. This function should compute
Ψ M , while the decomposition can be completed using Algorithm 2.
Assume that filters are constructed using Hamming windows, and the filter
order is an input parameter. Test your function with the previous exercise and
show the structures of the first two mPOD modes. Compare their amplitudes
with the ones of the POD modes.
In addition to the coding exercises, this chapter includes two tutorial test cases from
time-resolved particle image velocimetry measurements. The first data set collects the
velocity field of a planar impinging gas jet in stationary conditions; the second is the
flow past a cylinder in transient conditions. These are described in Mendez, Balabane
and Buchlin (2019) and Mendez et al. (2020). Other examples of applications of the
mPOD on experimental data can be found in Mendez et al. (2018), Mendez et al.
(2019) and Esposito et al. (2021). More exercises and related codes can be found
in https://github.com/mendezVKI/MODULO, together with an executable with
graphical user interface (GUI) developed by Ninni and Mendez (2021).
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 179
The data set consists of nt = 2000 velocity fields, sampled at fs = 2 kHz over a grid
of 60 × 114 points. The spatial resolution is approximately ∆x = 0.33 mm. The script
provided in the book’s website26 describes how to manipulate this data set, plot a time
step, and more. For the mPOD of this test case, the frequency splitting vector used
in Mendez, Balabane and Buchlin (2019) is constructed in terms of Strouhal number
St = f H/UJ , that is, the dimensionless frequency computed from the advection time
H/UJ , with H the standoff distance and UJ the mean velocity of the jet at the nozzle
outlet. These are H = 4mm and UJ = 6.5 m/s.
The reader is encouraged to compare the mPOD results with those achievable from
other decompositions. The spatial structures and the spectra of the temporal structures
in the first five modes are produced by the script T UT 1. PY. These are shown in Figure
8.4 for the dominant mPOD mode in the scale St = 0.1 − 0.2. This mode isolates the
roll-like structures produced by the evolution of the shear layer instability downstream
the potential core of the jet. Recall that the velocity components are stacked into a
single vector.
Figure 8.4 On the left: example snapshot from the first tutorial test case on the TR-PIV of an
impinging gas jet. Colormap in m/s. On the right: example of spatial structure (top) and
frequency content (bottom) of the temporal structures of a mPOD mode.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
180 Data-Driven Fluid Mechanics Chapter 8
13
12
U∞ [m/s ]
11
10
8
0 1 2 3 4
t[s]
Figure 8.5 Top left: a snapshot of the velocity field past a cylinder, obtained via TR-PIV.
Colormap in m/s. Bottom left: evolution of the free-stream velocity as a function of time,
sampled from the top left corner of the field. Right: example of spatial structure (top) and
frequency content (bottom) of the temporal structures of an mPOD mode.
As for the previous tutorial, a function is used to extract all the information about
the grid (in this case stored on a different file) and the velocity field.
This test case is characterized by a large-scale variation of the free-stream velocity.
A plot of the velocity magnitude in the main stream is shown in Figure 8.5. In the first
1.5 s, the free-stream velocity is at approximately U∞ ≈ 12 m/s. Between t = 1.5 s
and t = 2.5 s, this drops down to U∞ ≈ 8 m/s. The variation of the flow velocity is
sufficiently low to let the vortex shedding adapt, and hence preserve an approximately
constant Strouhal number of St = f dU∞ ≈ 0.19, with d = 5 mm the diameter of the
cylinder. Consequently, the vortex shedding varies from f ≈ 459 Hz to f ≈ 303 Hz.
Interestingly, the POD assigns the entire evolution to a single pair of modes, and it is
hence not possible to analyze the vortex structures in the shedding for the two phases
at approximately constant velocity, nor to distinguish them from the flow organization
during the transitory phase.
The mPOD can be used to identify modes related to these three phases. Three
scales are chosen in the exercise. The first, in the range ∆ f = [0 − 10] Hz, is designed
to isolate the large-scale motion of the flow, hence the variation of the free-stream
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
Mendez: Generalized and Multiscale Modal Analysis 181
velocity. The scale with ∆ f = [290–320] Hz is designed to capture the vortex shedding
when the velocity is at U∞ ≈ 8m/s, while the scale with ∆ f = [430–470] Hz identifies
the shedding when the velocity is at U∞ ≈ 12 m/s.
The spatial structure and the temporal evolution of an mPOD mode are also shown
in Figure 8.5. This mode is clearly associated to the second stationary conditions.
Other modes capture the first part while others are focused on the transitory phase:
besides allowing for spectral localization, the MRA structure of the decomposition
also allows for providing time localization capabilities.
h d i 0 0 9 8 088962 4 0 3 bli h d li b C b id i i
9 Good Practice and Applications
of Data-Driven Modal Analysis
A. Ianiro
This chapter1 develops a theoretical background for the definition of good practice
(number of samples and sampling time versus the effect of noise) for data-driven
modal analysis and shows an example application to a channel-flow data set. Snapshot
proper orthogonal decomposition (POD) is employed as a benchmark allowing to
move the discussion toward the convergence of the flow statistics and the effect of
measurement noise on the temporal correlation matrix.
Beyond the estimation of the effect of noise on the estimated modes, this chapter
also presents several application examples showing how hidden information can be
extracted from a well-converged POD. In particular, it is shown how temporal modes
of non-time-resolved data can provide detailed phase information and how extended
POD modes can provide a linear stochastic estimation of correlated events. This
is useful for applications that involve flow sensing and for the study of convection
problems.
9.1 Introduction
When analyzing flow data sets, both from experiments and simulations, the researcher
is challenged by important questions about the needed size and completeness of the
data set.
The basic question that the researcher might ask him/herself is how many samples
do I need to store to obtain a reliable modal analysis? Considering, for instance, the
shedding phenomenon of the wake of a cylinder as discussed in Chapter 1, it is rather
intuitive that it is needed to correctly sample several phases of the shedding period and
to acquire enough samples per phase to achieve a satisfactory convergence of the flow
statistics and thus of the correlation matrix. Unfortunately, data storage, simulation or
experiment duration, and processing time might limit the amount of samples stored.
The question previously identified can be further subdivided into two questions: while
acquiring a data set, what should be the acquisition frequency? Moreover, given a
1 Andrea Ianiro acknowledges the support of his colleagues and friends Prof. S. Discetti and Prof. M.
Raiola. The majority of the original concepts summarized in the present chapter is the result of a
fruitful collaboration at UC3M in the last eight years. The application of the turbulent heat transfer in a
pipe is the result of a collaboration with Dr. A. Antoranz, Prof. O. Flores and Prof. M. Garcı́a-Villalba.
A. Antoranz, O. Flores and M. Garcı́a-Villalba are kindly acknowledged for sharing the turbulent-pipe
data set.
h d i 0 0 9 8 088962 4 0 4 bli h d li b C b id i i
Ianiro: Good Practice and Applications 183
h d i 0 0 9 8 088962 4 0 4 bli h d li b C b id i i
184 Data-Driven Fluid Mechanics Chapter 9
problems (Antoranz et al. 2018), allowing us to obtain composite modes, for example,
modes of temperature and velocity that provide a clear description of the scalar
transport mechanisms. A MATLAB
R
exercise focusing on the analysis of a data set
from the work by Antoranz et al. (2018) is proposed at the end of the chapter.
Φ = Y ΨΣ−1 . (9.3)
h d i 0 0 9 8 088962 4 0 4 bli h d li b C b id i i
Ianiro: Good Practice and Applications 185
h d i 0 0 9 8 088962 4 0 4 bli h d li b C b id i i
186 Data-Driven Fluid Mechanics Chapter 9
The autocorrelation coefficient of the data ρ is an indicator of how much the number
of samples should be increased in order to attain statistical convergence, for example,
for time-resolved fields in which ρ is likely to be of the order of 0.8, a nine-time-
larger number of samples is required to obtain the same statistical convergence as if
the samples were statistically independent. This last statement is equivalent to saying
that, if we observe a convective flow with a certain convective velocity, to obtain a
good convergence we should make sure that the total sampling time is at least one
h d i 0 0 9 8 088962 4 0 4 bli h d li b C b id i i
Ianiro: Good Practice and Applications 187
order of magnitude larger than the time to convect from one measurement point to
another (in order to obtain at least 100 independent samples for each measurement
point). However, due to spatial correlation of the observed flow features this sampling
time should be, in principle, even larger and should be one or two orders of magnitude
larger than the characteristic convective time of the largest flow features analyzed.
Comparing the solutions of the eigenvalue problems of the measured and true covariance matrices, it is possible to write that

Ψ^* Y^* Y = Ψ^* ( Ỹ^* Ỹ + n_x σ_e² I ) = Λ Ψ^*,    Ψ̃^* Ỹ^* Ỹ = Λ̃ Ψ̃^*.  (9.9)
If the random-error eigenvalues are small with respect to the difference between two successive eigenvalues λ̃_i and λ̃_{i+1} of Ỹ^* Ỹ, it is possible to write that λ_i = λ̃_i + n_x σ_e² and that the ith temporal mode of Y is approximately equal to the ith temporal mode of Ỹ (Venturi 2006). The perturbation of the eigenvectors increases with increasing mode number, as the variance content λ̃_i of the ith mode approaches the value of n_x σ_e². These relationships, along with their bounds, can be derived rigorously from matrix perturbation theory and are a common assumption in perturbed principal component analysis (PCA) applications (Huang et al. 2005). Nevertheless, even the simplified description given here is sufficient to draw some conclusions and identify some practical insights.
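A quick synthetic check of this eigenvalue shift is sketched below; the data set, its rank, and the noise level σ_e are arbitrary choices for illustration. Averaged over noise realizations, the leading eigenvalues of the measured correlation matrix exceed the true ones by approximately n_x σ_e²:

import numpy as np

rng = np.random.default_rng(0)
nx, ns, sigma_e = 500, 100, 0.5          # grid points, snapshots, noise level

# Hypothetical rank-5 "true" data set
Y_true = rng.standard_normal((nx, 5)) @ rng.standard_normal((5, ns))
lam_true = np.sort(np.linalg.eigvalsh(Y_true.T @ Y_true))[::-1]

# Average the eigenvalue shift over many noise realizations
n_trials, shift = 200, np.zeros(ns)
for _ in range(n_trials):
    Y = Y_true + sigma_e * rng.standard_normal((nx, ns))
    shift += np.sort(np.linalg.eigvalsh(Y.T @ Y))[::-1] - lam_true

# The mean shift of the leading eigenvalues is close to nx * sigma_e**2
print(shift[:5] / n_trials, nx * sigma_e**2)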
When dealing with flow-field measurements, experimental data sets are often sampled
at frame rates that are not sufficient to correctly describe the dynamics of the problem
under study. However, even when it is not possible to actually measure the frequency
of the flow unsteady phenomena, the POD temporal modes allow us to determine
relevant phase information. In the case of phenomena characterized by dominant periodic features (such as the shedding in the wake of a bluff body), these periodic features can usually be represented with a compact subset of modes. In fact, a plot of the correlation-matrix eigenvalues λ_i versus mode number usually shows a quite clear spectral separation between the first modes, accounting for the dominant periodic features,
Figure 9.1 Reconstruction errors with respect to true and measured fields (δRT and δRM ) (left
axis) and F (right axis) versus the number of modes used in the reconstruction. (From Raiola,
Discetti & Ianiro (2015), reproduced by permission from Springer Nature.)
and the following modes that describe the smaller-scale turbulent features. This
separation in the eigenspectra can be exploited to effectively extract periodic flow
features (Perrin et al. 2007) as detailed in the following.
Considering a temporally homogeneous flow or a temporally periodic phenomenon (with period τ), the temporal correlation matrix is a function solely of the temporal separation between two snapshots; that is, the temporal correlation between time t and time t′ is only a function of t − t′: R_t(t, t′) = R_t(t − t′). The
eigenfunctions of the temporal correlation matrix must thus be Fourier modes as
discussed in Chapter 6. In case of a shedding-dominated phenomenon in which
the most energetic flow features are characterized by a statistical periodicity with
a dominant frequency, it is safe to assume that the first POD modes, apart from
being orthogonal, also align to a Fourier decomposition of the field and show a
strong harmonic relation. Therefore, the POD temporal modes with larger eigenvalues,
obtained solving the eigenvalue problem of the temporal correlation matrix, can unveil
their relative phase information.
If we consider cases such as that of the vortices shed in the wake of a bluff body
or the vortices developed due to shear-layer instabilities in a jet, a traveling wave
is described by two high-energy modes (which are often found to be the first two
modes). Both modes have to share the same periodicity, that is, the shedding period
τ, thus, according to the orthogonality of the POD temporal modes, it is possible to
assume that the low-order reconstruction employing only the first two POD modes
would coincide with the decomposition (Raiola et al. 2016)

Y₂ = σ₁ φ₁ ψ₁^*(ϑ) + σ₂ φ₂ ψ₂^*(ϑ),  (9.11)

where σ_i = √λ_i and ϑ = 2πt/τ is the period phase. The functions ψ₁(ϑ) and ψ₂(ϑ) will likely be two sinusoidal functions and must be in phase quadrature, that is, ψ₁ ∝ sin(ϑ) and ψ₂ ∝ sin(ϑ + π/2). The scatter plot of the temporal modes ψ₁ and ψ₂ distributes
Figure 9.2 Instantaneous fluctuating vorticity (ωz ) field. (a) The DNS field used for this
benchmark. Magnified view of the: (b) DNS field, (c) measured field, (d) field reconstructed
with 1300 modes, (e) field reconstructed with 3000 modes. (From Raiola, Discetti & Ianiro
(2015), reproduced by permission from Springer Nature.)
in the neighborhood of a circle if ψ₁(ϑ) and ψ₂(ϑ) are sinusoidal functions. Since temporal-mode vectors have unitary norm and zero mean, the scatter plot will lie in the neighborhood of a goniometric circle (with radius 1) if the temporal modes are multiplied by the factor √(n_s/2).
In general, the scatter plot of the time coefficients might unlock information on the
phase and frequency relation between the first and other modes, often corresponding
to higher-order harmonics, thus shedding light on the interconnection between the
different flow features highlighted by the modal analysis. Assuming that the ith POD
mode is harmonically related and phase shifted with respect to the first mode, it is
possible to write that
√(n_s/2) ψ_i(ϑ) = sin(α_i ϑ + δ_i),  (9.12)
where αi is a positive integer and δi is the phase shift of the ith mode with respect to
the first mode.
In order to ascertain these frequency relations, the scatter plot of the POD temporal
modes ψ1 and ψi should be observed in search of Lissajous curves. A Lissajous curve
is the graph of a system of parametric equations of the type x = A sin(αt + δ) and
y = B sin(βt), which can describe even very complex harmonic motions. Visually, the
ratio α/β determines the number of “lobes” of the figure. The procedure to identify the
harmonic relation and the phase shift for higher order modes relies on the simplifying
assumption that the first two modes are at the same frequency with a π/2 phase shift.
It is possible to extract the period phase from the time coefficients of the first two
modes, that is, for any snapshot tan(ϑ) = ψ1 /ψ2 . Subsequently, the positive integer αi
and the phase shift δi , which characterize the harmonic relation, can be extracted from
the solution of the optimization problem

argmin_{α_i ∈ ℕ, δ_i ∈ ℝ} ‖ √(n_s/2) ψ_i − sin(α_i ϑ + δ_i) ‖,  (9.13)
where αi and the phase shift δi are the free parameters to identify. An example of
this procedure is given in the following, where it is applied to the analysis of the flow
features in the wake of two tandem cylinders located near a wall.
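A minimal Python sketch of this identification step is given below (function and variable names are illustrative, not taken from the cited works). For each candidate integer α_i the optimal phase δ_i follows from a linear least-squares fit, since sin(α_iϑ + δ_i) = cos(δ_i) sin(α_iϑ) + sin(δ_i) cos(α_iϑ):

import numpy as np

def harmonic_relation(psi_1, psi_2, psi_i, alpha_max=8):
    # Identify the harmonic order alpha_i and phase shift delta_i of the
    # ith temporal mode relative to the first mode pair, following (9.13).
    ns = len(psi_1)
    theta = np.arctan2(psi_1, psi_2)           # period phase, tan(theta) = psi_1/psi_2
    target = np.sqrt(ns / 2) * psi_i
    best = None
    for alpha in range(1, alpha_max + 1):
        # Least-squares fit of cos(delta)*sin(alpha*theta) + sin(delta)*cos(alpha*theta)
        A = np.column_stack([np.sin(alpha * theta), np.cos(alpha * theta)])
        coef = np.linalg.lstsq(A, target, rcond=None)[0]
        err = np.sum((A @ coef - target) ** 2)
        if best is None or err < best[0]:
            best = (err, alpha, np.arctan2(coef[1], coef[0]))
    return best[1], best[2]                    # alpha_i, delta_i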
As reported in Figure 9.4, the first two modes account for almost 60% of the variance, and the scatter plots of the temporal modes (multiplied by √(n_s/2)) return a perfect
goniometric circle confirming that these two modes are representative of a traveling
pattern. The asymmetry of the first POD mode highlights the strong interaction of the wall boundary layer with the shedding in the wake, evident from the presence of near-wall vorticity visualized in the first mode (and in the second, not reported here for brevity). Higher-order modes are mostly characterized by an intermittent release of vorticity at a spatial and temporal frequency that is a multiple of the principal shedding frequency, as shown by the contour plot of the fourth spatial mode and outlined by the Lissajous curves in the scatter plot of the fourth mode against the first one. This is an asymmetric mode that models the cross-wise oscillation of the shedding wake.
Figure 9.3 Experimental setup for PIV measurements in the wake of two tandem cylinders in
ground effect. (Reprinted from Raiola, Ianiro & Discetti (2016), copyright 2016, with
permission from Elsevier.)
Until this point, this chapter has been centered on the analysis of a given data set A. However, the availability of multiple data sets, for example, A and B, is rather frequent. Subtracting the base conditions y_{0,A} from A and y_{0,B} from B, it is possible to obtain Y_A and Y_B. Given these two data sets, the most straightforward task would be to perform two separate modal analyses, as done for instance by Mallor et al. (2019). However, if the measured quantities are synchronized, it is tempting to perform a combined analysis to extract information not only about the modes of a certain data set, for example, A, but also about the features of the second data set (B) that correlate with these modes.
To this purpose, the reader must remember that the POD spatial modes can be
obtained from a projection of the snapshot matrix onto the temporal-mode matrix as
in (9.3). This projection approach can be used to extend the POD to other quantities,
in a generalized approach named Extended POD (Borée 2003). The columns of Ψ A ,
that is, the POD temporal modes, form a basis in the Rns vector space; thus, it is
possible to use Ψ_A as a basis for the projection also of the matrix Y_B, provided that the two sets of snapshots are synchronized:
Figure 9.4 Modal analysis results for two tandem cylinders with L/D = 1.5 and G/D = 1. On the top, the correlation-matrix eigenvalues are reported along with the scatter plots of the first temporal mode against the second and the fourth modes (temporal modes are multiplied by √(n_s/2)). At the bottom of the figure, the spatial modes 1 and 4 are reported, highlighting the fact that the first and the fourth modes have an even spatial frequency ratio with a phase shift of π/4. (Adapted from Raiola, Ianiro & Discetti (2016), copyright 2016, with permission from Elsevier.)
Y_B = Φ_{BA} Σ_{BA} Ψ_A^*  →  Φ_{BA} Σ_{BA} = Y_B Ψ_A,  (9.14)
where the subscripts A and B refer to the quantities A and B, respectively. The columns
of ΦB A are thus the spatial modes of YB obtained from the projection on the temporal
modes of YA . The data sets A and B have to be captured/generated in the same time
reference frame; however, they might represent different physical quantities, and the
snapshots of A and B can have different numbers of elements.
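A compact Python sketch of the projection in (9.14) is given below, assuming mean-subtracted snapshot matrices Y_A and Y_B with synchronized columns; the POD of A is computed with the snapshot method of Chapter 6 (names are illustrative):

import numpy as np

def extended_pod(Y_A, Y_B):
    # POD of A via the temporal correlation matrix (snapshot method)
    lam, Psi_A = np.linalg.eigh(Y_A.T @ Y_A)
    order = np.argsort(lam)[::-1]
    Psi_A = Psi_A[:, order]
    # Extended modes of B, scaled by their sigma: Phi_BA Sigma_BA = Y_B Psi_A
    SigmaPhi_BA = Y_B @ Psi_A
    sigma_BA = np.linalg.norm(SigmaPhi_BA, axis=0)
    Phi_BA = SigmaPhi_BA / np.maximum(sigma_BA, 1e-12)  # unit-norm modes
    return Phi_BA, sigma_BA, Psi_A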
While the POD temporal modes ψ_{A,i} are estimated from the solution of the eigenvalue problem of the correlation matrix Y_A^* Y_A and are optimal for Y_A, the extended spatial modes φ_{BA,i} are neither optimal nor necessarily ordered by their variance content σ_{B,i}². The value of σ_{B,i} can be computed as the square root of the variance of the columns of Φ_{BA} Σ_{BA}, in order to obtain φ_{BA,i} vectors of unitary norm.
It has to be remarked that the projection of the snapshot matrix B on the temporal-mode matrix Ψ_A is energy preserving, since Ψ_A is a basis of the R^{n_s} vector space. Due to the orthogonality of the temporal modes ψ_{A,i}, this has the consequence that the EPOD decomposition, although not optimal, is orthogonal. The orthogonality principle has the advantage of guaranteeing that each EPOD mode σ_{BA,i} φ_{BA,i} contains
only the part of B that is correlated with the contribution of the ith POD mode to
A. This concept is closely connected with the idea of the linear stochastic estimation
(LSE), first developed by Adrian (1975), a technique that is typically employed to
estimate an unknown quantity given a known quantity. As the name suggests, this
technique attempts to statistically draw a linear relation between a known quantity
and another quantity that has to be estimated. The EPOD can be considered as
the LSE of the spatial modes σB A, i φ B A,i , given the spatial modes φ Ai . All the
properties of the LSE can therefore be extended to the EPOD. In particular, it must
be noted that, while the relation between velocity fields and other quantities may
be formally nonlinear, the LSE may still prove to be adequate due to the small
magnitude of second-order terms, as shown by Adrian et al. (1989) for the case
of homogeneous turbulence. Several attempts to determine the velocity modes correlated with the POD modes of other quantities have been reported in the literature.
Two examples are reported in the following: the first makes use of the extended POD to reconstruct time-resolved fields from time-resolved hot-wire measurements and non-time-resolved PIV measurements; the second analyzes the thermal transport in a pipe with nonhomogeneous heating.
y H (t) allows us to estimate the vector containing the time coefficients of the hot-wire
spatial modes, ψ H (t), through a simple projection:
ψ_H(t) = Σ_H^{-1} Φ_H^* y_H(t).  (9.15)
Given ψ_H(t), it is possible to estimate the time coefficients of the PIV modes. The vector ψ_P(t)^+, containing the estimated time coefficients of all the PIV spatial modes, is obtained following (9.16), and the PIV snapshot can be reconstructed as in (9.17):

ψ_P(t)^+ = Ξ ψ_H(t),  with Ξ = Ψ_P^* Ψ_H,  (9.16)

y_P(t)^+ = Φ_P Σ_P ψ_P(t)^+.  (9.17)
The reader might notice that the success of such a procedure requires a significant number of hot-wire probes, in order to avoid a large difference between the dimensionality of the PIV and hot-wire snapshot matrices. Even though some works in the literature have employed large numbers of probes (Kerhervé et al. 2017), such arrays are not easily available. The most common approach passes through the definition of pseudo-snapshots a_{H,i}, in which several measurements are taken at the same “instant” as the PIV by adding “virtual” probes. Virtual probes are obtained using time-shifted probe data with the support of the Taylor hypothesis of uniform convection, converting the hot-wire temporal information into spatial information.
It is worth noticing that the columns of both Ψ_P and Ψ_H are orthonormal vectors and form two bases of the R^{n_s} vector space; consequently, Ξ is also composed of columns forming a basis. This implies that the process is energy preserving and that the higher-order modes, which might be mutually uncorrelated, are taken into account in the estimation of (9.16). Since all rows/columns of Ξ have unitary norm, if a certain ith probe mode (jth field mode) is uncorrelated with all the field modes (the snapshot modes), the ith row (jth column) of Ξ has to be composed of randomly distributed elements with unitary norm and zero mean, thus with standard deviation equal to 1/√n_t. This reasoning allows us to filter the matrix Ξ, removing all the elements with |Ξ_{i,j}| < 1/√n_t (Discetti et al. 2018). This approach has allowed us to obtain a
reconstruction of the behavior of large-scale and very-large-scale motions in high-Re turbulent flows. An example experiment is reported in Figure 9.5, in which PIV measurements are synchronized with five hot-wire probes in the large-scale pipe-flow facility CICLoPE (Univ. of Bologna, Italy). Figure 9.5, right-hand side, reports a comparison between the estimated PIV fields and the hot-wire measurements, showing significant coherence and a low noise level, despite the significant spectral richness of the flow (Discetti et al. 2019).
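The estimation chain of (9.15)–(9.17), together with the filtering of Ξ, might be sketched as follows; this is a hedged illustration in which the training PODs (Φ, Σ, Ψ of both quantities) are assumed precomputed and Σ_H is a square diagonal matrix:

import numpy as np

def estimate_piv_field(y_H, Phi_H, Sigma_H, Psi_H, Phi_P, Sigma_P, Psi_P):
    nt = Psi_P.shape[0]                     # number of training snapshots
    # Cross-correlation between PIV and probe temporal modes, filtered as in
    # Discetti et al. (2018)
    Xi = Psi_P.T @ Psi_H
    Xi[np.abs(Xi) < 1.0 / np.sqrt(nt)] = 0.0
    # (9.15): time coefficients of the probe modes for a new probe snapshot
    psi_H = np.linalg.solve(Sigma_H, Phi_H.T @ y_H)
    # (9.16)-(9.17): estimated PIV coefficients and reconstructed field
    psi_P_plus = Xi @ psi_H
    return Phi_P @ Sigma_P @ psi_P_plus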
As a further extension of this work, aiming at the identification of a suitable sensing tool for the detection of large-scale and very-large-scale motions in turbulent wall-bounded flows, we have recently explored the performance of convolutional neural networks (Güemes et al. 2019, Guastoni et al. 2020), obtaining better results than with EPOD. This highlights that machine learning tools have the capability to enhance, and eventually replace, traditional data analysis tools in the field of fluid mechanics.
Figure 9.5 Left: Image of the experimental arrangement with PIV image plane and hot-wire
rake located downstream. Right: Contour of the evolution of the streamwise velocity
component obtained using Taylor’s hypothesis, Reτ = 9500. Top: hot-wire data. Bottom:
Estimated fields. The velocity contours are shown in inner units, that is, u+ = u/uτ . (Reprinted
from Discetti et al. (2019), copyright 2019, with permission from Elsevier.)
variance. Looking at the cumulative sum of the variance, 80% of the temperature
variance is correlated with only 40% of the velocity variance. The fact that the
turbulent thermal transport is ascribed only to a limited amount of the velocity variance
suggests possible approaches for heat transfer enhancement, such as the use of vortex
generators able to produce a flow pattern similar to the first extended velocity mode
(see Figure 9.6, bottom) to improve the temperature uniformity in the pipe section.
Figure 9.6 Results of the EPOD analysis in a turbulent pipe with nonhomogeneous heat flux.
Top: Variance content (cumulative variance content in the inset) of temperature modes (blue)
and velocity extended modes (red). Bottom left: First temperature mode. Bottom Right: First
extended velocity mode with contour of the streamwise velocity component and in-plane
velocity vectors.
Once the correlation matrix has been computed, it is possible to estimate the
temperature modes and the extended velocity modes for u, v, and w based on the
temperature temporal modes ΨT :
% Build snapshot matrices: one row per grid point, one column per snapshot
ns = size(T, 3);                       % number of snapshots
T = reshape(T, nr*ntheta, ns);
u = reshape(u, nr*ntheta, ns);
v = reshape(v, nr*ntheta, ns);
w = reshape(w, nr*ntheta, ns);
% SVD of the (precomputed) temperature covariance matrix C_T
[Psi_T, lmd_T, Psistar_T] = svd(C_T); % temporal modes in Psi_T
% Spatial modes, scaled by their sigma, via projection on the temporal modes
SigmaPhi_T   = T*Psi_T;
SigmaPhi_x_T = u*Psi_T;
SigmaPhi_y_T = v*Psi_T;
SigmaPhi_z_T = w*Psi_T;
To calculate the variance content σ_i² of each mode φ_i, it is again necessary to take into account the different flow areas corresponding to each point:
function sigma = get_sigma(sigmaphi, r, nr, ntheta)
% Variance content of a sigma*phi mode column on the nonuniform polar grid:
% each grid point is weighted by the flow area it represents.
sigmaphi = reshape(sigmaphi, nr, ntheta);
dr = diff(r);
weight = 0*r;
weight(1) = dr(1)/2;
weight(2:nr-1) = (dr(1:nr-2) + dr(2:nr-1))/2;
weight(nr) = dr(nr-1)/2;                  % assumed completion: outer ring
area = (r(:).*weight(:))*(2*pi/ntheta);   % assumed completion: cell areas
sigma = sqrt(sum(sum(sigmaphi.^2.*area))/sum(area));  % assumed normalization
end
Part IV
Dynamical Systems
10 Linear Dynamical Systems
and Control
S. Dawson
This chapter introduces a suite of ideas and tools from linear control theory, which
can be used to modify the behavior of linear dynamical systems using feedback
control. We use a simple example problem to motivate these methods and explicitly
demonstrate how they can be applied. The system that we focus on is a linearization
of unstable fluid flow over a bluff body. This system has already been discussed in
Chapter 1, and here we start with the same data set used in Chapter 6 as an example
for computing the proper orthogonal decomposition (POD). We start with a brief
review of linear systems in the time and frequency domain before introducing the
example system and embarking on a journey through different approaches to control
it. In particular, we introduce proportional, integral, and derivative (PID) control, pole
placement, and full-state linear quadratic regulator (LQR) optimal control. Due to the
breadth of topics covered, which might typically take one or several semester-long
courses to cover, we will not be able to provide a full theoretical background behind
each method but will focus on the main ideas and the practical implementation of these
methods. We also briefly discuss further extensions and a number of applications of
linear control theory.
We begin by considering a linear system of the form

ẋ(t) = A x(t),  (10.1)

where x(t) ∈ Rⁿ is a column vector consisting of each state in the system, ẋ(t) is the time derivative of this quantity, and A is an n × n matrix, which we assume to have real entries. Recall that this system is asymptotically stable if the eigenvalues of A
(also referred to as the poles of the system) are located in the left half of the complex
plane. Accounting for inputs u(t) and outputs y(t), the full system can be specified by the matrices (A, B, C, D), with

ẋ(t) = A x(t) + B u(t),
y(t) = C x(t) + D u(t).  (10.2)
The state-space system in (10.2) can be expressed as a transfer function in the Laplace (frequency) domain by

P(s) = C (sI − A)⁻¹ B + D.  (10.3)

P(s) can also be interpreted as the ratio Y(s)/U(s), where U(s) and Y(s) are the Laplace transforms of the input u(t) and output y(t), respectively.¹ The eigenvalues of A in
(10.2) correspond to the values of s where the term (sI − A)−1 in (10.3) (which is
known as the resolvent) is not defined, due to (sI − A) being non-invertible. Further
discussion of the stability of linear systems, and their analysis through the resolvent,
can be found in Chapter 13. A linear system of the form given by (10.2) or (10.3) is
the “plant” that we will ultimately be seeking to control. We can represent this system
and its inputs and outputs schematically as
u ──→ [ ẋ = Ax + Bu,  y = Cx + Du ] ──→ y
The properties of linear systems in both the time and frequency domain are discussed
in further detail in Chapter 4.
To illustrate the various concepts and ideas that will be covered in this lecture,
we will focus on a specific example, coming from a relatively simple system in
fluid mechanics. We will look at two-dimensional flow over a circular cylinder at a
Reynolds number of 60, particularly focusing on the dynamics of the system as it
evolves in time away from its unstable equilibrium solution, and toward its vortex
shedding limit cycle. The dynamics of this system are discussed in further detail in
Chapter 1. The specific data set that we use is the same one utilized in Chapter 6 for
an example for computing the POD.
To obtain a linear model of the form indicated in (10.1), we take data from a short
region in time, identify the two leading POD modes in this region (see Chapter 6
for a detailed discussion of the POD), and then identify a linear model that captures
the dynamics of the evolution of the corresponding POD coefficients during this short
region of time. This model identification is done via the dynamic mode decomposition
algorithm, which is discussed in detail in Chapter 7. For additional details concerning
the implementation of this system identification method, as well as the various control
techniques that will be applied to this identified system, see the accompanying Python
code (available on the book’s website2 ). The leading two POD modes, along with
the evolution of their corresponding coefficients in time, are shown in Figures 10.1
1 In the case where there are multiple inputs or outputs, the components of G(s) are given by the ratios
between the Laplace transforms of each pair of inputs and outputs.
2 www.datadrivenfluidmechanics.com/download/book/chapter10.zip
and 10.2, respectively. As well as showing the true evolution of the POD coefficients
(which will be our system states x1 and x2 ), we also show the evolution predicted by
the identified linear model. The linear model accurately captures the true behavior over
a region in time substantially larger than that used for system identification, though
is unable to capture the eventual saturation of the coefficient amplitudes as the true
system converges toward the limit cycle. Note that the portion of the data used for
system identification is the same as that used to compute POD for the example in
Chapter 6. Using feedback control, we might hope to be able to keep the oscillations
seen in Figure 10.2 to small enough amplitudes such that the linear model remains
accurate, and that controllers designed using this linear model remain effective.
For this system, we obtain the linearized dynamics

A = [ 0.0438  −0.7439 ;  0.7373  0.0527 ].  (10.4)
This system has two unstable eigenvalues at 0.0483 ± 0.7406i, which is expected,
since the envelope of the POD coefficient amplitudes plotted in Figure 10.2 is growing
in time. The fact that these eigenvalues come as a complex conjugate pair is also
expected, since the dynamics of the system are oscillatory.
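These statements are straightforward to verify numerically, for example with NumPy:

import numpy as np

# Identified linear model (10.4); its eigenvalues are the system poles
A = np.array([[0.0438, -0.7439],
              [0.7373,  0.0527]])
poles = np.linalg.eigvals(A)
print(poles)                      # approx. 0.048 +/- 0.741j, a conjugate pair
print(np.all(poles.real < 0))     # False: the system is not asymptotically stable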
Note that this linear system is a simplification of the true dynamics not just
because we are using a linear model, but also because we are neglecting other states
of the system, corresponding to lower-energy, truncated POD modes. For example,
identifying a linear system with more states could model (at least in a linear sense) the
slow modification of the “mean” as the amplitude of the oscillations increases, as well
as dynamics with frequency content at harmonics of the primary oscillation frequency.
These dynamics are discussed in further detail in Chapter 14.
We will explore various choices for the B and C matrices in Sections 10.2–10.5, but
will typically choose to directly observe and control one of the system states. This also
represents a simplification of what measurement and control might look like for the
true system in practice, where localized sensors and actuators might be more feasible
than directly measuring or controlling an entire POD mode.
Suppose now that we want to manipulate our system to achieve a desired output, y(t).
For the example described, perhaps we want to stabilize the unstable system, and thus
eventually drive all components of y(t) to zero. For an unstable system such as this,
this objective is impossible without the availability of real-time information about
the state of the system, and how it is responding to attempts at control. It would be equivalent, for example, to trying to balance an inverted pendulum without any visual or tactile measurements.
What we want to do then, is to develop a means for utilizing knowledge of the
system output y(t) in order to determine appropriate system inputs to achieve the
desired control objective. This general concept is known as feedback control. For
example, we could link the output to the input using the following arrangement:
r ──→ Σ ──u──→ [ ẋ = Ax + Bu,  y = Cx + Du ] ──→ y
      ↑−                                        │
      └───────────────── K ←────────────────────┘
Here we refer to K as the controller, which could simply be a constant, or might have
its own internal dynamics.
In this section, we will assume that we have a single input to our system (so B has
one column), and a single output (so C has one row), and will always assume that there
is no direct feedthrough from the input to the output (so D = 0). It will be helpful to
look at our system in the frequency domain, which we can obtain from a state-space
Figure 10.3 Output y versus time t for the uncontrolled system and with K = 1 feedback applied to P1(s) and P2(s); control is applied at t = 15.
system using (10.3). We will explore two different input and output combinations. First, we have the case where the input actuates a different state variable from the one that is measured (system P1), using the input and output matrices

B₁ = [ 1 ; 0 ],   C₁ = [ 0  1 ],  (10.5)

giving the transfer function (see (10.3))

P₁(s) = 0.7373 / (s² − 0.09654 s + 0.5508).  (10.6)
Second, we use the input and output matrices

B₂ = [ 0 ; 1 ],   C₂ = [ 0  1 ],  (10.7)

giving the transfer function

P₂(s) = (s − 0.04381) / (s² − 0.09654 s + 0.5508).  (10.8)
Note that the denominators of the transfer functions P1 (s) and P2 (s) are the same
quadratic polynomial. This is as expected, since the roots of this polynomial are the
poles of the system, which are in turn given by the eigenvalues of A, which is the same
for both systems.
To begin with, and before doing any further analysis of this system, suppose that we try to control our systems with the simplest feedback control, where the K in our schematic diagram is a multiplicative constant. This is known as proportional control. The results of applying this control to our systems with K = 1 are shown in Figure 10.3, where we have the reference set at r = 0. Here, we allow the system to evolve without any inputs before turning on control at time t = 15. Note that here we are applying control to the unstable linearized system, rather than the full nonlinear system. We notice that the system P2 seems to be stabilized, while P1 remains unstable.
How might we have predicted this observed behavior? With r = 0, the system input is given by the law u = −Ky = −KCx. This means that the system with feedback evolves according to

ẋ = (A − BKC) x.  (10.9)
The stability of this system is determined by the eigenvalues of the matrix A − BKC. In the frequency domain, it is easy to show that the transfer function of the controlled system is given by

G_CL = P / (1 + PK),  (10.10)

where K is the transfer function of the controller in the frequency domain (which is a constant for proportional control). Note that, more generally, a closed-loop transfer function is given by

G_CL = G_OL(s) / (1 + G_FB(s)),  (10.11)

where G_OL denotes the open-loop transfer function between r and y, and G_FB is the transfer function going around the feedback loop. Going back to (10.10), here the poles of the closed-loop system will be the solutions to

1 + PK = 0,  (10.12)
which can be expressed as the roots of a characteristic polynomial. To study the systems P1 and P2 in more detail, consider a more general system of the form

P(s) = (a₁ s + a₀) / (s² + b₁ s + b₀),  (10.13)

for which P1(s) and P2(s) are both special cases. From (10.12), the closed-loop poles of this system for control with a constant K can be found as the solutions to the polynomial

s² + (b₁ + a₁K) s + (b₀ + a₀K) = 0.  (10.14)

These solutions are given by

s = −(b₁ + a₁K)/2 ± √( (b₁ + a₁K)² − 4(b₀ + a₀K) ) / 2.  (10.15)
From this, we can start to understand why proportional control with K = 1 worked for
the system P2 , but not for P1 . The fact that P2 has a nonzero a1 term means that the
control is able to shift the real component of the poles by −a1 K/2, thus making them
more stable (assuming that a1 > 0). Note also that when the argument of the square
root is negative, this term will not affect the real component of the poles. Conversely,
if a1 = 0 as is the case with P1 , the control only affects the square root term in
(10.15). If the argument of the square root is negative, then this does not change
the real component, and thus the stability, of the system. On the other hand, if the
argument of the square root is positive, then one pole will become more stable, but the
other becomes more unstable as the feedback gain either increases or decreases. From
this, we can conclude that proportional feedback is unable to stabilize P1 .
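This conclusion can be checked directly by computing the roots of (10.14) for the coefficients of P1(s) and P2(s) at a few gains (a small Python sketch; the gain values are arbitrary):

import numpy as np

# Closed-loop poles of (10.14): P1 has a1 = 0, P2 has a1 = 1
b1, b0 = -0.09654, 0.5508
for a1, a0, name in [(0.0, 0.7373, "P1"), (1.0, -0.04381, "P2")]:
    for K in (0.5, 1.0, 2.0):
        poles = np.roots([1.0, b1 + a1 * K, b0 + a0 * K])
        print(name, K, poles)   # P1 poles keep a positive real part; P2's move left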
As an aside, note that the dynamics of this system can be related to those of a spring–mass–damper system, which also has a quadratic denominator in its transfer function. In this analogy, the negative coefficient of the linear term in the denominator of P1 and P2 corresponds to a negative damping, which is what leads to the system being unstable. Furthermore, (10.14) shows that K can only affect the damping term when a₁ is nonzero, and otherwise only affects the stiffness-to-mass ratio, giving another interpretation of why this proportional control cannot stabilize P1.
What sort of feedback would work for P1? From (10.12), we see that we could achieve the same behavior of the closed-loop poles if we added a term to our controller to mimic the first-order term in the numerator of P2. For example, we could let

K = a₁ s + a₀,  (10.16)

which amounts to feeding back both the output and its derivative. Adding an integral term, which helps eliminate steady-state error, gives the general proportional, integral, and derivative (PID) controller

K(s) = k_p + k_i / s + k_d s.  (10.17)

Note that adding an integrator can also destabilize the system, so care must be taken when choosing the gains k_p, k_i, and k_d used in PID control.
The analysis that we have performed in this section is tractable for this relatively
simple system, but quickly becomes unmanageable for more complex systems with
higher order numerators and denominators in their transfer functions. This motivates
the formulation of general rules to predict the effect of feedback controllers without
requiring us to explicitly solve for the roots of high-order polynomial equations. Such
rules will be discussed in Section 10.3.
This section introduces a tool that will allow us to predict in advance what the effect of
a given control strategy will be. In particular, it turns out that there are several simple
rules that can allow us to sketch where the poles and zeros go as the controller gain
increases. The root locus (developed by Evans (1948)) is a tool to see graphically how
the poles of a closed-loop system move as a parameter k (which most typically is a
controller gain) is varied. In particular, the root locus plot shows the locations of all
poles of a system as a function of k, where we can write the characteristic equation
for the poles of the closed-loop system as
1 + kG(s) = 0. (10.18)
In the case of proportional control, G(s) is simply the plant P(s) and k is the
proportional controller gain, though note that root locus plots can be constructed for
other forms of control, so long as the closed-loop poles can be expressed in the form
given by (10.18). Note in particular the similarity between (10.18) and (10.12). While
it is easy to make a root locus plot in MATLAB or Python, it is also possible to draw
these plots relatively accurately by hand, just by following a few simple rules (and
importantly, without needing to explicitly solve (10.18) by hand for all values of k).
We start by assuming that we know where the n poles (denoted as p j ) and m zeros (z j )
of G are located. It is also important to know d = n − m, which is the relative degree
of G. From this, we have the following rules:
1. The branches of a root locus start at the poles of G (i.e., when no control is applied,
or equivalently where k = 0), and end either at zeros of the open-loop system, or at
infinity (with a total of d ending at ∞).
2. A point on the real axis is included in the root locus if there is an odd total number
of poles and zeros to the right of it.
3. As k goes to ∞, the d branches of the root locus that go to ∞ asymptote to straight
lines with equal angles between them, with those angles (from the positive real axis)
given for k = 0, 1, . . . , d − 1 by

θ_k = (π + 2kπ) / d.  (10.19)

Moreover, the asymptotes originate at a common center, given by

c₀ = d⁻¹ ( Σ_{j=1}^{n} p_j − Σ_{j=1}^{m} z_j ).  (10.20)
If d = 0, then all branches of the root locus end up at the zeros of the open-loop
system as k goes to ∞.
These rules typically give enough information to roughly sketch how the poles of the
closed-loop system will move as the controller gain varies. This can be particularly
useful for determining the best choice of control law for a given system. Note that
there are more complex rules that can enhance the accuracy of a hand-drawn root
locus plot, but we will not discuss them here. Root locus plots of the systems P1 (s)
and P2 (s) are shown in Figure 10.4. These plots are consistent with our findings in
Section 10.2, where it was found that proportional control could stabilize P2 (s) but
not P1 (s).
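Such plots can be generated in a few lines, for example with the third-party python-control package (assumed available; the call below plots the locus for P2(s)):

import control   # third-party python-control package, an assumed dependency

# Root locus of P2(s) = (s - 0.04381) / (s^2 - 0.09654 s + 0.5508)
P2 = control.TransferFunction([1.0, -0.04381], [1.0, -0.09654, 0.5508])
control.root_locus(P2)   # plots the closed-loop pole paths as the gain k varies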
Note also that there are many other tools available to assist in designing controllers
for linear systems, such as Bode plots, Nyquist plots, and other auxiliary systems that
can help predict the behavior of the controlled system when it is subject to sensor
noise and disturbances. Details for these methods can be found in textbooks such as
Åström and Murray (2010) and Franklin et al. (1994).
So far, we have looked at a few different control strategies, and were fortunately able
to find an approach that succeeded in stabilizing both of our systems. It is natural to
wonder if there are cases where this is an entirely futile exercise, in the sense that no
possible control strategy could work, given the ways in which we can manipulate the
state of our system, x, through the input u. It turns out that there is a relatively simple way
to determine whether a system is controllable in this sense. This can be done through
the use of the controllability matrix
C = [ B  AB  A²B  ···  Aⁿ⁻¹B ].  (10.21)
We say that a system is controllable if this controllability matrix has rank n. Observability is determined analogously, through the observability matrix

O = [ C ;  CA ;  CA² ;  ⋯ ;  CAⁿ⁻¹ ].  (10.22)
We say that a system is observable if the observability matrix has rank n. O has n
columns and n × no rows, where no is the number of outputs. It is easy for us to
verify that the state-space versions of systems P1 and P2 are both controllable and
observable, and indeed this is the case for any choice of input and output channels
(i.e., for B = [0 1]T or [1 0]T , and C = [0 1] or [1 0]).
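This verification might look as follows in Python (a minimal sketch of (10.21) for the B₁, C₁ case):

import numpy as np

def controllability_matrix(A, B):
    # Build C = [B, AB, ..., A^(n-1) B] from (10.21)
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

A = np.array([[0.0438, -0.7439], [0.7373, 0.0527]])
B = np.array([[1.0], [0.0]])
print(np.linalg.matrix_rank(controllability_matrix(A, B)) == 2)  # controllable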
It turns out that we can also talk about controllability and observability not just as
binary notions, but can also study how controllable a system is through related objects
called the controllability and observability Gramians (Zhou & Doyle 1998), which are
discussed in Chapter 12.
Thus far, we have only considered control using the system output, which we have
taken to be a scalar quantity. For the following sections, it will be useful to consider the
case where we have access to (or somewhat equivalently, have a method of estimating)
the full state of the system x. This can be represented with the following control
diagram, which will be useful in the following sections:
r ──→ Σ ──u──→ [ ẋ = Ax + Bu ] ──x──→ [ C ] ──→ y
      ↑−                          │
      └──────────── K ←───────────┘
We now consider two different methods for controller design using full-state feedback:
pole placement in Section 10.5.1, and optimal control using LQR in Section 10.5.2.
With full-state feedback, we apply a control law of the form

u = −K x,  (10.23)
so that the closed-loop system evolves according to

ẋ = (A − BK) x,  (10.24)
which is the same as (10.9) with C = I . If we have desired closed-loop poles, then this
can be achieved by finding the components of K such that A − BK has eigenvalues
at these locations. If the system is controllable, then this can always be achieved. For
example, if we wish to place poles at p = −0.5 ± 0.5i for our systems P1 or P2 with
full-state feedback, then we find that K = [1.0965 0.0095]T . Pole placement can be
computed using the command K = place(A,B,p) in either MATLAB or Python.
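In Python, for example, the same computation can be done with SciPy (a short sketch for the poles quoted above):

import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0438, -0.7439], [0.7373, 0.0527]])
B = np.array([[1.0], [0.0]])
K = place_poles(A, B, [-0.5 + 0.5j, -0.5 - 0.5j]).gain_matrix
print(np.linalg.eigvals(A - B @ K))   # poles moved to -0.5 +/- 0.5i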
For LQR control, we seek the feedback gain that minimizes the quadratic cost function

J = ∫₀^∞ ( x*(t) Q x(t) + u*(t) R u(t) ) dt,  (10.25)

where Q and R are positive definite matrices that define the penalty associated with nonzero states and control effort, respectively. A controller that is designed to minimize (10.25) is called an LQR controller. Mathematically, the control law that minimizes (10.25) is given by

K = R⁻¹ Bᵀ M,  (10.26)

where M is the unique positive definite matrix that satisfies the algebraic Riccati equation

Aᵀ M + M A − M B R⁻¹ Bᵀ M + Q = 0.  (10.27)
This might look like a difficult problem to solve, but once again there are commands
in MATLAB and Python libraries that will output the gain matrix K that optimizes
(10.25). As well as being “optimal” in this sense, it turns out that the resulting
controller also satisfies certain robustness properties (e.g., Stengel 1994).
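A minimal LQR sketch in Python, solving the Riccati equation (10.27) with SciPy and forming the gain of (10.26), might read:

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0438, -0.7439], [0.7373, 0.0527]])
B = np.array([[1.0], [0.0]])
Q, R = np.eye(2), np.array([[1.0]])

M = solve_continuous_are(A, B, Q, R)   # algebraic Riccati equation (10.27)
K = np.linalg.solve(R, B.T @ M)        # optimal gain (10.26)
print(np.linalg.eigvals(A - B @ K))    # closed-loop poles are stable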
To test this method, we design LQR controllers for the system P1 , using Q = I
and three different choices of R (10, 1, and 0.1), which prescribe how aggressive
the resulting controller is in achieving the desired state. The characteristics of these
feedback systems are explored in Figures 10.5 and 10.6. We see in Figure 10.5 that all
three controllers stabilize the system, though the locations of the poles are noticeably
Figure 10.6 Poles and system response for implementing LQR control with R = 100, 1, and 0.1, compared to the uncontrolled system; control is applied partway through the time interval shown.
different for each case. Roughly speaking, larger R values penalize large amounts of
controller work, which tends to result in closed-loop systems that are “less different”
from the initial uncontrolled system, which is consistent with what is observed.
Similarly, we see in Figure 10.6 that the larger R value results in a controlled system
that takes longer to converge to the desired steady state, while the more aggressive
controllers (with smaller R values) converge to this state more rapidly, though with
larger control effort (which is not shown explicitly).
As mentioned at the outset, there are numerous methods in linear control theory that
we have not touched on in this lecture. For a more comprehensive treatment of linear
control theory, standard references such as Åström and Murray (2010), Franklin et al.
(1994), and Skogestad and Postlethwaite (2007) are a good place to start, with the
latter in particular covering more advanced topics. In this final section of this chapter,
we briefly discuss additional ideas and techniques related to feedback control.
11 Nonlinear Dynamical Systems
S. Brunton
11.1 Introduction
In recent decades, the field of dynamical systems has grown rapidly, with analytical
derivations and first principle models giving way to data-driven approaches. Thus,
machine learning and big data are driving a paradigm shift in the analysis and under-
standing of dynamical systems in science and engineering. This trend is particularly
evident in the field of fluid dynamics, which is one of the original big data fields.
In addition, the classical geometric and statistical perspectives on dynamical
systems are being complemented by a third operator-theoretic perspective, based
on the evolution of measurements of the system. This so-called Koopman operator
theory is poised to capitalize on the increasing availability of measurement data from
complex systems. Moreover, Koopman theory provides a path to identify intrinsic
coordinate systems to represent nonlinear dynamics in a linear framework. Obtaining
linear representations of strongly nonlinear systems has the potential to revolutionize
our ability to predict and control these systems (Figure 11.1). See Chapters 10 and 7
for overviews of linear systems and Koopman theory, respectively.
(Figure 11.1: schematic spanning increasing nonlinearity.)
Figure 11.2 Chaotic trajectory of the Lorenz system from (11.2). From Brunton and Kutz
(2019), reproduced with permission of the Licensor through PLSclear.
Figure 11.3 Depiction of uncertainty in the Lorenz system. A cube of initial conditions is
evolved along the flow. Although the trajectories stay close together for some time, after a
while, they begin to spread along the attractor.
A chaotic trajectory of the Lorenz system is shown in Figure 11.2, computed with parameters σ = 10, ρ = 28, and β = 8/3.
The Lorenz system is among the simplest and most well-studied dynamical
systems that exhibit chaos, which is characterized as a sensitive dependence on
initial conditions. Two trajectories with nearby initial conditions will rapidly diverge
in behavior, and after long periods, only statistical statements can be made. This
sensitivity is depicted in Figure 11.3, where a cube of initial conditions is evolved
along the flow of the dynamical system, showing that after some time the initial
conditions become spread out along the attractor. This type of chaotic mixing is
characteristic of turbulence.
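A trajectory like the one in Figure 11.2 can be generated with a few lines of Python; the sketch below assumes the standard form of the Lorenz equations, ẋ = σ(y − x), ẏ = x(ρ − z) − y, ż = xy − βz, which is what (11.2) refers to:

import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, x, sigma=10.0, rho=28.0, beta=8.0/3.0):
    # Standard Lorenz equations (assumed form of (11.2))
    return [sigma * (x[1] - x[0]),
            x[0] * (rho - x[2]) - x[1],
            x[0] * x[1] - beta * x[2]]

# Initial condition and time span are arbitrary illustrative choices
sol = solve_ivp(lorenz, (0.0, 50.0), [-8.0, 8.0, 27.0], dense_output=True)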
Figure 11.4 Attracting sets of the logistic map for varying parameter β. From Brunton and
Kutz (2019), reproduced with permission of the Licensor through PLSclear.
We will often consider the simpler case of an autonomous system without time dependence or parameters:

d/dt x(t) = f(x(t)).  (11.3)
Discrete-time dynamics are often more natural when considering experimental data and simulations, as data are typically generated discretely in time:

x_{k+1} = F(x_k).  (11.4)

Also known as a map, the discrete-time dynamics are more general than the continuous-time formulation in (11.3), encompassing discontinuous and hybrid systems as well.
As an example, consider the logistic map, which is a simple model to explore the complexity of population dynamics:

x_{k+1} = β x_k (1 − x_k).  (11.5)
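The attracting sets of Figure 11.4 can be reproduced with a short iteration (a minimal sketch; the transient length and sampling are arbitrary choices):

import numpy as np

# Iterate x_{k+1} = beta * x_k * (1 - x_k) and record the attracting set
betas = np.linspace(1.0, 4.0, 600)
attractor = []
for beta in betas:
    x = 0.5
    for _ in range(1000):            # discard the transient
        x = beta * x * (1.0 - x)
    orbit = []
    for _ in range(100):             # sample the attracting set
        x = beta * x * (1.0 - x)
        orbit.append(x)
    attractor.append(orbit)          # plot attractor vs. betas to see Figure 11.4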
The continuous-time dynamics in (11.3) induce a flow map F_t that advances the state forward a time t:

F_t(x(t₀)) = x(t₀) + ∫_{t₀}^{t₀+t} f(x(τ)) dτ.  (11.6)
For a linear system,

d/dt x = A x,  (11.7)

the flow map is given by the matrix exponential,

x(t₀ + t) = e^{At} x(t₀).  (11.8)

The dynamics are entirely characterized by the eigenvalues and eigenvectors of the matrix A, given by the spectral decomposition (eigendecomposition) of A:

A T = T Λ.  (11.9)
More generally, in the case of repeated eigenvalues, the matrix Λ will consist of Jordan
blocks (Perko 2013). Note that the continuous-time system gives rise to a discrete-time
dynamical system, with Ft given by the solution map exp(At) in (11.8). In this case,
the discrete-time eigenvalues are given by eλt .
For non-normal operators A, the eigenvectors may be nearly parallel. In this case,
even with stable eigenvalues, the system may exhibit transient growth, where an initial
condition first grows before eventually converging to the origin. Transient growth
is observed in many shear flows, and the excursion from the origin often results
in perturbation amplitudes that are large enough to excite nonlinear dynamics. For
example, pipe flow is linearly stable at all Reynolds numbers, but non-normality and
transient energy growth results in sustained turbulence at finite Reynolds numbers.
The matrix T⁻¹ defines a transformation, z = T⁻¹x, into intrinsic eigenvector coordinates, z, where the dynamics become decoupled:

d/dt z = Λ z.  (11.11)
Figure 11.5 Schematic of a Hopf bifurcation. Adapted from Brunton, Proctor and Kutz
(2016a).
In other words, each coordinate, z_j, only depends on itself, with dynamics given by

d/dt z_j = λ_j z_j.  (11.12)
Thus, it is highly desirable to work with linear systems, since it is possible to
easily transform the system into eigenvector coordinates where the dynamics become
decoupled. No such closed-form solution or simple linear change of coordinates exists
in general for nonlinear systems, motivating many of the directions described in this
chapter.
11.2.4 Bifurcations
We have already seen bifurcations that occur in the logistic map. Bifurcations are par-
ticularly important in fluid dynamic systems. Small changes in physical parameters,
such as the Reynolds number, may drive qualitative changes in the behavior of the
resulting dynamics. For example, in the flow past a stationary circular cylinder, when
the Reynolds number increases past 47, the flow goes from a steady laminar solution
to unsteady, periodic vortex shedding (see Chapter 1). This Hopf bifurcation is shown
in Figure 11.5, and is given by the following dynamical system:

dx/dt = βx − ωy − A x (x² + y²),  (11.13a)
dy/dt = ωx + βy − A y (x² + y²).  (11.13b)
As β goes from negative to positive values, a single stable fixed point at the origin
becomes unstable, and a stable limit cycle emerges.
As a simpler example, we often consider the one-dimensional pitchfork bifurcation, shown in Figure 11.6 and given by the following dynamical system:

dx/dt = βx − x³.  (11.14)

For negative values of the parameter β, there is a single stable fixed point at the origin. As this parameter increases from negative to positive, this fixed point becomes unstable and two other stable fixed points emerge at ±√β.
It is interesting to note that if we write the Hopf normal form in (11.13) in polar coordinates, with r² = x² + y², then the equation for the amplitude r reduces to the pitchfork normal form in (11.14).
There are a number of goals and uses of dynamical systems models of fluid flows:
1. Future state prediction. In many cases, we seek predictions of the future state
of a system. Long-time predictions may still be challenging, especially for chaotic
systems, such as turbulence, where long-term statistics are a more reasonable goal.
2. Design and optimization. The dynamical system may be used as a surrogate model
to tune the parameters of a system for improved performance or stability. For
example, aircraft and automobile designers modify geometry and control surfaces
to improve aerodynamic performance.
3. Estimation and control. For many fluid systems, an ultimate goal is to actively
control the system through feedback, using measurements of the system to inform
actuation to modify the behavior. For high-dimensional systems, such as fluids, it is
often necessary to estimate the full state of the system from limited measurements.
4. Interpretability and physical understanding. Perhaps a more fundamental goal of
dynamical systems modeling is to provide physical insight and interpretability into
a system’s behavior through analyzing trajectories and solutions to the governing
equations of motion. Dynamical systems models of fluids will ideally yield some
insight into fundamental mechanisms that drive the flow.
These goals are often challenged by the multiscale and nonlinear nature of fluids. There is also uncertainty in the boundary conditions, parameters, and measurements of the system. In short, the main challenges are nonlinearity, high dimensionality, multiscale dynamics, and uncertainty.
Kt g = g ◦ Ft , (11.15)
where ∘ is the composition operator. For a discrete-time system with time step Δt, this becomes

K_{Δt} g(x_k) = g(F_{Δt}(x_k)) = g(x_{k+1}).  (11.16)
Note that this is true for any observable function g and for any state xk .
The Koopman operator is linear, a property that is inherited from the linearity of
the addition operation in function spaces:
Kt (α1 g1 (x) + α2 g2 (x)) = α1 g1 (Ft (x)) + α2 g2 (Ft (x)) (11.18a)
= α1 Kt g1 (x) + α2 Kt g2 (x). (11.18b)
For sufficiently smooth dynamical systems, it is also possible to define the continuous-time analogue of the Koopman dynamical system in (11.17):

d/dt g = K g.  (11.19)

The operator K is the infinitesimal generator of the one-parameter family of transformations K_t (Abraham et al. 1988). It is defined by its action on an observable function g:

K g = lim_{t→0} (K_t g − g)/t = lim_{t→0} (g ∘ F_t − g)/t.  (11.20)
The linear dynamical systems in (11.19) and (11.17) are analogous to the dynamical
systems in (11.3) and (11.4), respectively. It is important to note that the original state
x may be the observable, and the infinite-dimensional operator Kt will advance this
function. However, the simple representation of the observable g = x in a chosen basis
for Hilbert space may become arbitrarily complex once iterated through the dynamics.
In other words, finding a representation for K x may not be simple or straightforward.
Combined with (11.22), this results in a partial differential equation (PDE) for the
eigenfunction ϕ(x):
∇ϕ(x) · f(x) = λϕ(x). (11.24)
Eigenvalue Lattices
Interestingly, a set of Koopman eigenfunctions may be used to generate more eigenfunctions. In discrete time, we find that the product of two eigenfunctions ϕ₁(x) and ϕ₂(x) is also an eigenfunction, with eigenvalue given by the product of the two eigenvalues.
A Koopman-invariant subspace is spanned by a set of functions {g₁, g₂, . . . , g_p} if every function g in this subspace,

g = α₁g₁ + α₂g₂ + · · · + α_p g_p,  (11.34)

remains in the subspace after being acted on by the Koopman operator:

K g = β₁g₁ + β₂g₂ + · · · + β_p g_p.  (11.35)
Any finite set of eigenfunctions of the Koopman operator will span an invariant subspace. Discovering these eigenfunction coordinates is, therefore, a central challenge, as they provide intrinsic coordinates along which the dynamics behave linearly. In practice, it is more likely that we will identify an approximately invariant subspace, given by a set of functions {g_j}_{j=0}^{p}, where each of the functions g_j is well approximated by a finite sum of eigenfunctions: g_j ≈ Σ_{k=0}^{p} α_k ϕ_k.
Figure 11.7 Visualization of three-dimensional linear Koopman system from (11.37a) along
with projection of dynamics onto the x1 –x2 plane. The attracting slow manifold is shown in
red, the constraint y3 = y12 is shown in blue, and the slow unstable subspace of (11.37a) is
shown in green. Black trajectories of the linear Koopman system in y project onto trajectories
of the full nonlinear system in x in the y1 –y2 plane. Here, µ = −0.05 and λ = 1. Reproduced
from Brunton et al. (2016b).
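The system in Figure 11.7 is compact enough to verify directly; the sketch below assumes the dynamics ẋ₁ = μx₁, ẋ₂ = λ(x₂ − x₁²) of Brunton et al. (2016b), for which the observables y = (x₁, x₂, x₁²) obey a closed linear system:

import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

mu, lam = -0.05, 1.0

# Lifted linear dynamics y' = K y for y = (x1, x2, x1^2):
# y1' = mu*y1, y2' = lam*(y2 - y3), y3' = 2*mu*y3
K = np.array([[mu, 0.0, 0.0],
              [0.0, lam, -lam],
              [0.0, 0.0, 2.0 * mu]])

x0, t_end = np.array([1.0, -1.0]), 10.0
sol = solve_ivp(lambda t, x: [mu * x[0], lam * (x[1] - x[0] ** 2)],
                (0.0, t_end), x0, rtol=1e-10, atol=1e-12)

y_lin = expm(K * t_end) @ np.array([x0[0], x0[1], x0[0] ** 2])
print(sol.y[:2, -1], y_lin[:2])   # nonlinear and lifted-linear states agree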
In this way, a set of intrinsic coordinates may be determined from the observable
functions defined by the left eigenvectors of the Koopman operator on an invariant
subspace. Explicitly,
ϕα (x) = ξα y(x), where ξα K = αξα . (11.39)
These eigen-observables define observable subspaces that remain invariant under the
Koopman operator, even after coordinate transformations. As such, they may be
regarded as intrinsic coordinates (Williams et al. 2015) on the Koopman-invariant
subspace.
12 Methods for System Identification
S. Brunton
Figure 12.1 Input–output system. A control-oriented reduced-order model will capture the
transfer function from u to y.
The POD (Berkooz et al. 1993, Holmes et al. 2012) provides a transform matrix
Ψ, the columns of which are modes that are ordered based on energy content.1 POD
has been widely used to generate ROMs of complex systems, many for control, and it
is guaranteed to provide an optimal low-rank basis to capture the maximal energy or
variance in a data set. However, it may be the case that the most energetic modes are
nearly uncontrollable or unobservable, and therefore may not be relevant for control.
Similarly, in many cases the most controllable and observable state directions may
have very low energy; for example, acoustic modes typically have very low energy,
yet they mediate the dominant input–output dynamics in many fluid systems. The
rudder on a ship provides a good analogy: although it accounts for a small amount of
the total energy, it is dynamically important for control.
Instead of ordering modes based on energy, it is possible to determine a hierarchy of
modes that are most controllable and observable, therefore capturing the most input–
output information. These modes give rise to balanced models, giving equal weighting
to the controllability and observability of a state via a coordinate transformation
that makes the controllability and observability Gramians equal and diagonal. These
models have been extremely successful, although computing a balanced model using
traditional methods is prohibitively expensive for high-dimensional systems. In this
section, we describe the balancing procedure, as well as modern methods for efficient
computation of balanced models. A computationally efficient suite of algorithms for
model reduction and system identification may be found in Belson et al. (2014).
A balanced ROM should map inputs to outputs as faithfully as possible for a given
model order r. It is therefore important to introduce an operator norm to quantify how
similarly (12.1) and (12.2) act on a given set of inputs. Typically, we take the infinity
norm of the difference between the transfer functions G(s) and Gr (s) obtained from
the full system (12.1) and reduced system (12.2), respectively. This norm is given by
‖G‖∞ ≜ max_ω σ₁(G(iω)). (12.4)
1 When the training data consists of velocity fields, for example from a high-dimensional discretized fluid
system, then the singular values literally indicate the kinetic energy content of the associated mode. It is
common to refer to POD modes as being ordered by energy content, even in other applications,
although variance is more technically correct.
Balanced model reduction is based on a change of coordinates

x = Tz, (12.5)

that hierarchically orders the states in z in terms of their ability to capture the input–
output characteristics of the system. We will begin by considering an invertible
transformation T ∈ Rn×n , and then provide a method to compute just the first r
columns, which will comprise the transformation Ψ in (12.2). Thus, it will be possible
to retain only the first r most controllable/observable states, while truncating the
rest. This is similar to the change of variables into eigenvector coordinates, except
that we emphasize controllability and observability rather than characteristics of the
dynamics.
Substituting Tz into (12.1) gives

(d/dt)(Tz) = A T z + B u, (12.6a)
y = C T z + D u. (12.6b)

Multiplying by T⁻¹ yields

(d/dt) z = T⁻¹A T z + T⁻¹B u, (12.7a)
y = C T z + D u, (12.7b)

or, in terms of the transformed matrices,

(d/dt) z = Â z + B̂ u, (12.8a)
y = Ĉ z + D u, (12.8b)

where Â = T⁻¹AT, B̂ = T⁻¹B, and Ĉ = CT. Note that when the columns of T are orthonormal, the change of coordinates becomes

(d/dt) z = T∗A T z + T∗B u, (12.9a)
y = C T z + D u. (12.9b)
Under the coordinate transformation (12.5), the controllability Gramian becomes

Ŵc = ∫₀^∞ e^{Âτ} B̂ B̂∗ e^{Â∗τ} dτ (12.10a)
   = ∫₀^∞ e^{T⁻¹ATτ} T⁻¹B B∗T⁻∗ e^{T∗A∗T⁻∗τ} dτ (12.10b)
   = ∫₀^∞ T⁻¹ e^{Aτ} T T⁻¹B B∗T⁻∗ T∗ e^{A∗τ} T⁻∗ dτ (12.10c)
   = T⁻¹ (∫₀^∞ e^{Aτ} B B∗ e^{A∗τ} dτ) T⁻∗ (12.10d)
   = T⁻¹ Wc T⁻∗. (12.10e)
Note that here we introduce T⁻∗ := (T⁻¹)∗ = (T∗)⁻¹. The observability Gramian transforms similarly:
Ŵo = T∗ Wo T, (12.11)
which is an exercise for the reader. Both Gramians transform as tensors (i.e., in terms
of the transform matrix T and its transpose, rather than T and its inverse), which is
consistent with them inducing an inner product on state-space.
Simple Rescaling
This example, modified from Moore (1981), demonstrates the ability to balance a
system through a change of coordinates. Consider the system
d/dt [x1; x2] = [−1  0; 0  −10] [x1; x2] + [10⁻³; 10³] u, (12.12a)
y = [10³  10⁻³] [x1; x2]. (12.12b)
In this example, the first state x1 is barely controllable, while the second state is barely
observable. However, under the change of coordinates z1 = 103 x1 and z2 = 10−3 x2 ,
the system becomes balanced:
d/dt [z1; z2] = [−1  0; 0  −10] [z1; z2] + [1; 1] u, (12.13a)
y = [1  1] [z1; z2]. (12.13b)
In this example, the coordinate change simply rescales the state x. For instance, it may be that the first state had units of millimeters while the second state had units of kilometers. Writing both states in meters balances the dynamics; that is, the controllability and observability Gramians become equal.
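This rescaling is easy to verify numerically. The sketch below (an illustration for the system matrices above, not code from the text) uses SciPy's Lyapunov solver to compute the Gramians of (12.12) and checks that they agree after the change of coordinates:

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.diag([-1.0, -10.0])
B = np.array([[1e-3], [1e3]])
C = np.array([[1e3, 1e-3]])

# Continuous-time Gramians: A Wc + Wc A^T = -B B^T and A^T Wo + Wo A = -C^T C
Wc = solve_continuous_lyapunov(A, -B @ B.T)
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)

T = np.diag([1e-3, 1e3])               # x = T z, i.e., z1 = 1e3 x1 and z2 = 1e-3 x2
Wc_hat = np.linalg.inv(T) @ Wc @ np.linalg.inv(T).T   # transformed Gramian (12.10e)
Wo_hat = T.T @ Wo @ T                                 # transformed Gramian (12.11)
print(np.allclose(Wc_hat, Wo_hat))     # True: the rescaled system is balanced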
First, consider the product of the Gramians from (12.10) and (12.11), WcWo, and let Tu denote the matrix of its (unit-norm) eigenvectors. Let ξu be the first column of Tu and ηu the first row of Tu⁻¹. Then

ηu Wc ηu∗ = σc, (12.17a)
ξu∗ Wo ξu = σo. (12.17b)

The first element of the diagonalized controllability Gramian is thus σc, while the first element of the diagonalized observability Gramian is σo. If we scale the eigenvector ξu by σs, then the inverse eigenvector ηu is scaled by σs⁻¹. Transforming via the new scaled eigenvectors ξs = σs ξu and ηs = σs⁻¹ ηu yields

ηs Wc ηs∗ = σs⁻² σc, (12.18a)
ξs∗ Wo ξs = σs² σo. (12.18b)
Balancing the two entries requires σs⁻² σc = σs² σo, yielding the scaling

σs = σc^{1/4} σo^{−1/4}. (12.19)

This scaling may be applied to each eigenvector in turn; in practice, eigenvectors are normalized by most computational software so that the columns of Tu have unit norm. Then both Gramians are diagonalized, but are not necessarily equal:
Tu⁻¹ Wc Tu⁻∗ = Σc, (12.20a)
Tu∗ Wo Tu = Σo. (12.20b)
The scaling that exactly balances these Gramians is then given by Σs = Σc^{1/4} Σo^{−1/4}. Thus, the exact balancing transformation is given by

T = Tu Σs. (12.21)
It is possible to directly confirm that this transformation balances the Gramians:

(Tu Σs)⁻¹ Wc (Tu Σs)⁻∗ = Σs⁻¹ Tu⁻¹ Wc Tu⁻∗ Σs⁻¹ = Σs⁻¹ Σc Σs⁻¹ = Σc^{1/2} Σo^{1/2}, (12.22a)
(Tu Σs)∗ Wo (Tu Σs) = Σs Tu∗ Wo Tu Σs = Σs Σo Σs = Σc^{1/2} Σo^{1/2}. (12.22b)
The manipulations rely on the fact that diagonal matrices commute, so that Σ c Σ o =
Σ o Σ c , etc.
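The construction (12.19)–(12.21) translates directly into a few lines of dense linear algebra. A minimal sketch, assuming small matrices and a diagonalizable product WcWo with positive diagonalized Gramian entries (function and variable names are illustrative):

import numpy as np

def balancing_transformation(Wc, Wo):
    # Columns of Tu are the (unit-norm) eigenvectors of Wc Wo
    _, Tu = np.linalg.eig(Wc @ Wo)
    Tu_inv = np.linalg.inv(Tu)
    # Diagonalized, but unequal, Gramians as in (12.20)
    sig_c = np.diag(Tu_inv @ Wc @ Tu_inv.conj().T).real
    sig_o = np.diag(Tu.conj().T @ Wo @ Tu).real
    # Per-mode scaling (12.19) and exact balancing transformation (12.21)
    Sigma_s = np.diag(sig_c**0.25 * sig_o**-0.25)
    return Tu @ Sigma_s

Applied to the Gramians of the rescaling example above, T = balancing_transformation(Wc, Wo) reproduces (12.22): the transformed Gramians T⁻¹WcT⁻∗ and T∗WoT coincide up to roundoff.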
Figure 12.2 Illustration of balancing transformation on Gramians. The reachable set with unit control input is shown in red, given by Wc^{1/2} x for ‖x‖ = 1. The corresponding observable set is shown in blue. Under the balancing transformation T, the Gramians are equal, shown in purple. From Brunton and Kutz (2019), reproduced with permission of the Licensor through PLSclear.
In these balanced coordinates, the states are hierarchically ordered in terms of their joint controllability and observability. It may be possible to truncate these coordinates and keep only the most controllable/observable directions, resulting in a ROM that faithfully captures input–output dynamics.
Given the new coordinates z = T⁻¹x ∈ Rⁿ, it is possible to define a reduced-order state x̃ ∈ Rʳ from the leading entries of z,

z = [z1 ⋯ zr | zr+1 ⋯ zn]ᵀ, with x̃ = [z1 ⋯ zr]ᵀ, (12.23)
in terms of the first r most controllable and observable directions. If we partition the
balancing transformation T and inverse transformation S = T−1 into the first r modes
to be retained and the last n − r modes to be truncated,
T = [Ψ  Tt],    S = [Φ∗; St], (12.24)
then the transformed dynamics may be partitioned as

d/dt [x̃; zt] = [Φ∗AΨ  Φ∗ATt; St AΨ  St ATt] [x̃; zt] + [Φ∗B; St B] u, (12.25a)
y = [CΨ  CTt] [x̃; zt] + D u. (12.25b)
In balanced truncation, the state z t is simply truncated (i.e., discarded and set equal to
zero), and only the x̃ equations remain:
(d/dt) x̃ = Φ∗AΨ x̃ + Φ∗B u, (12.26a)
y = CΨ x̃ + D u. (12.26b)
Only the first r columns of T and S∗ = T−∗ are required to construct Ψ and Φ, and
thus computing the entire balancing transformation T is unnecessary. The computation
of Ψ and Φ without T will be discussed in the following sections. A key benefit of
balanced truncation is the existence of upper and lower bounds on the error of a given
order truncation:
Upper bound: ‖G − Gr‖∞ ≤ 2 ∑_{j=r+1}^{n} σj, (12.27a)
Lower bound: ‖G − Gr‖∞ > σr+1, (12.27b)
where σj is the jth diagonal entry of the balanced Gramians. The diagonal entries of
Σ are also known as Hankel singular values.
Empirical Gramians
In practice, computing Gramians via the Lyapunov equation is computationally
expensive, with computational complexity of O(n3 ). Instead, the Gramians may
be approximated by full-state measurements of the discrete-time direct and adjoint
systems:

x_{k+1} = Ad x_k + Bd u_k, (12.28a)
x_{k+1} = Ad∗ x_k + Cd∗ y_k. (12.28b)
Equation (12.28a) is the discrete-time dynamic update equation, and (12.28b) is the
adjoint equation. The matrices Ad , Bd , and Cd are the discrete-time system matrices.
Note that the adjoint equation is generally nonphysical, and must be simulated;
thus the methods here apply to analytical equations and simulations, but not to
experimental data. An alternative formulation that does not rely on adjoint data, and
therefore generalizes to experiments, will be provided in Section 12.2.
Computing the impulse response of the direct and adjoint systems yields the following discrete-time snapshot matrices:

𝒞d = [Bd  Ad Bd  ⋯  Ad^{mc−1} Bd],    𝒪d = [Cd; Cd Ad; ⋮; Cd Ad^{mo−1}]. (12.29)

The empirical Gramians are then computed as

Wc ≈ Wce = 𝒞d 𝒞d∗, (12.30a)
Wo ≈ Woe = 𝒪d∗ 𝒪d. (12.30b)
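A sketch of how the snapshot matrices (12.29) and empirical Gramians (12.30) might be assembled when the discrete-time matrices are available; in practice the columns of 𝒞d and rows of 𝒪d would come from direct and adjoint simulations rather than explicit matrix powers (names are illustrative):

import numpy as np

def empirical_gramians(Ad, Bd, Cd, mc, mo):
    # Direct impulse response: columns Ad^k Bd form the snapshot matrix (12.29)
    C_snap = np.hstack([np.linalg.matrix_power(Ad, k) @ Bd for k in range(mc)])
    # Adjoint impulse response: rows Cd Ad^k form the snapshot matrix (12.29)
    O_snap = np.vstack([Cd @ np.linalg.matrix_power(Ad, k) for k in range(mo)])
    Wc_e = C_snap @ C_snap.conj().T    # empirical controllability Gramian (12.30a)
    Wo_e = O_snap.conj().T @ O_snap    # empirical observability Gramian (12.30b)
    return C_snap, O_snap, Wc_e, Wo_e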
Balanced POD
Instead of computing the eigendecomposition of WcWo, which is an n × n matrix, it is possible to compute the balancing transformation via the SVD of the product of the snapshot matrices,

𝒪d 𝒞d, (12.31)

reminiscent of the method of snapshots (Sirovich 1987). This is the approach taken by Rowley (2005).
First, define the generalized Hankel matrix as the product of the adjoint (𝒪d) and direct (𝒞d) snapshot matrices from (12.29), for the discrete-time system:

H = 𝒪d 𝒞d = [Cd; Cd Ad; ⋮; Cd Ad^{mo−1}] [Bd  Ad Bd  ⋯  Ad^{mc−1} Bd] (12.32a)

  = [Cd Bd            Cd Ad Bd        ⋯  Cd Ad^{mc−1} Bd
     Cd Ad Bd         Cd Ad² Bd       ⋯  Cd Ad^{mc} Bd
     ⋮                 ⋮                   ⋮
     Cd Ad^{mo−1} Bd  Cd Ad^{mo} Bd   ⋯  Cd Ad^{mc+mo−2} Bd]. (12.32b)
Next, we factor H using the SVD:

H = UΣV∗ = [Ũ  Ut] [Σ̃  0; 0  Σt] [Ṽ∗; Vt∗] ≈ Ũ Σ̃ Ṽ∗. (12.33)
For a given desired model order r ≪ n, only the first r columns of U and V are retained, along with the first r × r block of Σ; the remaining contribution from Ut Σt Vt∗ may be truncated. This yields a bi-orthogonal set of modes given by

Direct modes:  Ψ = 𝒞d Ṽ Σ̃^{−1/2}, (12.34a)
Adjoint modes: Φ = 𝒪d∗ Ũ Σ̃^{−1/2}. (12.34b)
The direct modes Ψ ∈ Rn×r and adjoint modes Φ ∈ Rn×r are bi-orthogonal, Φ∗Ψ = Ir×r, and Rowley (2005) showed that they establish the change of coordinates that balances the truncated empirical Gramians. Thus, Ψ approximates the first r columns of the full n × n balancing transformation T, and Φ∗ approximates the first r rows of the n × n inverse balancing transformation S = T⁻¹.
Now, it is possible to project the original system onto these modes, yielding a balanced ROM of order r:

Ã = Φ∗ Ad Ψ, (12.35a)
B̃ = Φ∗ Bd, (12.35b)
C̃ = Cd Ψ. (12.35c)
It is possible to compute the reduced system dynamics in (12.35a) without having
direct access to Ad . In some cases, Ad may be exceedingly large and unwieldy, and
instead it is only possible to evaluate the action of this matrix on an input vector.
For example, in many modern fluid dynamics codes the matrix Ad is not actually
represented, but because it is sparse, it is possible to implement efficient routines to
multiply this matrix by a vector.
It is important to note that the ROM in (12.35) is formulated in discrete time, as it is based on discrete-time empirical snapshot matrices. However, it is simple to obtain the corresponding continuous-time system. In either case, D is the same in continuous time and discrete time, and in the full-order and reduced-order models.
Note that a BPOD model may not exactly satisfy the upper bound from balanced
truncation (see (12.27)) due to errors in the empirical Gramians.
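The BPOD steps (12.32)–(12.35) amount to one SVD and a handful of matrix products. A minimal sketch, reusing the snapshot matrices from the empirical-Gramian example above (illustrative names, dense arrays):

import numpy as np

def bpod(C_snap, O_snap, Ad, Bd, Cd, r):
    H = O_snap @ C_snap                        # generalized Hankel matrix (12.32)
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    S_isqrt = np.diag(s[:r]**-0.5)
    Psi = C_snap @ Vh[:r].conj().T @ S_isqrt   # direct modes (12.34a)
    Phi = O_snap.conj().T @ U[:, :r] @ S_isqrt # adjoint modes (12.34b)
    # Balanced ROM (12.35); Phi* Psi = I holds by construction
    return Phi.conj().T @ Ad @ Psi, Phi.conj().T @ Bd, Cd @ Psi, Psi, Phi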
Output Projection
Often, in high-dimensional simulations, we assume full-state measurements, so that
p = n is exceedingly large. To avoid computing p = n adjoint simulations, it is possible
instead to solve an output-projected adjoint equation (Rowley 2005), in which the outputs are first projected onto a matrix Ũ containing the first r singular vectors of 𝒞d. Thus, we first identify a low-dimensional POD subspace Ũ from a direct impulse response, and then only perform adjoint impulse response simulations by exciting these few POD coefficient measurements. More generally, if y is high-dimensional but does not measure the full state, it is possible to use a POD subspace trained on the measurements, given by the first r singular vectors Ũ of Cd 𝒞d. Adjoint impulse responses may then be performed in these output POD directions.
Historical Note
The balanced POD method described earlier originated with the seminal work of
Moore (1981), which provided a data-driven generalization of the minimal realization
theory of Ho and Kalman (1965). Until then, minimal realizations were defined in
terms of idealized controllable and observable subspaces, which neglected the subtlety
of degrees of controllability and observability.
Moore’s paper introduced a number of critical concepts that bridged the gap from
theory to reality. First, he established a connection between principal component
analysis (PCA) and Gramians, showing that information about degrees of control-
lability and observability may be mined from data via the SVD. Next, Moore showed
that a balancing transformation exists that makes the Gramians equal, diagonal, and hierarchically ordered by balanced controllability and observability; moreover, he provided an algorithm to compute this transformation. This set the stage for principled model reduction, whereby states may be truncated based on their joint controllability
and observability. Moore further introduced the notion of an empirical Gramian,
although he didn’t use this terminology. He also realized that computing Wc and
Wo directly is less accurate than computing the SVD of the empirical snapshot
matrices from the direct and adjoint systems, and he avoided directly computing the
eigendecomposition of Wc Wo by using these SVD transformations. Lall et al. (2002)
generalized this theory to nonlinear systems.
One drawback of Moore’s approach is that he computed the entire n × n balancing
transformation, which is not suitable for exceedingly high-dimensional systems.
Willcox and Peraire (2002) generalized the method to high-dimensional systems,
introducing a variant based on the rank-r decompositions of Wc and Wo obtained
from the direct and adjoint snapshot matrices. It is then possible to compute the
eigendecomposition of Wc Wo using efficient eigenvalue solvers without ever actually
writing down the full n × n matrices. However, this approach has the drawback of
requiring as many adjoint impulse response simulations as the number of output
equations, which may be exceedingly large for full-state measurements. Rowley
(2005) addressed this issue by introducing the output projection, discussed earlier,
which limits the number of adjoint simulations to the number of relevant POD modes
in the data. He also showed that it is possible to use the eigendecomposition of the product 𝒪d 𝒞d rather than of WcWo; the product 𝒪d 𝒞d is often smaller, and these computations may be more accurate.
It is interesting to note that a nearly equivalent formulation was developed 20
years earlier in the field of system identification. The so-called ERA, introduced by
Juang and Pappa (1985), obtains equivalent balanced models without the need for
adjoint data, making it useful for system identification in experiments. This connection
between ERA and BPOD was established by Ma et al. (2011).
12.2 System Identification

In contrast to model reduction, where the system model (A, B, C, D) was known,
system identification is purely data-driven. System identification may be thought of
as a form of machine learning, where an input–output map of a system is learned
from training data in a representation that generalizes to data that was not in the
training set. There is a vast literature on methods for system identification (Juang
1994, Ljung 1999), and many of the leading methods are based on a form of dynamic
regression that fits models based on data, such as the DMD. For this section, we
consider the ERA and observer-Kalman filter identification (OKID) methods because
of their connection to balanced model reduction (Moore 1981, Rowley 2005, Ma
et al. 2011, Tu et al. 2014) and their successful application in complex systems such
as vibration control of aerospace structures and closed-loop flow control (Bagheri,
Hoepffner, Schmid & Henningson 2009, Bagheri, Brandt & Henningson 2009,
Illingworth et al. 2010). The ERA/OKID procedure is also applicable to multiple-
input, multiple-output (MIMO) systems. Other methods include the autoregressive-
moving average (ARMA) and autoregressive moving average with exogenous inputs
(ARMAX) models (Whittle 1951, Box et al. 2015), the nonlinear autoregressive mov-
ing average with exogenous inputs (NARMAX) (Billings 2013) model, and the SINDy
method.
2 BPOD and ERA models both balance the empirical Gramians and approximate balanced
truncation (Moore 1981) for high-dimensional systems, given a sufficient volume of data.
In ERA, the measured response to an impulsive input in the jth input channel will form the jth column of ykδ. Thus, each of the ykδ is a q × p matrix CAk−1B. Note that the system matrices (A, B, C, D) don't actually need to exist, as the following method is purely data-driven.
The Hankel matrix H from (12.32) is formed by stacking shifted time-series of impulse response measurements into a matrix, as in the Hankel alternative view of Koopman (HAVOK) method (Brunton et al. 2017). As before, H is factored using the SVD:

H = UΣV∗ = [Ũ  Ut] [Σ̃  0; 0  Σt] [Ṽ∗; Vt∗] ≈ Ũ Σ̃ Ṽ∗. (12.41)
The small singular values in Σt are truncated, and only the first r singular values in Σ̃
are retained. The columns of Ũ and Ṽ are eigen-time-delay coordinates.
Until this point, the ERA algorithm closely resembles the BPOD procedure from Section 12.1. However, we do not require direct access to 𝒪d and 𝒞d or the system (A, B, C, D) to construct the direct and adjoint balancing transformations. Instead, with sensor measurements from an impulse response experiment, it is also possible to create a second, shifted Hankel matrix H′:
H′ = [y2δ     y3δ     ⋯  ymc+1δ
      y3δ     y4δ     ⋯  ymc+2δ
      ⋮       ⋮            ⋮
      ymo+1δ  ymo+2δ  ⋯  ymc+moδ] (12.42a)

   = [Cd Ad Bd       Cd Ad² Bd       ⋯  Cd Ad^{mc} Bd
      Cd Ad² Bd      Cd Ad³ Bd       ⋯  Cd Ad^{mc+1} Bd
      ⋮               ⋮                    ⋮
      Cd Ad^{mo} Bd  Cd Ad^{mo+1} Bd ⋯  Cd Ad^{mc+mo−1} Bd] = 𝒪d Ad 𝒞d. (12.42b)
Based on the matrices H and H′, we are able to construct a ROM as follows:

Ã = Σ̃^{−1/2} Ũ∗ H′ Ṽ Σ̃^{−1/2}, (12.43a)
B̃ = Σ̃^{1/2} Ṽ∗ [Ip; 0], (12.43b)
C̃ = [Iq  0] Ũ Σ̃^{1/2}. (12.43c)
Here I p is the p × p identity matrix, which extracts the first p columns, and Iq is the
q×q identity matrix, which extracts the first q rows. Thus, we express the input–output
dynamics in terms of a reduced system with a low-dimensional state x̃ ∈ Rr:

x̃k+1 = Ã x̃k + B̃ uk, (12.44a)
yk = C̃ x̃k + D uk. (12.44b)

If full-state snapshots are also available, direct modes Ψ may be constructed, as in (12.34a). These modes may then be used to approximate the full state of the high-dimensional system from the low-dimensional model in (12.44) by

x ≈ Ψ x̃. (12.46)
If enough data is collected when constructing the Hankel matrix H, then ERA balances the empirical controllability and observability Gramians, 𝒞d 𝒞d∗ and 𝒪d∗ 𝒪d.
However, if less data is collected, so that lightly damped transients do not have time to
decay, then ERA will only approximately balance the system. It is instead possible to
collect just enough data so that the Hankel matrix H reaches numerical full rank (i.e.,
so that remaining singular values are below a threshold tolerance), and compute an
ERA model. The resulting ERA model will typically have a relatively low order, given
by the numerical rank of the controllability and observability subspaces. It may then
be possible to apply exact balanced truncation to this smaller model, as is advocated
in Tu and Rowley (2012) and Luchtenburg and Rowley (2011).
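A compact ERA sketch working from impulse-response data alone; y_delta[k] is assumed to hold the q × p measurement Cd Ad^{k−1} Bd for k ≥ 1, with y_delta[0] = D, and the list must contain at least mc + mo + 1 entries (all names are illustrative):

import numpy as np

def era(y_delta, mc, mo, r):
    # Hankel and shifted Hankel matrices, (12.32) and (12.42), from measurements
    H  = np.block([[y_delta[i + j + 1] for j in range(mc)] for i in range(mo)])
    Hp = np.block([[y_delta[i + j + 2] for j in range(mc)] for i in range(mo)])
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    Ur, Vr = U[:, :r], Vh[:r].conj().T
    S_isq, S_sq = np.diag(s[:r]**-0.5), np.diag(s[:r]**0.5)
    q, p = y_delta[1].shape
    A_r = S_isq @ Ur.conj().T @ Hp @ Vr @ S_isq   # (12.43a)
    B_r = (S_sq @ Vr.conj().T)[:, :p]             # first p columns (12.43b)
    C_r = (Ur @ S_sq)[:q, :]                      # first q rows (12.43c)
    return A_r, B_r, C_r, y_delta[0]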
Figure 12.3 Schematic overview of OKID procedure. The output of OKID is an impulse
response that can be used for system identification via ERA. From Brunton and Kutz (2019),
reproduced with permission of the Licensor through PLSclear.
In practice, it can be difficult to perform isolated impulse response experiments, and a large amount of data must be collected to use ERA. This section poses the general problem of approximating the impulse response from arbitrary input–output data (Figure 12.3). Typically, one would identify ROMs according to the following general procedure:

1. Collect the output in response to a pseudo-random input.
2. Pass this information through the OKID algorithm to obtain the de-noised linear impulse response.
3. Pass the impulse response through the ERA to obtain a reduced-order state-space system.
The output yk in response to a general input signal uk , for zero initial condition
x0 = 0, is given by
y0 = Dd u0, (12.47a)
y1 = Cd Bd u0 + Dd u1, (12.47b)
y2 = Cd Ad Bd u0 + Cd Bd u1 + Dd u2, (12.47c)
⋯
yk = Cd Ad^{k−1} Bd u0 + Cd Ad^{k−2} Bd u1 + ⋯ + Cd Bd uk−1 + Dd uk. (12.47d)
Note that there is no C term in the expression for y0 since there is zero initial condition
x0 = 0. This progression of measurements yk may be further simplified and expressed
in terms of impulse response measurements ykδ :
u0
u1 ··· u m
0 u0 ··· u m−1
ym = y0δ y1δ δ
.. . (12.48)
y0 y1 ··· ··· ym . .. ..
| {z } | {z } .. . . .
S Sδ ··· u0
0 0
| {z }
B
It is often possible to invert the matrix of control inputs, B, to solve for the Markov parameters Sδ. However, B may either be un-invertible, or inversion may be ill-conditioned. In addition, B is large for lightly damped systems, making inversion computationally expensive.
Recall from above that if the system is observable, it is possible to place the poles
of Ad − K f Cd anywhere we like. However, depending on the amount of noise in the
measurements, the magnitude of process noise, and uncertainty in our model, there
are optimal pole locations that are given by the Kalman filter. We may now solve for
the observer Markov parameters S̄δ of the system in (12.50) in terms of measured
inputs and outputs according to the following algorithm from Juang et al. (1991):
1. Choose the number of observer Markov parameters to identify, l.
2. Construct the data matrices:

S = [y0  y1  ⋯  yl  ⋯  ym], (12.51)

V = [u0  u1  ⋯  ul    ⋯  um
     0   v0  ⋯  vl−1  ⋯  vm−1
     ⋮   ⋮   ⋱   ⋮         ⋮
     0   0   ⋯  v0    ⋯  vm−l], (12.52)

where vi = [uiᵀ  yiᵀ]ᵀ.
The matrix V resembles B, except that it has been augmented with the outputs yi. In this way, we are working with a system that is augmented to include a Kalman filter. We are now identifying the observer Markov parameters of the augmented system, S̄δ, using the equation S = S̄δ V. It will be possible to identify these observer Markov parameters from the data and then extract the impulse response (Markov parameters) of the original system.
3. Identify the matrix S̄δ of observer Markov parameters by solving S = S̄δ V for S̄δ using the right pseudo-inverse of V (i.e., SVD).
4. Recover the system Markov parameters, Sδ, from the observer Markov parameters, S̄δ:
(a) Order the observer Markov parameters S̄δ as follows:

S̄0δ = D, (12.53)

S̄kδ = [(S̄δ)k(1)  (S̄δ)k(2)] for k ≥ 1, (12.54)

where (S̄δ)k(1) ∈ Rq×p, (S̄δ)k(2) ∈ Rq×q, and y0δ = S̄0δ = D.
(b) Reconstruct the system Markov parameters from the recursion

ykδ = (S̄δ)k(1) + ∑_{i=1}^{k} (S̄δ)i(2) yk−iδ for k ≥ 1. (12.55)
Thus, the OKID method identifies the Markov parameters of a system augmented with
an asymptotically stable Kalman filter. The system Markov parameters are extracted
from the observer Markov parameters by (12.55). These system Markov parameters
approximate the impulse response of the system, and may be used directly as inputs
to the ERA algorithm.
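The least-squares step and the recovery recursion (12.55) might be sketched as follows; Y and U_in hold the measured outputs and inputs column by column, the layout of V follows (12.52), and the simple pseudo-inverse solve stands in for a more careful SVD-based implementation:

import numpy as np

def okid(Y, U_in, l):
    q, m1 = Y.shape                    # q outputs, m1 = m + 1 samples
    p = U_in.shape[0]                  # p inputs
    v = np.vstack([U_in, Y])
    # Data matrix V from (12.52): inputs on top, then l shifted copies of v = [u; y]
    V = np.vstack([U_in] + [np.hstack([np.zeros((p + q, k + 1)), v[:, :m1 - k - 1]])
                            for k in range(l)])
    Sbar = Y @ np.linalg.pinv(V)       # observer Markov parameters from S = Sbar V
    D = Sbar[:, :p]                    # (12.53)
    Sb1 = [Sbar[:, p + k*(p + q): 2*p + k*(p + q)] for k in range(l)]        # q x p
    Sb2 = [Sbar[:, 2*p + k*(p + q): p + (k + 1)*(p + q)] for k in range(l)]  # q x q
    Yd = [D]                           # system Markov parameters via (12.55)
    for k in range(1, l + 1):
        yk = Sb1[k - 1].copy()
        for i in range(1, k + 1):
            yk += Sb2[i - 1] @ Yd[k - i]
        Yd.append(yk)
    return Yd                          # approximate impulse response, input to ERA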
ERA/OKID has been widely applied across a range of system identification tasks,
including to identify models of aeroelastic structures and fluid dynamic systems.
There are numerous extensions of the ERA/OKID methods. For example, there are
generalizations for linear parameter varying (LPV) systems and systems linearized
about a limit cycle.
12.3 Sparse Identification of Nonlinear Dynamics

The sparse identification of nonlinear dynamics (SINDy) algorithm (Brunton et al. 2016a) identifies nonlinear dynamical systems

d/dt x = f(x) (12.56)

from data, exploiting the observation that many systems of interest have dynamics f with only a few active terms in the space of possible right-hand side functions; for example, the Lorenz equations only have a few linear and quadratic interaction terms per equation.
SINDy approximates f by a generalized linear model in a library of candidate functions,

f(x) ≈ Θ(x) ξ, (12.57)

with the fewest possible nonzero terms in ξ. It is then possible to solve for the relevant terms that are active in the dynamics using sparse regression (Tibshirani 1996, Zou & Hastie 2005, Hastie et al. 2009, James et al. 2013), which penalizes the number of terms in the dynamics and scales well to large problems.
First, time-series data are collected from (12.56) and formed into a data matrix:

X = [x(t1)  x(t2)  ⋯  x(tm)]ᵀ. (12.58)

A similar matrix of sampled time derivatives is formed:

Ẋ = [ẋ(t1)  ẋ(t2)  ⋯  ẋ(tm)]ᵀ. (12.59)

In practice, this may be computed directly from the data in X; for noisy data, the total-variation regularized derivative tends to provide numerically robust derivatives (Chartrand 2011). Alternatively, it is possible to formulate the SINDy algorithm for discrete-time systems xk+1 = F(xk), as in the DMD algorithm, and avoid derivatives entirely.
A library of candidate nonlinear functions Θ(X) may be constructed from the data in X:

Θ(X) = [1  X  X²  ⋯  X^d  ⋯  sin(X)  cos(X)  ⋯]. (12.60)

Here, the matrix X^d denotes a matrix with column vectors given by all possible time-series of dth-degree polynomials in the state x. In general, this library of candidate functions is only limited by one's imagination.
The dynamical system in (12.56) may now be represented in terms of the data matrices in (12.59) and (12.60) as

Ẋ = Θ(X) Ξ. (12.61)
Each column ξk in Ξ is a vector of coefficients determining the active terms in the kth
row in (12.56). A parsimonious model will provide an accurate model fit in (12.61)
with as few terms as possible in Ξ. Such a model may be identified using a convex
`1 -regularized sparse regression:
ξk = argmin_{ξ′k} ‖Ẋk − Θ(X) ξ′k‖₂ + λ ‖ξ′k‖₁, (12.62)

where Ẋk denotes the kth column of Ẋ.
Once the active terms have been identified, a model of each row of the governing equations may be constructed as ẋk = Θ(x) ξk. Note that xk is the kth element of x, and Θ(x) is a row vector of symbolic functions of x, as opposed to the data matrix Θ(X). Figure 12.4 shows how SINDy may be used to discover the Lorenz equations from data.
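In practice the ℓ1 problem (12.62) is often replaced by a sequentially thresholded least-squares iteration, which is the variant sketched below; the threshold lam, the iteration count, and the simple polynomial library are user choices, not prescriptions from the text:

import numpy as np

def sindy(Theta, Xdot, lam=0.1, n_iter=10):
    # Theta: m x p library matrix Theta(X); Xdot: m x n matrix of derivatives
    Xi = np.linalg.lstsq(Theta, Xdot, rcond=None)[0]   # initial least-squares fit
    for _ in range(n_iter):
        small = np.abs(Xi) < lam                       # threshold small coefficients
        Xi[small] = 0.0
        for k in range(Xdot.shape[1]):                 # refit surviving terms per state
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], Xdot[:, k], rcond=None)[0]
    return Xi

def poly_library(X):
    # Candidate functions up to second degree, in the spirit of (12.60)
    m, n = X.shape
    cols = [np.ones((m, 1)), X]
    cols += [(X[:, i] * X[:, j])[:, None] for i in range(n) for j in range(i, n)]
    return np.hstack(cols)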
The result of the SINDy regression is a parsimonious model that includes only the
most important terms required to explain the observed behavior. The sparse regression
procedure used to identify the most parsimonious nonlinear model is a convex
procedure. The alternative approach, which involves regression onto every possible
sparse nonlinear structure, constitutes an intractable brute-force search through the
combinatorially many candidate model forms. SINDy bypasses this combinatorial
search with modern convex optimization and machine learning. It is interesting to
note that for discrete-time dynamics, if Θ(X) consists only of linear terms, and if we
remove the sparsity promoting term by setting λ = 0, then this algorithm reduces
to the DMD (Rowley et al. 2009, Schmid 2010, Tu et al. 2014, Kutz et al. 2016). If a
least-squares regression is used, as in DMD, then even a small amount of measurement
error or numerical roundoff will lead to every term in the library being active in the
dynamics, which is nonphysical. A major benefit of the SINDy architecture is the
ability to identify parsimonious models that contain only the required nonlinear terms,
resulting in interpretable models that avoid overfitting.
Figure 12.5 Schematic overview of nonlinear model identification from high-dimensional data
using the sparse identification of nonlinear dynamics (SINDy) (Brunton et al. 2016a). This
procedure is modular, so that different techniques can be used for the feature extraction and
regression steps. In this example of flow past a cylinder, SINDy discovers the model of Noack
et al. (2003). Reproduced from Brunton et al. (2016a).
SINDy has been applied to identify models of fluid flows (Loiseau & Brunton 2018, Loiseau et al. 2018), and recent studies have also leveraged it for turbulence modeling (Beetham & Capecelatro 2020, Schmelzer et al. 2020).
Figure 12.5 illustrates the application of SINDy to the flow past a cylinder, where
the generalized mean-field model of Noack et al. (2003) was discovered from data.
SINDy has also been applied to identify models in nonlinear optics (Sorokina
et al. 2016), plasma physics (Dam et al. 2017), chemical reaction dynamics (Hoffmann
et al. 2019), numerical algorithms (Thaler et al. 2019), and structural modeling (Lai &
Nagarajaiah 2019), among others (Narasingam & Kwon 2018, de Silva et al. 2019, Pan
et al. 2020).
Because SINDy is formulated in terms of linear regression in a nonlinear library, it
is highly extensible. The SINDy framework has been recently generalized by Loiseau
and Brunton (Loiseau & Brunton 2018) to incorporate known physical constraints
and symmetries in the equations by implementing a constrained sequentially thresh-
olded least-squares optimization. In particular, energy-preserving constraints on the
quadratic nonlinearities in the Navier–Stokes equations were imposed to identify
fluid systems (Loiseau & Brunton 2018), where it is known that these constraints
promote stability (Majda & Harlim 2012, Balajewicz et al. 2013, Carlberg et al. 2017).
This work also showed that polynomial libraries are particularly useful for building
models of fluid flows in terms of POD coefficients, yielding interpretable models
that are related to classical Galerkin projection (Brunton et al. 2016a, Loiseau &
Brunton 2018). Loiseau et al. (2018) also demonstrated the ability of SINDy to
identify dynamical systems models of high-dimensional systems, such as fluid flows,
from a few physical sensor measurements, such as lift and drag measurements on
the cylinder in Figure 12.5. For actuated systems, SINDy has been generalized to
include inputs and control (Brunton et al. 2016b), and these models are highly effective
for model predictive control (Kaiser et al. 2018). It is also possible to extend the
SINDy algorithm to identify dynamics with rational function nonlinearities (Mangan et al. 2016) and integral terms (Schaeffer & McCalla 2017), and to identify dynamics from highly corrupt and incomplete data (Tran & Ward 2016). SINDy was also recently extended to
incorporate information criteria for objective model selection (Mangan et al. 2017),
and to identify models with hidden variables using delay coordinates (Brunton
et al. 2017). Finally, the SINDy framework was generalized to include partial
derivatives, enabling the identification of partial differential equation models (Rudy
et al. 2017, Schaeffer 2017).
13 Modern Tools for the Stability
Analysis of Fluid Flows
P. J. Schmid
13.1 Introduction
Stability analysis is a key discipline in fluid dynamics and is ubiquitous in the fluid
dynamics literature, either as a way of analyzing complex fluid behavior or as a means
to utilize it to manipulate intrinsic fluid motion. Instabilities are often postulated as the
driving force for pattern formation, for the rise of specific scales, or for the bifurcation
into a different flow regime.
Instabilities can be observed all around us. The flow patterns forming behind bluff objects, the breakup of water jets and drops, the clustering of stars in rotating galaxies, the formation of thermal plumes, the shaping of glaciers and stalactites/stalagmites in caves, and the sand ripples in river beds and desert dunes are among the applications where stability analysis of the governing equations has contributed to our understanding of the dominant physical processes at play (Figure 13.1).
Figure 13.1 Examples of flow instabilities: buoyancy-driven, vortical, interfacial, rotational, shear-driven, and stratification-driven.
Instabilities can also be detrimental: instabilities in the inlet of a jet engine lead to non-smooth operation, while acoustic instabilities lead to premature material fatigue; and strong magnetic instabilities are one of the key reasons why nuclear fusion has not yet matured into a reliable technology.
ogy. In natural settings, instabilities in the atmospheric boundary layer are responsible
for weather abnormalities, and in lakes, rivers, and oceans they influence nutrient
transport.
The underlying mechanisms for instabilities can be rather multifaceted. Yet, it is
common to classify instabilities by their principal physical mechanism responsible for
the growth (or decay) of disturbances: buoyancy-driven, shock-induced, rotational,
magnetic, shear-driven, morphological, vortical, interfacial, thermal, reactive, acous-
tical, chemical, and so on. Instabilities have been studied intensively over the past
decades. Along with this categorization, each effect is characterized and parameterized
by a nondimensional number, such as the Reynolds number, Rayleigh number, Rossby
number, Weber number, Mach number, Damkoehler number, and so on. These num-
bers quantify the relative importance of the involved processes and act as bifurcation
parameters that, after passing a critical value, establish the presence of an instability.
Besides the key physical processes and their nondimensional number, the size of the
disturbance background or of an initial condition, necessary to induce an instability, is
important. Infinitesimal disturbances describe the early departure from an established
equilibrium state. They have the added mathematical advantage that linearization is
easily justified. The same is not true for finite-amplitude disturbances whose analysis
requires more sophisticated mathematical techniques. In what follows, we will outline
techniques for a general stability analysis, but will focus on common techniques for a
linear analysis.
shear flows. The incorporation of nonlinear saturation effects has further extended the
range of applicability of stability results to finite-amplitude perturbations.
The advent of computers and numerical algorithms has also greatly influenced hydrodynamic stability theory, and it did not take long before the central equations of stability theory were solved numerically to high precision and for increasingly complex flows, such as high-speed, compressible flows.
At the same time, transition to turbulence and hydrodynamic stability have been
linked in an effort to predict the onset of turbulent fluid motion using stability
calculations. Secondary instability theory, a two-stage stability concept, has been
introduced and proposed as the route of many flows to a highly disordered state
(Orszag & Patera 1983, Herbert 1988, Koch et al. 2000).
A new development emerged in the 1990s whereby the original Lyapunov stability concept was called into question, as it does not contain any notion of a time horizon. In this vein, eigenvalues have been increasingly superseded by more abstract
operator concepts (Butler & Farrell 1992, Reddy & Henningson 1993, Trefethen
et al. 1993, Schmid & Henningson 2001, Schmid 2007). This non-eigenvalue-
based stability theory has succeeded in uncovering and explaining a great deal of
experimentally observed phenomena and is often a key component of instability and
amplification processes.
The past decade has continued to produce a great many tools related to stability
theory: direct numerical simulations produce high-fidelity fluid solutions and give
unprecedented insight into all aspects of transport processes and instabilities; the
parabolized stability equations (PSE) provide a powerful and efficient tool for the
calculation of stability characteristics in configurations that go far beyond the early
simple geometries (Bertolotti et al. 1992). The stability of flow can also be deter-
mined globally using high-performance computers and iterative eigenvalue algorithms
(Theofilis 2011). Following these techniques, stability properties have been computed
for such complex flows as airfoil sections, turbomachinery stages, or wing-tip vortices.
Moreover, variational techniques (Hill 1995), which recast stability concepts into the
form of an optimization problem, have resulted in novel and powerful tools that extract
stability information (even in the nonlinear case) directly from simulation software. It
is these latter optimization techniques – in a rudimentary form – that will be the focus
of this chapter.
Figure 13.2 Stability concepts, illustrating (a) Lyapunov stability, (b) asymptotic stability, and
(c) exponential stability.
Figure 13.3 Bifurcation behavior, showing disturbance amplitude A versus parameter p: (a) supercritical and (b) subcritical, with critical parameter value pcrit.
Measures from an asymptotic analysis (growth rates, frequencies, modal shapes) may misrepresent the dynamic features of a flow that is characterized and dominated by fluid processes on finite timescales.
Critical Parameters
The equations governing the temporal evolution of the state variable, denoted
symbolically by the function f, commonly contain a variety of physical parameters
often expressed as nondimensional numbers (such as the Reynolds number, Rayleigh
number, Mach number, etc.) or parameters linked to the shape of the disturbances
(such as their wavenumbers in homogeneous coordinate directions). The stability
properties thus depend on these parameters as well, and a specific perturbation may
be stable at one parameter setting, but become unstable at another. The parameter
value(s) at which this transition from stability to instability occurs are known as
critical parameters. Different critical parameters can be computed, depending on the
chosen definition of stability; for example, the critical Reynolds number for monotonic
stability of plane channel flow is Rec = 49.6, while the critical Reynolds number
for asymptotic stability of the same flow is Rec = 5772.2, where the Reynolds
number is based on the centerline base velocity and the half-channel height (Joseph &
Carmi 1969, Orszag 1971).
Bifurcation Behavior
Once we surpass the critical parameter, an instability ensues and the disturbance
grows in amplitude/energy until nonlinearities saturate the growth and establish a new,
nonlinear equilibrium state of finite amplitude. Finite-amplitude states, however, can
also exist (due to a conditional stability; see the earlier definition) at parameter values
below the critical one.
The existence of these finite-amplitude states determines the bifurcation behavior
of the flow (see also Chapter 10 for a discussion of bifurcations in dynamical
systems). Two cases have to be distinguished: supercritical and subcritical bifurcation
behavior. In the supercritical case, finite-amplitude states exist only past the point
(in parameter space) where the equilibrium state has gone asymptotically unstable to
infinitesimal perturbations. This situation is illustrated in Figure 13.3(a). Infinitesimal
perturbations below pcrit are asymptotically stable; for p > pcrit, finite-amplitude states
exist owing to an asymptotic instability of the infinitesimal state (dashed line). In
the subcritical case (see Figure 13.3(b)), we have a parameter regime where finite-
amplitude states coexist with asymptotically stable infinitesimal states. In this regime,
the infinitesimal state is conditionally stable: for an initial energy below a critical
value, the perturbation returns to the infinitesimal state (open circle); while for an
initial energy surpassing a threshold value (indicated by the dashed blue curve),
a higher-energy state is approached at the same parameter value p (closed circle).
After passing pcrit, this threshold value is zero, indicating that infinitesimal energy is
necessary to approach the higher-energy state via an asymptotic instability.
Examples of supercritical bifurcation behavior in fluid systems include Rayleigh–Bénard convection and Taylor–Couette flow (flow between two differentially rotating coaxial cylinders) within certain parameter regimes, while wall-bounded shear flows such as plane Poiseuille, plane Couette, or pipe flow are governed by subcritical bifurcations.
Most treatises on hydrodynamic stability theory start with many simplifying assump-
tions about the flow (steady, parallel, unidirectional, incompressible base flow of
Newtonian fluid), guiding the reader toward a set of mathematical techniques for the
solution of the resulting stability equations. Only in a second step are some of the
initial assumptions relaxed or eliminated, leading to additional mathematical compli-
cations. This simple-to-complex route has also been the historic route: simplifications
Figure 13.4 Stability formalisms for a temporally evolving perturbation (top), a spatially
evolving disturbance (middle), and a general spatio-temporal evolution from a point source
(bottom).
have been necessary, since the mathematical tools have been restricted to analytical, approximate, or asymptotic methodology, such as, for example, perturbation and asymptotic methods. With the advent of computational and data-driven techniques, we are now in a position to address directly – with few imposed restrictions and forced assumptions – the hydrodynamic stability behavior of complex flows. The analysis of unsteady, separated, and multiscale flows is now within our reach, as is the nonlinear evolution of disturbances in these configurations.
For this reason, we will advocate a reverse route of exposition: starting from the
general, unrestricted stability problem to the more confined, but perhaps more familiar
setup. We will describe a rather general computational framework for hydrodynamic
stability problems based on the optimization of a cost functional subject to constraints,
among them our nonlinear partial differential equation governing the fluid motion.
This formalism is very flexible in describing and quantifying the perturbation dynam-
ics under minimal restrictions and limitations. After having established this general
framework, we will then make progressive assumptions – either about the base flow
or the perturbation dynamics – to find reduced descriptions of the general formulation.
This will naturally introduce familiar techniques for the analysis of a fluid system’s
stability characteristics.
The general framework for hydrodynamic stability analysis relies on a math-
ematical formalism that uses calculus of variations, optimization techniques, and
concepts from linear algebra. A basic knowledge of these disciplines is helpful
in understanding the derivations later; it can be gained from excellent textbooks,
monographs, and review articles, such as Kot (2015) and Cassel (2013) for calculus of
variations, Gunzburger (2003) and Magri (2019) for adjoint techniques, Kochenderfer
and Wheeler (2019) and Nocedal and Wright (2006) for optimization, and Trefethen
and Bau (1997) for linear algebra, among many other resources.
We enforce the governing equation as a constraint by augmenting the cost functional J, forming the Lagrangian

L(q, q†, f) = J(q, f) − ⟨q†, (d/dt) q − N(q, f)⟩. (13.4)
This mathematical step required the introduction of Lagrange multipliers q† and the choice of a scalar product. It follows from expression (13.4) (more specifically, from the term with the scalar product) that the Lagrange multipliers q† have the same dimensionality as q, that is, q† ∈ Cⁿq. We thus realize that the Lagrange multipliers represent time-dependent "flow fields." At this stage, however, we do not yet have an equation governing their temporal evolution. Proceeding, we define a scalar product as

⟨a, b⟩ ≡ ∫₀^τ aᴴ b dt. (13.5)
The superscript H stands for the transpose conjugate operation. In the previous
definition, we have introduced the time horizon τ, which has to be user-specified and
adapted to the relevant timescales of the flow under consideration. In what follows,
it will prove mathematically advantageous to recast the cost functional J in terms of
the scalar product (13.5). We write

J(q, f) = ⟨w, j(q, f)⟩, (13.6)

where we included a weight function w(t) that is responsible for the enforcement of
j within the time interval [0, τ]. As an example, specifying a cost functional j over
the entire time horizon, we choose w = 1; for focusing only on the value of j at the
end of the time horizon, we choose w = δ(t − τ).
With the augmented Lagrangian L fully specified, we seek an optimum of L
with respect to all independent variables, that is, q, q†, f. This mathematical problem
requires calculus of variations: we determine three time-dependent functions (q, q†, f)
that optimize a scalar value (our augmented Lagrangian L). Analogous to computing
an optimum for a function of three independent variables f (x, y, z) by setting the first-
order partial derivatives with respect to the three variables to zero, fx = fy = fz = 0,
we have to take the first-order variations of the scalar L (see (13.4)) with respect
to the three functions q, q†, f and set them to zero simultaneously (see, e.g., Cassel
(2013) or Kot (2015)). The resulting three conditions are referred to as the Karush–
Kuhn–Tucker (KKT) system. Formally, we have
δL/δq† = 0,    δL/δq = 0,    δL/δf = 0. (13.7)
The first condition, δL/δq† = 0, yields

⟨δq†, (d/dt) q − N(q, f)⟩ = 0, (13.8)

a condition that has to hold true for all variations δq†. This expression thus recovers our original governing equation for q.
The second condition is slightly more involved, as the state variable q appears in the governing equation as well as in the cost functional. We have

⟨(∂j/∂q)ᴴ w, δq⟩ − ⟨q†, (d/dt) δq − (∂N/∂q) δq⟩ = ⟨(∂j/∂q)ᴴ w + (d/dt) q† + (∂N/∂q)ᴴ q†, δq⟩ = 0, (13.9)

where integration by parts (in time) is required to transfer the d/dt-operator onto the adjoint variable q†. After isolating the first variation δq in the scalar products, we can extract an evolution equation for the adjoint variable q†.
The third condition reads

⟨(∂j/∂f)ᴴ w + (∂N/∂f)ᴴ q†, δf⟩ = 0, (13.10)

from which we can extract an algebraic equation for the cost functional gradient with respect to the control vector f.
In summary, we obtain from the three optimality conditions the following three
equations:
δL/δq† = 0 :   (d/dt) q = N(q, f), (13.11a)
δL/δq = 0 :   −(d/dt) q† = (∂N/∂q)ᴴ q† + (∂j/∂q)ᴴ w, (13.11b)
δL/δf = 0 :   (∂j/∂f)ᴴ w = −(∂N/∂f)ᴴ q†. (13.11c)
By design, this set of equations provides an extremum (q, f) of the chosen cost
functional J . Analogously to finding the extremum of a multivariate function, the
system (13.11) has to be solved simultaneously. In practice, however, we approach
the solution in an iterative manner. We solve the two evolution equations (13.11a,b)
exactly, while we use the algebraic optimality condition (13.11c) iteratively to
converge toward an optimal control variable f. The above optimization problem is the
basis for our analysis of fluid problems as to their stability, receptivity, and sensitivity
to internal or external changes. At the same time, the optimization framework (with
only minor modifications) can also be used to compute optimal flow control strategies
or optimal designs of fluid devices.
Figure 13.5 Schematic of adjoint looping. An initial guess f (0) for the control variable is used
to solve the governing equations over a time horizon t ∈ [0, τ]. This is followed by a solution
of the adjoint equations backward in time (from t = τ to t = 0), accounting for the driving by
the forward solution q. The adjoint solution q† is then substituted into the cost functional
gradient ∇f j, which in turn is passed to an optimization algorithm opt. A new and improved
control variable f is determined, and another iteration starts. The iterations are continued until
convergence is reached.
This loop is continued until a specified convergence criterion is satisfied, or the computational resources are exhausted. Figure 13.5 gives a schematic representation of this iterative process, often referred to as adjoint looping.
A few observations about the full system (13.11) and the iterative scheme are worth
pointing out.
(i) While the direct problem (13.11a) may be nonlinear, the adjoint equation (13.11b)
is always linear in q† . This fact results from the linear appearance of q† in the
augmented Lagrangian. The linearity of the adjoint equation can be exploited in
parallel-in-time algorithms for the adjoint part of the loop (Skene et al. 2020);
increased performance and computational efficiency can be gained in this manner.
(ii) Expression ∂N/∂q is recognized as the Jacobian of the governing equation (13.3),
and constitutes a matrix of size nq × nq . For nonlinear N, the Jacobian appears in
complex conjugate form in the adjoint equation and is evaluated at the time-local flow
field q. Details on how to compute this Jacobian – or more specifically its action on a
given state vector – are given below.
(iii) The system matrix (∂N/∂q)ᴴ for the adjoint equation (13.11b) does not depend on the cost functional J. The influence of J in the adjoint equation only stems from the external forcing term (∂j/∂q)ᴴ w. Different cost functionals produce
different adjoints via this forcing term, even though the inherent system dynamics is
independent of our choice of cost functional.
(iv) Given a nonlinear N or a non-convex cost functional J, any optimization scheme
based on gradient information can only reach a local extremum. No guarantee can
be given for convergence toward a global optimum. The type of optimum (minimum
or maximum) can be determined by monitoring the cost functional; a sign change in
applying the gradient-based update may be necessary.
Despite some obvious limitations, the general optimization formalism is versatile
and flexible, and can be brought to bear on a wide range of fluid problems.
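To make the loop of Figure 13.5 concrete, the sketch below implements direct-adjoint looping for the particular cost functional j = ‖q(τ)‖², optimized over unit-norm initial conditions, with explicit Euler time stepping chosen purely for clarity. The adjoint terminal condition q†(τ) = 2q(τ) follows from this specific choice of j; all function names are illustrative:

import numpy as np

def adjoint_loop(N, dNdq, f0, tau, nt=2000, n_iter=50, step=0.5):
    # Maximize ||q(tau)||^2 over unit-norm initial conditions f = q(0).
    # N(q): right-hand side; dNdq(q): its Jacobian along the trajectory.
    dt = tau / nt
    f = f0 / np.linalg.norm(f0)
    for _ in range(n_iter):
        traj = [f.copy()]                    # forward (direct) solve, storing q(t)
        for _ in range(nt):
            traj.append(traj[-1] + dt * N(traj[-1]))
        qa = 2.0 * traj[-1]                  # adjoint solve, backward from t = tau
        for k in range(nt, 0, -1):           # -dqa/dt = dNdq(q)^T qa
            qa = qa + dt * dNdq(traj[k - 1]).T @ qa
        f = f + step * qa                    # ascent step: the gradient is qa(0)
        f /= np.linalg.norm(f)               # re-normalize onto the unit sphere
    return f, float(np.linalg.norm(traj[-1])**2)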
As a special case, consider linear time-invariant (LTI) dynamics and take the gain of an initial condition as the quantity of interest. The norms measuring the state and the control vector do not necessarily have to coincide; for this reason, we choose two different weight matrices in the respective norms. Furthermore, we evaluate the gain at the end of the time horizon, at t = τ, which coincides with more traditional notions of (in)stability. This choice of evaluation leads to w = δ(τ − t) in (13.6). We have

j(q, f) = ‖q‖Q² / ‖f‖R² = (qᴴ Q q) / (fᴴ R f), (13.14)
with R denoting the weight matrix for the control variable, which in our case is taken
as the initial condition, that is, f = q(t = 0). Consequently, the evolution operator
takes on the form N(q, f) = Lq + fδ(t).
With this setup, we can reconsider the general iterative optimization framework and
use the linearity of our governing equations to make simplifications. The first step in
the direct-adjoint loop consists of solving the governing equations (13.11a) over the
time interval [0, τ]. The solution of this problem can formally be written as
q(τ) = exp(Lτ) f, (13.15)
introducing the matrix exponential that transforms the initial condition f into the
output perturbation q(τ). The second step, based on equation (13.11b), simplifies for
our LTI case¹ to

−(d/dt) q† = Lᴴ q† + (2Q/‖f‖R²) w q, (13.16)

which again can be solved formally according to

q†(0) = exp(Lᴴ τ) (2Q/‖f‖R²) q(τ). (13.17)
The third and final step uses the cost functional gradient

(∂j/∂f)ᴴ w = −δ(t) q†, (13.18)

where the right-hand side term, δ(t), is equivalent to the gradient ∂N/∂f for our special case.
At this stage, we would use the aforementioned gradient in a user-specified
optimization routine (such as conjugate gradient, for example) to arrive at an
improved initial condition f. In this special case, however, we further simplify the
direct-adjoint loop (Figure 13.5) by choosing a steepest-descent procedure as our
gradient-based optimization routine. This results in an explicit expression for the new
initial condition. We have

(∂j/∂f)ᴴ w = −(‖q‖Q² / ‖f‖R⁴) 2R f w. (13.19)
Combining (13.15), (13.17), (13.18), and (13.19), we obtain

R⁻¹ exp(Lᴴτ) Q exp(Lτ) f = (‖q(τ)‖Q² / ‖f‖R²) f. (13.20)

¹ Recall that the derivative of the quadratic form xᴴAx with respect to x is 2Ax for symmetric A.
This expression can be written compactly as

[exp(L̄τ)]ᴴ exp(L̄τ) f̄ = σ² f̄, (13.21)

describing a full iteration of the direct-adjoint loop. We have introduced the energy gain σ² = ‖q(τ)‖Q² / ‖f‖R², together with the transformed matrix L̄ = FLG⁻¹ and the transformed initial condition f̄ = Gf. The weight matrices Q = FᴴF and R = GᴴG have been decomposed into their Cholesky factors F and G.
Equation (13.21) is an eigenvalue problem for the energy gain σ 2 . It then follows
that the largest eigenvalue produces the optimal energy growth, and the associated
eigenvector f̄ gives the optimal initial condition f. The two matrix exponentials
in (13.21) are conjugate transpose to each other; the product yields real gains σ 2 .
This configuration also suggests that the eigenvalue problem can be transformed to
a singular value problem for exp(L̄τ): the largest singular value (i.e., the L2 -norm
of the matrix exponential) produces σ, the principal right singular vector corresponds to f, and the principal left singular vector gives the corresponding optimal output q(τ). This latter relation is at the core of nonmodal stability theory: the norm of
the matrix exponential measures the maximum transient energy amplification. The
above demonstrates that it can now be thought of as a special case of the direct-adjoint
looping procedure for LTI systems. The iterative optimization procedure is equivalent
to a power iteration applied to the composite matrix in (13.21), and the optimization
problem can be solved by straightforward linear algebra operations.
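For a small LTI system, the optimal gain and initial condition can therefore be read off the SVD of the matrix exponential directly. A brief sketch with an illustrative non-normal matrix and unit weights Q = R = I:

import numpy as np
from scipy.linalg import expm

L = np.array([[-0.01, 1.0],
              [ 0.0, -0.02]])           # stable but non-normal: transient growth

taus = np.linspace(0.1, 100.0, 200)
gains = [np.linalg.norm(expm(L * t), 2)**2 for t in taus]   # sigma^2 over horizons

U, s, Vh = np.linalg.svd(expm(L * 50.0))   # fixed horizon tau = 50
f_opt = Vh[0]                              # optimal initial condition (right s.v.)
q_out = U[:, 0]                            # optimal output direction (left s.v.)
print(max(gains), s[0]**2)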
A related analysis considers the response to time-harmonic forcing, f exp(iωt), measured by the gain

j(q, f) = ‖qw‖Q² / ‖f‖R², (13.22)
where qw denotes the spatial shape of the forced response q = qw exp(iωt). We note
that j(q, f) is independent of time, and thus we do not need to specify w, but still
assume it as a constant. The above analysis is linked to transfer functions, and the
reader is urged to compare with related material in Chapters 4 and 9.
The direct problem (13.11a) reduces to

q(t) = ∫₀ᵗ exp(L(t − t′)) f exp(iωt′) dt′ (13.23a)
     = (iω − L)⁻¹ [exp(iωt) − exp(Lt)] f. (13.23b)
We see that equation (13.23a) represents a convolution of the harmonic forcing with
the input response of the linear system (see also Chapter 5 for additional material).
Before moving forward, we stress that the above solution assumes an asymptotically
stable system, with all eigenvalues of L contained in the stable half-plane. In essence,
we only consider the long-term response of the system and ignore transient processes
of establishing this long-term response. As a consequence, the term in (13.23b)
containing the matrix exponential can be neglected. We have
q = (iω − L)−1 f exp(iωt) = qw exp(iωt). (13.24)
Reformulating the adjoint equation in (13.11b) for our special case, we obtain

−(d/dt) q† = Lᴴ q† + (2Q/‖f‖R²) qw exp(iωt) w, (13.25)

which has the formal solution

q† = (−iω − Lᴴ)⁻¹ (2Q/‖f‖R²) qw exp(iωt) w. (13.26)
From the optimality condition (13.19) and the expression for q† we get

−(‖qw‖Q² / ‖f‖R⁴) 2R f w = (∂N/∂f)ᴴ q† = exp(−iωt) q†. (13.27)

Combining with (13.24) and (13.26) yields

R⁻¹ (−iω − Lᴴ)⁻¹ Q (iω − L)⁻¹ f = σ² f, (13.28)

with σ² as the sought-after response gain, defined as ‖qw‖Q² / ‖f‖R². Using the Cholesky factorization of the weights Q and R (see above), we can write the above expression more compactly as

[(iω − L̄)⁻¹]ᴴ (iω − L̄)⁻¹ f̄ = σ² f̄. (13.29)
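Again assuming a small system with unit weights, the harmonic gain in (13.29) is simply the squared largest singular value of the resolvent, swept over frequency; a short illustrative sketch:

import numpy as np

L = np.array([[-0.01, 1.0],
              [ 0.0, -0.02]])
I = np.eye(2)

omegas = np.linspace(-2.0, 2.0, 401)
gains = [np.linalg.svd(np.linalg.inv(1j*om*I - L), compute_uv=False)[0]**2
         for om in omegas]               # response gain sigma^2(omega), cf. (13.29)
print(max(gains))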
The optimal solution is then given by a singular value decomposition that results in the optimal gain (largest singular value), the optimal input (principal right singular vector), and the associated output (principal left singular vector); see Chapter 6 for additional material. The connection between optimization and linear algebra seems rather attractive, but does come with its limitations, particularly when we consider more complex flow configurations.
The foremost limitation is the restriction to linear and time-invariant systems.
For nonlinear or nonautonomous linear systems, we have to revert to the iterative
optimization techniques. Another limitation stems from the definition of the cost
functional as a gain or a quadratic form. The stability/receptivity analysis of some fluid
problems requires “exotic” norms that cannot be reduced to L²-norms. In many other
circumstances, we may wish to augment the cost functional by terms and norms that
enforce additional constraints, for example, spatial sparsity. In other cases, the norm
may not include all components of the state vector, resulting in the so-called semi-
norm problem and requiring side constraints to ensure convergence of the iterative
optimization scheme.
The most obvious advantage of the optimization approach lies in its versatility to
adapt to/accommodate linear as well as nonlinear governing equations, to L²-norms
as well as other, more exotic norms, to gains as well as to more general functionals,
and to additional side constraints.
High Dimensionality
Whether using the iterative direct-adjoint looping or the special formulation as a
linear-algebra problem, every effort should be made to reduce the number of degrees of
freedom and thus ensure a swift convergence to a solution. The most obvious reduction
for linear problems is the use of transforms in the homogeneous directions, which is
equivalent to a separation-of-variables approach. If our linear governing equations
are of constant-coefficient type in one (or more) of the spatial coordinates, we can
employ a Fourier transform in these directions and introduce an associated wavenumber.
We then have to solve the iterative or linear-algebra system for each wavenumber
(or wavenumber tuple) – a far more efficient undertaking than solving the global
governing equations. Other techniques to reduce the number of degrees of freedom
involve similarity transformations or the exploitation of symmetries.
The direct-adjoint optimization requires the conjugate-transposed operators

(∂N/∂q)^H, (∂j/∂q)^H. (13.31)

As an example, consider a governing equation of the form

dq/dt = N(q) = N₁₂-composite form: dq/dt = N(q) = N₂(D_x q, D_y N₁(D_x q)), (13.32)
for which we determine the adjoint evolution matrix for the second leg of the direct-
adjoint iterative optimization. The functions N1 (q) and N2 (q, q) are assumed nonlinear
in their argument(s), but acting locally on the grid points. We next break the composite
right-hand side of (13.32) into procedural steps (similar to the modules/subroutines of
a computer program), introducing auxiliary variables for each step as needed. We
obtain
q0 = q, (13.33a)
q1 = N1 (Dx q0 ), (13.33b)
q2 = N2 (Dx q0, Dy q1 ), (13.33c)
dq/dt = q₂. (13.33d)
The traversal from a given flow field q to its final time rate of change dq/dt is given
by a directed, acyclic graph (DAG) connecting the various modules. For simplicity,
we assume that numerical differentiation (indicated by multiplication by Dx or Dy ) is
a linear operation.
We first linearize the process of evaluating N(q). With differentiation assumed
linear, we only have to linearize the modules N1 and N2 . This is accomplished by
a simple Taylor-series expansion about the linearization state (generally, the base flow
denoted by q̄); introducing the Jacobian matrices A₁, A_{2,0}, and A_{2,1} of the modules N₁ and N₂ with respect to their arguments, evaluated at q̄, we have
q0 = q, (13.35a)
q1 = A1 Dx q0, (13.35b)
q2 = A2,0 Dx q0 + A2,1 Dy q1, (13.35c)
dq/dt = q₂. (13.35d)
Combining all steps into one results in the linearized governing equations. They read

dq/dt = (A_{2,0} + A_{2,1} D_y A₁) D_x q ≡ (dN/dq) q. (13.36)
The evolution equation for the adjoint variables involves the complex transposition of
this equation. We have to form
dq†/dt = D_x^H (A_{2,0}^H + A₁^H D_y^H A_{2,1}^H) q† ≡ (dN/dq)^H q†, (13.37)
or, broken down into procedural steps, by traversing the computational graph in reverse order, with each module replaced by its conjugate-
transpose version. The involved operators are, however, readily available in their
original and conjugate-transpose forms.
The aforementioned computational technique presents an efficient way of extracting linearized and adjoint information required to perform the direct-adjoint optimization (see Figure 13.5).
Since it is not linked to a specific set of equations, but rather to a simulation
code, it can easily be extended, without the need for further derivations. The gradient
information is numerically accurate to the precision of the chosen numerical scheme,
and requires a computational effort similar to the direct problem.
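A minimal sketch of this construction is given below: the linearized forward pass evaluates (13.36) module by module as in (13.35), the adjoint pass traverses the same graph in reverse with conjugate-transposed operators as in (13.37), and a duality check confirms the pairing; all matrices are random stand-ins for the differentiation operators and the linearized modules.

import numpy as np

rng = np.random.default_rng(0)
n = 8
Dx = rng.standard_normal((n, n))          # stand-in differentiation matrices
Dy = rng.standard_normal((n, n))
A1, A2_0, A2_1 = (np.diag(rng.standard_normal(n)) for _ in range(3))

def forward(q):
    """Linearized right-hand side dq/dt = (A2_0 + A2_1 Dy A1) Dx q, step by step."""
    q1 = A1 @ (Dx @ q)                          # first module, cf. (13.35b)
    return A2_0 @ (Dx @ q) + A2_1 @ (Dy @ q1)   # second module, cf. (13.35c)

def adjoint(qd):
    """Adjoint pass: same graph, reversed order, conjugate-transposed operators."""
    return Dx.conj().T @ (A2_0.conj().T @ qd) \
         + Dx.conj().T @ (A1.conj().T @ (Dy.conj().T @ (A2_1.conj().T @ qd)))

# Duality check: <q_dagger, forward(q)> == <adjoint(q_dagger), q>.
q, qd = rng.standard_normal(n), rng.standard_normal(n)
print(np.allclose(np.vdot(qd, forward(q)), np.vdot(adjoint(qd), q)))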
Krylov Time-Stepping
The previous sections have addressed the spatial discretization and the automated
extraction of linearized/adjoint information. The temporal evolution of the governing
equations also has to be considered carefully for an overall efficient optimization
procedure.
We advocate the use of Krylov time-stepping to advance the direct and adjoint
governing equations over the chosen time horizon [0, τ]. To this end, we approximate
the matrix exponential as the map over a finite time interval, by forming an m-dimensional Krylov subspace based on the evolution matrix L according to

K_m(L, v) = span{v, Lv, L²v, . . . , L^{m−1}v},

where v represents a starting vector. We continue by projecting the matrix L onto this
subspace to obtain the approximation
L ≈ Q_m H_m Q_m^H, (13.40)
requiring a far smaller effort, since the matrix function evaluation is performed on the
reduced matrix Hm .
Still, even with these approximations, the repeated computation of the matrix-
function-vector product constitutes the most costly part of our time-stepping method.
For this reason, we choose a time-stepping scheme that uses a minimum number of
Krylov projections for a desired performance (accuracy).
We then use this reduced representation of matrix functions to design an exponen-
tial time-stepping scheme and apply it to a nonlinear system of ordinary differential
equations of the general form:
dq/dt = N(q), q(t₀) = q₀. (13.42)
where we used the notation L₀ = ∇N|_{q₀}. This expression is the starting point of
any exponential time-stepping method. Various manifestations of exponential time-
stepping schemes distinguish themselves by how the integral term is treated and
by how the second term on the right-hand side is approximated (Hochbruck &
Ostermann 2010, Schulze et al. 2009). The following scheme (referred to as the
exponential Rosenbrock-type scheme) blends many of the advantages of exponential
time-stepping using Krylov techniques to approximate the various matrix functions
arising in the formula. We have
k₁ = q₀ + (∆t/2) ϕ₁((∆t/2) L₀) N(q₀), (13.45a)
k₂ = q₀ + ∆t ϕ₁(∆t L₀) N(q₀) + ∆t ϕ₁(∆t L₀) r(k₁), (13.45b)
which requires three Krylov projections per time step. In the above time-stepping
scheme, we encounter higher exponential functions ϕk that can be derived from the
recurrence relation
ϕ_k(z) = 1/k! + z ϕ_{k+1}(z), k = 0, 1, . . . , (13.46)
with ϕk (0) = 1/k! and ϕ0 (z) = exp(z). They can be evaluated using the general matrix
function expression for a Krylov subspace framework.
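The sketch below illustrates one way to realize these ingredients: an Arnoldi iteration builds the matrices Q_m and H_m of (13.40), and ϕ₁(∆t L)v is approximated by evaluating ϕ₁ on the small projected matrix, here through a standard augmented-matrix exponential; the operator, its dimension, and the subspace size m are illustrative assumptions.

import numpy as np
from scipy.linalg import expm

def arnoldi(L, v, m):
    """Build an m-dimensional Krylov subspace K_m(L, v); return Q_m and H_m."""
    n = v.size
    Q = np.zeros((n, m + 1)); H = np.zeros((m + 1, m))
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = L @ Q[:, j]
        for i in range(j + 1):                 # modified Gram-Schmidt
            H[i, j] = Q[:, i] @ w
            w -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        Q[:, j + 1] = w / H[j + 1, j]
    return Q[:, :m], H[:m, :m]

def phi1_times_v(L, v, dt, m=30):
    """Approximate phi_1(dt*L) v, with phi_1(z) = (exp(z) - 1)/z, via (13.40)."""
    Qm, Hm = arnoldi(L, v, m)
    aug = np.zeros((m + 1, m + 1))
    aug[:m, :m] = dt * Hm
    aug[0, m] = 1.0                 # exp of [[A, e1], [0, 0]] yields phi_1(A) e1
    phi1_e1 = expm(aug)[:m, m]
    return np.linalg.norm(v) * (Qm @ phi1_e1)

rng = np.random.default_rng(1)
n = 200
L = -2.0 * np.eye(n) + 0.05 * rng.standard_normal((n, n))  # hypothetical stable operator
v = rng.standard_normal(n)
print(np.linalg.norm(phi1_times_v(L, v, dt=0.1)))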
Checkpointing
Using the direct-adjoint optimization scheme with a nonlinear governing equation or
a general form of the driving term (∂j/∂q)^H, the adjoint equation (13.11b) depends
not only on q†, but also on q. This link between the direct and adjoint equation leads
to additional computational challenges: we have to retain the flow fields q during the
temporal evolution of the direct problem, and inject them, in reverse order, into the
coefficients of the adjoint problem (13.11b). For small-size problems, we may have
sufficient memory to store all generated flow fields q during the direct part of the loop.
But even for moderately sized problems, we have to resort to a technique known as
checkpointing.
During the computation of the direct solution, we store the flow fields q only at
a given number of checkpoints {tc }. These flow fields will subsequently be used
as initial conditions to recompute the necessary flow fields between checkpoints as
needed by the adjoint equation. Figure 13.6 demonstrates the two techniques: (i)
storing all flow fields, and (ii) storing only at specific checkpoints and recovering the
required flow fields by additional direct simulations. In the first case, we store all
flow fields at all time steps within the interval [0, τ]. These fields are then injected
(in reverse order) into the coefficients of the adjoint equation as well as the driving
term of the adjoint equation. No computational overhead arises. In the second case,
we assume memory restrictions that allow us to only store a finite number of flow
fields in the interval [0, τ] (in the illustration, seven checkpoints have been chosen).
The checkpoints are distributed unevenly, with the final points stored for every time
step. These last, densely stored fields are immediately used in the coefficients of the
adjoint equation at the beginning of the adjoint loop. Once these stored fields have been
used up, we recompute the direct flow fields starting from the next earliest checkpoint.
Once these fields have been restored, we continue with the integration of the adjoint
equation. This direct-recovery adjoint-integration procedure continues until we reach
the initial time t = 0. It is obvious that we trade efficiency in using the available
memory with inefficiency in simulation time, as most direct fields are computed twice:
once during the direct loop and another time during the recovery part of the adjoint
loop. The extra work is hence the cost of an additional direct loop. The distribution of
the checkpoints is critical for the overall efficiency; libraries dealing with the storage
and recovery management are readily available (Griewank & Walther 2000, Wang
et al. 2009).
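The toy sketch below illustrates the checkpointing logic on a model problem: the direct sweep stores the state only every few steps, and the adjoint sweep recomputes each segment from its checkpoint before injecting the direct states in reverse order; the direct and adjoint updates are simple stand-ins, not a flow solver.

import numpy as np

def step(q, dt):                 # direct update (stand-in for a Navier-Stokes step)
    return q + dt * (-q)

def adj_step(qd, q, dt):         # adjoint update; its coefficients depend on q
    return qd + dt * (-qd + 2.0 * q)      # illustrative driving term

n_steps, stride, dt = 24, 6, 0.05
q = np.array([1.0])
checkpoints = {0: q.copy()}

for m in range(n_steps):                  # direct sweep: sparse storage
    q = step(q, dt)
    if (m + 1) % stride == 0:
        checkpoints[m + 1] = q.copy()

qd = np.array([0.0])
for seg_end in range(n_steps, 0, -stride):      # adjoint sweep, segment by segment
    seg_start = seg_end - stride
    states = [checkpoints[seg_start].copy()]
    for _ in range(stride - 1):                 # recompute: the "extra work"
        states.append(step(states[-1], dt))
    for qm in reversed(states):                 # inject direct states in reverse
        qd = adj_step(qd, qm, dt)

print(qd)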
Miscellaneous
The previous sections outlined the most common computational tools to speed up
the optimization by direct-adjoint looping. More recently, additional techniques have
emerged and are currently being validated and implemented.
For large-scale systems and moderate- to long-time horizons, a parallelization
technique in time may be advantageous. These techniques, which break the full-
time interval [0, τ] into p disjoint subintervals on which the governing equations are
solved in parallel and subsequently adjusted and merged into a global solution, have
been developed over the past five decades and have seen many applications in the
computational sciences. Parallel-in-time techniques are particularly attractive for the
solution of the adjoint equations, as the adjoint equation is linear by design, and can
thus be solved efficiently by splitting the homogeneous and inhomogeneous part of
the solution and treating them parallel in time.
Model reduction techniques to speed up the direct and adjoint loop of the iterative
optimization scheme are also being developed for large-scale applications. These
techniques provide approximate gradient and sensitivity information based on a
reduced description of the full dynamics.
Figure 13.6 Schematic of checkpointing. (a) Checkpoints at all 25 time steps of the direct problem are stored and used as variable coefficients in
the adjoint part of the iteration. (b) Checkpoints at 7 time steps (filled blue circles) are stored during the direct solution; they are used as starting
points (initial conditions) for launching direct simulations to restore direct solutions for the entire interval [0, τ]. We note that exactly seven
checkpoints are stored at all stages of the process.
Figure 13.7 Sketch of tonal noise. Flow around a NACA-0012 airfoil at an angle of attack
develops a shear instability on the suction and pressure side. Noise radiates from the trailing
edge upstream and downstream, triggering via a receptivity process (indicated by red symbols)
instabilities in the pressure-side and suction-side boundary layer, hence closing the feedback
loop.
Tonal noise is an aeroacoustic process by which sound is radiated from a moving body
and maintained via a hydrodynamic-acoustic feedback mechanism. It has been studied
theoretically and experimentally over many years, and recently direct numerical
simulations have joined earlier investigations in trying to uncover the key elements
of the underlying feedback loop (Fosas de Pando et al. 2017). Tonal noise on airfoils
typically appears at chord-based Reynolds numbers of Re ∼ 10⁵, with applications
to glider airplanes and wind turbines. Tonal noise in these latter applications is
undesirable, and control schemes – active or passive – are being developed to suppress
or eliminate radiated sound. In order to do so, a full understanding of the global tonal-
noise instability is required, after which we can attempt to break the feedback loop at
the weakest link.
The optimization framework, outlined in this chapter, is particularly suited to
analyze the key elements of the tonal noise instability; the same framework can also be
used, in a second step, to design control schemes or to suggest geometric modifications
to weaken or suppress the undesired tonal effects.
Our understanding of tonal noise is based on a sequence of physical processes
as follows. The boundary layers on the suction and pressure side of the airfoil
support instabilities on either side. Convected downstream toward the trailing edge, the
Figure 13.8 Global spectrum of the tonal noise problem in the complex frequency plane. The
eigenvalues are colored according to the size of the relative residual with respect to the
linearized operator. The spectrum is divided into least stable modes (labeled M),
low-frequency modes (labeled L) and high-frequency modes (labeled H).
instability structures collide and generate a noticeable sound wave. This wave travels
radially outward and reinitiates the instabilities in the pressure-side and suction-side
boundary layers through a receptivity process. The feedback loop is thus closed.
Figure 13.7 presents a sketch of this scenario. Two feedback loops – one based on
the pressure-side boundary layer, one based on the suction-side boundary layer –
linked at the trailing edge characterize the overall tonal noise process. It seems
obvious that a local stability analysis, based on a frozen boundary-layer velocity profile,
cannot account for the complexity of the hydrodynamic-acoustic loop and cannot
quantitatively capture the main features. Instead, a global perspective is necessary,
and a DNS-based analysis beyond local instabilities should be attempted.
We consider a chord-based Reynolds number of Rec = 200,000, a Mach number
of Ma = 0.4, and describe the flow by the two-dimensional compressible Navier–
Stokes equations. The airfoil has a NACA-0012 profile and a 2° angle of attack to the
free-stream flow direction. The ratio of specific heats γ and the Prandtl number Pr are
taken as constant, with values 1.4 and 0.71, respectively. At these parameter values, the mean
flow separates and reattaches on both sides.
We can gain insight into the tonal noise problem by first looking at the global
spectrum of the governing equations, linearized about the mean state (Fosas de
Pando et al. 2014). This is accomplished using an Arnoldi algorithm coupled to
a linearized simulation code for the physical problem; alternatively, a dynamic
mode decomposition (DMD) could have been used. As outlined earlier, we are
less interested in the modal solutions that represent the time-asymptotic structures
(possibly) encountered as τ → ∞; nonetheless, we compute the global spectrum (see Figure 13.8)
to provide a first picture of the preferred frequencies supported by the linear dynamics.
The global spectrum consists of various branches, showing equispaced, discrete
eigenvalues. We classify these branches according to their frequency range or the
Figure 13.9 (a) Spatial structure of the global mode labeled M1 , visualized by the real part of
the associated near-field pressure levels, and the real part of the streamwise velocity levels in
the vicinity of the airfoil surface (inset). The mode has been normalized by the maximum
value of the velocity field in the near wake 1 < x < 1.2. (b) Spatial structure of the associated
adjoint global mode (labeled M1 in Figure 13.8), visualized by the magnitude of the
streamwise velocity levels, and the real part in the insets.
structure of the corresponding modes. They are outlined in Figure 13.8. Inspecting
the modes across the full frequency range reveals that only the modes labeled M,
which are also among the least damped modes, exhibit significant acoustic activity.
A similar analysis shows that the low-frequency modes (designated L in the figure)
represent the separation bubble dynamics and the reattachment dynamics. The high-
frequency modes (marked by H in the figure) describe the Kelvin–Helmholtz-type
shear layer instabilities on the suction side. The modes labeled H or L do not show a
large acoustic component, and thus shall be ignored in our analysis of the tonal noise
problem. Instead, we concentrate on the M-modes.
We then focus on the least stable mode, labeled as M1 in the figure. A visualization
of its spatial structure (see Figure 13.9(a)) demonstrates all components of the
hypothesized feedback loop: the boundary layer on the suction side clearly shows an
instability, and the collision of instabilities from the suction and pressure side creates
sufficient shear to trigger an acoustic wave that emanates from the trailing edge and
radiates omnidirectionally. Other modes from the M-branch show a similar structure,
at different frequencies, but with only minor differences in their spatial composition.
After the dominant structure, with a significant acoustic component, has been
identified, we need to find its structural sensitivity from the adjoint structure. In
essence, this undertaking asks the question of how to trigger or influence the M1 -
mode in an optimal manner. The answer to this question should point at regions of the
flow where a minimal effort (input) is necessary to stimulate the occurrence of tonal
noise. The flow field adjoint to the M1 -structure will give this information, since it
is directly proportional to the amplitude of a general perturbation as it projects onto
M1 . In other words, the adjoint mode associated with M1 will identify the location,
shape, and state-vector components where the feedback loop that sustains tonal noise
(based on M1 ) can most easily be broken. Figure 13.9(b) shows the adjoint M1 -mode;
it has been visualized by the magnitude of the adjoint streamwise velocity. A localized
support of this structure is found on the pressure side of the airfoil, between roughly
20% and 50% of chord. We can argue that the tonal noise structure associated with M1
has its “origin” at this location, and it is at this location where tonal noise can most
easily be manipulated. Early experiments and more recent numerical simulations
have proposed this position, although without an associated analysis.
The above analysis shows that complex stability/receptivity issues require math-
ematical and computational tools that transcend local analyses and time-asymptotic
assumptions. An optimization-based framework using direct as well as adjoint
information about the flow is sufficiently flexible and effective (once implemented in
an efficient manner) to answer many of the questions of flow analysis and to provide
a more complete and satisfactory picture of fluid flow behavior.
Part V
Applications
14 Machine Learning for Reduced-Order Modeling

14.1 Introduction
Reduced-order modeling (ROM) has a long tradition in fluid mechanics. The focus of this chapter is gray-box
modeling, which resolves the coherent-structure flow dynamics. Over 500 years ago,
Leonardo da Vinci made the first paintings of wake vortices. von Helmholtz (1858)
laid the foundation of their dynamic modeling with his famous vortex laws. In the
years that followed, many vortex models were proposed. Examples are a pair of two
equal vortices rotating about their axis, a pair of two equal but opposite vortices
moving uniformly, leapfrogging ring vortices, the Föppl (1913) vortex model of the
near wake, the von Kármán (1911) vortex model for vortex shedding, a shear-layer
vortex model (Hama 1962), the recirculation zone model (Suh 1993), and vortex-model-based
feedback control (Noack et al. 2004, Protas 2004), to name just a few.
The reduced-order vortex models inspired high-dimensional simulations with vortex
blobs (Leonard 1980), and vortex filaments (Ashurst & Meiburg 1988). Vortex models
are closely tied to first principles and robustly model coherent structures and their
convection.
This section gives recipes and suggestions for the construction of POD Galerkin
models. First, the considered configurations are specified in Section 14.2.1. Then,
Sections 14.2.2 and 14.2.3 describe the kinematical and dynamical steps to a POD
model. Sections 14.2.4 and 14.2.5 provide closures for an existing POD model and
identification methods for experiments. Section 14.2.6 reviews common applications.
∇ · u = 0, (14.1)
∂_t u + ∇ · (u ⊗ u) = −∇p + ν ∆u. (14.2)

On solid walls, the no-slip condition applies,

u(x, t) = 0. (14.3)
In addition, a free-stream, stress-free, or Neumann condition may be used for open
flows.
The initial condition u(x, 0) at time t = 0 may be a perturbed steady solution u_s(x)
as in Chapter 1 of this book.
In what follows, we assume that we have gathered M snapshots u^m, m = 1, . . . , M
from a post-transient flow solution. The snapshots are required to be statistically
representative for the computation of first and second moments.
We consider a Galerkin expansion

û(x, t) = u₀(x) + Σ_{i=1}^N a_i(t) u_i(x) (14.4)

of a velocity field u(x, t), as detailed in Chapter 6. Here, u₀ is the basic mode, for
example, the steady Navier–Stokes solution or the mean flow, and u_i are expansion
modes with amplitudes a_i. These approximations rely on a square-integrable Hilbert
space L²(Ω) equipped with an inner product between two elements v and w,

(v, w)_Ω := ∫_Ω dx v · w. (14.5)
POD modes are an orthonormal set of expansion modes that minimize the in-
sample representation error
E = (1/M) Σ_{m=1}^M ‖u^m − û^m‖²_Ω, (14.7)

where û^m denotes the expansion (14.4) evaluated for the mth snapshot.
Computation of the basic mode: The basic mode is a time-averaged flow and
absorbs a potential inhomogeneity of the steady boundary condition,
u₀(x) := (1/M) Σ_{m=1}^M u^m(x). (14.8)
One example is the uniform free-stream condition u = U ê_x in the far-field,
where ê_x is the unit vector in the streamwise direction. The remaining fluctuation
u′ = u − u₀ satisfies homogenized boundary conditions, for example,
u′ = 0 in the case of the uniform free-stream condition. This implies that any
expansion mode ui will satisfy these homogenized boundary conditions. In
other words, the expansion (14.4) satisfies the full boundary conditions for
arbitrary choices of amplitudes ai , i = 1, . . . , N.
Computation of the correlation matrix: The M × M correlation matrix R has the
elements
R^{mn} := (1/M) (u^m − u₀, u^n − u₀)_Ω, m, n = 1, . . . , M. (14.9)
Spectral analysis of the correlation matrix: This Gramian matrix R is symmetric
and positive semi-definite. This implies real and nonnegative eigenvalues as
well as orthogonal eigenvectors. Let
α_i := [α_i^1, . . . , α_i^M]^⊤
be the ith eigenvector of the correlation matrix, that is,
R αi = λi αi , i = 1, . . . , M. (14.10)
Without loss of generality, we assume the real eigenvalues to be sorted in
decreasing order
λ1 ≥ λ2 ≥ · · · ≥ λ M = 0,
and that the eigenvectors are orthonormal
α_i · α_j = Σ_{m=1}^M α_i^m α_j^m = δ_{ij}, i, j = 1, . . . , M.
Note that the Mth eigenvalue must vanish, because M snapshots span at most
an (M − 1)-dimensional hyperplane.
Computation of the POD modes: The POD modes are given by
u_i(x) := (1/√(M λ_i)) Σ_{m=1}^M α_i^m (u^m(x) − u₀(x)), i = 1, . . . , N. (14.11)
Here, N ≤ M − 1 is the number of expansion modes in (14.4). The POD
modes are orthonormal by construction:
(u_i, u_j)_Ω = δ_{ij}, i, j = 1, . . . , N.
A compact numerical sketch of this recipe follows below.
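The sketch assumes snapshots stored as columns of a matrix, a uniform grid so that the inner product (14.5) reduces to a plain dot product, and synthetic low-rank test data.

import numpy as np

rng = np.random.default_rng(0)
n, M, N = 500, 40, 3
U = rng.standard_normal((n, 3)) @ rng.standard_normal((3, M))   # rank-3 test snapshots

u0 = U.mean(axis=1, keepdims=True)          # basic mode, cf. (14.8)
Uf = U - u0                                 # fluctuations
R = (Uf.T @ Uf) / M                         # correlation matrix, cf. (14.9)

lam, alpha = np.linalg.eigh(R)              # spectral analysis, cf. (14.10)
order = np.argsort(lam)[::-1]               # decreasing eigenvalues
lam, alpha = lam[order], alpha[:, order]

modes = Uf @ alpha[:, :N] / np.sqrt(M * lam[:N])   # POD modes, cf. (14.11)
print(np.allclose(modes.T @ modes, np.eye(N)))     # orthonormal by construction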
The Galerkin projection of the Navier–Stokes equations (14.1)–(14.2) onto the expansion modes involves, in particular, the viscous and convective coefficients

l^ν_{ij} = (u_i, ∆u_j)_Ω, (14.14a)
q^c_{ijk} = (u_i, ∇ · (u_j ⊗ u_k))_Ω. (14.14b)
Figure 14.1 POD Galerkin model for the 3D mixing layer. For details see Noack et al. (2004).
(a) Snapshot; (b) a1 of LES; (c) a1 of POD model.
The truncation of the expansion leaves a residual acting on the resolved
system. This residual has two effects: a high-frequency noise excitation and an energy
dissipation. Typically, only the energy dissipation is accounted for.
Mean-flow model: Aubry et al. (1988) account for the stabilizing coupling between
fluctuations and mean-flow by replacing the constant basic mode with
a variable one computed from the Reynolds equation. The shift mode (Noack
et al. 2003) has the same purpose for an oscillatory flow.
Global eddy viscosity: Aubry et al. (1988) employ a Boussinesq ansatz by replacing
the kinematic viscosity ν with an effective one νeff = ν + νT , where the global
eddy viscosity νT is a tuning parameter. This ansatz will be dissipative but
the Galerkin model inaccurately implies that a high-Reynolds number flow
behaves like the laminar solution at a low-Reynolds number.
Modal eddy viscosities: Rempfer and Fasel (1994a) have refined this Boussinesq
ansatz by a more realistic mode-dependent eddy viscosity – inspired by
spectral turbulence theory.
Nonlinear eddy viscosities: Global and modal eddy viscosities imply that a nonlin-
ear dynamics can be approximated by a linear Galerkin system term. Östh
et al. (2014) have refined this ansatz by a fluctuation-dependent scaling. The
scaling has been justified with a finite-time thermodynamics closure (Noack
et al. 2008) and can be shown to guarantee boundedness (Cordier et al. 2013).
We keep the list of closures this short and simple. Figure 14.1 previews results of a
modal eddy viscosity closure for a turbulent mixing layer.
The Galerkin system then takes the form

da_i/dt = f_i(a) = c_i + Σ_{j=1}^N l_{ij} a_j + Σ_{j=1}^N Σ_{k=j}^N q_{ijk} a_j a_k, (14.15)
A first calibration strategy minimizes the deviation between the model trajectory a° and
the full-plant trajectory a•. This formulation requires the integration of the model in
the time interval [t₀, t₁]. The 4D Var method is a powerful method for minimizing the model error (Semaan et al.
2015). Yet, the challenge is that a short interval may not provide enough information
about the dynamics while a long interval leads to inevitable “phase drifts” between the
full plant and the model. If the full plant and the model are “out of sync,” an overly
dissipative dynamics with a◦ ≡ 0 will become the best model.
An alternative optimization avoids this phase-drift problem. Now, the dynamics
residual is minimized along the full-plant trajectory t ↦ a•,

E₂ := (1/(t₁ − t₀)) ∫_{t₀}^{t₁} ‖ da•/dt − f(a•) ‖² dt = min. (14.17)
Here, a long time integral helps in the model accuracy as the whole attractor is
incorporated in the calibration. This optimization requires the temporal state derivative
and leads to an analytically solvable least-means-square problem. The challenge is that
a long-term accumulation of errors, like amplitude errors, can easily occur, because
they are weakly penalized by the error (14.17).
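Because the right-hand side of (14.15) is linear in the unknown coefficients c_i, l_{ij}, and q_{ijk}, minimizing (14.17) along the full-plant trajectory reduces to a linear least-squares problem once a(t) and its time derivative are available. A minimal sketch with synthetic stand-in data:

import numpy as np

rng = np.random.default_rng(2)
N, T = 2, 400
a = rng.standard_normal((T, N))          # stand-in trajectory samples a(t_m)
adot = rng.standard_normal((T, N))       # stand-in state derivatives da/dt

# Feature library per sample: constant, linear, and quadratic terms (k >= j).
pairs = [(j, k) for j in range(N) for k in range(j, N)]
Theta = np.hstack([np.ones((T, 1)), a,
                   np.column_stack([a[:, j] * a[:, k] for j, k in pairs])])

coef, *_ = np.linalg.lstsq(Theta, adot, rcond=None)  # one least-squares solve
c, l, quad = coef[0], coef[1:1 + N].T, coef[1 + N:].T
print(c.shape, l.shape, quad.shape)      # c_i, l_ij, and packed q_ijk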
A large arsenal of methods has been developed for improving long-term asymptotics,
for mitigating the effect of insufficient data (Cordier et al. 2010), and for sparsifying
the dynamics, that is, making it more human-interpretable (Brunton et al. 2016a).
The reader is encouraged to visit the literature if needed.
14.2.6 Applications
Reduced-order models have many classical applications:
Estimation: A dynamic ROM may be the basis of a dynamic observer, that is, may
help to predict flow states based on one or few sensor signals.
Prediction: A dynamic ROM may help to predict future states. One example is an
undesirable flutter in an airplane experiment and an early shutdown of the
wind-tunnel. A second example is a weather forecast.
Exploration: A ROM may help to explore unobserved behavior for new initial
conditions or new parameters. Given the narrow range of validity of data-
driven ROM, this new behavior needs to be validated in the full plant.
Control: ROM may help in the control design of first- and second-order dynamics.
Closures: ROM may guide closure terms for unsteady dynamics in the spirit of Liu
(1989).
Response model: ROM may also provide a mapping from actuation or configuration
parameters to the time-averaged flow (Chapter 16) or the performance
(Albers et al. 2020).
14.3 Cluster-Based Reduced-Order Modeling

14.3.1 Clustering
Following Chapter 1, we employ clustering as an autoencoder. Let ck (x), k = 1, . . . , K
be the centroids, that is, representative flow states. For a given flow field u, the encoder
provides the index of the closest centroid,

k(u) = arg min_k ‖u − c_k‖_Ω.
Figure 14.2 Cluster-based Markov model for the mixing layer. For the details, see Kaiser et al.
(2014). (a) Vorticity snapshot; (b) Markov matrix P.
In analogy to POD, the set of K centroids is chosen to minimize the in-sample
representation error

E = (1/M) Σ_{m=1}^M ‖û^m − u^m‖²_Ω. (14.20)
The cluster-based network model (CNM) stochastically generates a sequence of times t₀, t₁, t₂, . . . and cluster affiliations
k₀ = 1, k₁, k₂, . . . consistent with the mentioned direct transition probabilities Q_{ij} and
transition times T_{ij}. The trajectory for t ∈ [t_m, t_{m+1}] moves uniformly from centroid
c_{k_m} to c_{k_{m+1}},

u°(x, t) = ((t_{m+1} − t)/(t_{m+1} − t_m)) c_{k_m}(x) + ((t − t_m)/(t_{m+1} − t_m)) c_{k_{m+1}}(x). (14.22)
In contrast to CMM, this model has a time-continuous flow representation u°(x, t),
that is, it does not jump discretely between centroids. Moreover, the model-based
autocorrelation function is found to be much more accurate. We refer to Li et al.
(2021) for a detailed discussion of the model and the shear-flow example. The price for
this increased dynamic resolution is a slightly larger error in the model-based cluster
population.
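A minimal sketch of the CNM propagation: a symbol sequence of cluster affiliations is drawn from illustrative direct transition probabilities Q_ij and transition times T_ij, and the flow state is interpolated between consecutive centroids according to (14.22); centroids and transition data are synthetic stand-ins.

import numpy as np

rng = np.random.default_rng(3)
K = 3
centroids = rng.standard_normal((K, 100))   # stand-in centroids c_k(x)
Q = np.array([[0.0, 0.8, 0.2],              # direct transition probabilities (no self-loops)
              [0.1, 0.0, 0.9],
              [0.7, 0.3, 0.0]])
T = np.full((K, K), 0.5)                    # illustrative transition times T_ij

t, k = 0.0, 0
times, labels = [t], [k]
for _ in range(20):                         # stochastic symbol sequence
    k_next = rng.choice(K, p=Q[k])
    t += T[k, k_next]
    times.append(t); labels.append(k_next)
    k = k_next

def u_cnm(tq):
    """Time-continuous state (14.22): uniform motion between consecutive centroids."""
    m = np.clip(np.searchsorted(times, tq) - 1, 0, len(times) - 2)
    w = (tq - times[m]) / (times[m + 1] - times[m])
    return (1 - w) * centroids[labels[m]] + w * centroids[labels[m + 1]]

print(u_cnm(1.7).shape)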
14.3.4 Generalizations
CROM, in particular CNM, can be significantly generalized. One model may describe
many operating conditions, if we enrich the set of centroids correspondingly and
include the parameters b of the operating conditions in the model parameters, for
example, P(b) for the CMM and Q(b), T (b) for the CNM (Fernex et al. 2021, Li
et al. 2021). In addition, the assumed motion from centroid to centroid may be relaxed
to a “flight” using the centroids and their transition characteristics as lighthouses. With
these generalizations, a CROM has successfully approximated a drag-reduction study
of wall turbulence with dozens of spanwise traveling surface waves (to be published
soon by the authors). Summarizing, CROM is based on a local interpolation between
centroids as “collocation points” of the data set, and there is no limit to the structure of
the data. The simple nature of clustering makes it a solid foundation for a very general
modeling toolkit. The loss of analytical relationships to first principles is rewarded by
significant gain in human-interpretable model complexity.
The first row defines an autoencoder, that is, a low-dimensional flow representation
with an encoder to and decoder from the low-dimensional feature vector a. The
dynamics of this feature vector (bottom, middle) approximates the Navier–Stokes
equations (bottom, left). The POD and cluster-based models follow this scheme. Also
vortex models fit into this approach.
GBM models differ in the chosen autoencoder and the chosen model identification
of the dynamics f (a). The autoencoder may be mathematical (Fourier modes,
Tchebychev polynomials, etc.), physical (stability modes, Stokes modes, etc.), or
data-driven (POD, DMD, centroids, etc.). The dynamics may be derived from first
principles (Galerkin projection, etc.), or from data (model identification), or a mixture
of both (model identification with a Tikhonov penalization of the Navier–Stokes-based
model, etc.). We shall not pause to delineate the different shades of gray, but refer to
the excellent literature on the topic (Fletcher 1984, Holmes et al. 2012).
Instead, we mention two different approaches. Feature-based manifold models
(FeMM) (Loiseau et al. 2018, 2021) assume a typical experimental situation. A sensor
signal s is recorded in time, lifted to a dynamic feature a for which a dynamic model
can be identified. The feature a is employed for an estimator of the flow using PIV and
simultaneous sensor data. K-nearest neighbor interpolation is a simple yet effective
method for state estimation. The following equation outlines the data structure:
Kinematics: s → a → û(x)
       ↓
Dynamics: da/dt = f(a). (14.24)
Effectively, FeMM is a black-box model (BBM) with a sensor-based estimator of a manifold.
Another approach relies on brute-force data interpolation. Let us assume that the
data contains the snapshots u m and the time derivative ∂t u m at the same instants, for
example, from double-shot PIV. Now, the flow field u ◦ can be integrated by an Euler
scheme,
u ◦ (t + ∆t) = u ◦ (t) + ∆t ∂t u ◦ (t).
In the simplest 1-nearest neighbor realization, the time derivative is taken to be ∂t u m ,
where u^m is the nearest neighbor to u°. Obvious refinements, like higher-order
interpolation and smoothing, are possible and probably advisable.
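A minimal sketch of this 1-nearest-neighbor realization, with synthetic snapshot/derivative pairs standing in for double-shot PIV data:

import numpy as np

rng = np.random.default_rng(4)
M, n = 200, 50
U = rng.standard_normal((M, n))     # snapshots u^m (stand-in data)
dUdt = -U                           # paired time derivatives (stand-in)

def nn_derivative(u):
    m = np.argmin(np.linalg.norm(U - u, axis=1))   # nearest snapshot
    return dUdt[m]

u, dt = U[0].copy(), 0.01
for _ in range(100):                # Euler integration with 1-NN derivative lookup
    u = u + dt * nn_derivative(u)
print(np.linalg.norm(u))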
We pause the discussion of the spectrum of GBM enabled by machine learning.
Evidently, the choice of the GBM is guided by the available data and by the purpose
of the model. It pays to start with a simple purely data-driven model before advancing
to more refined and targeted ones. Moreover, every mapping can be identified with a
rich set of tools – from Taylor’s expansion to neural networks.
We conclude with a few words of warning. Data analysis is easy, insightful, and fun.
Even with validation procedures, results are guaranteed. Dead data cannot complain
anymore. Dynamic models attempt to predict the unknown future, which is easy
for attractor dynamics, that is, transitions between similar recurrent states, but is
next to impossible for extrapolation into terra incognita. Data-driven control-oriented
dynamic models attempt to predict the unknown future with a rich set of possible
actuations. The available data for control-oriented models will typically be sparse, that
is, data has to be replaced by good simplifying guesses.
This tutorial aims to give readers hands-on experience with the Galerkin processes
discussed above. This includes Galerkin expansion, Galerkin projection, and model calibration.
The tutorial shall solely rely on the numerical package xROM (Semaan et al. 2020),
which is a freely available tool for ROM. One purpose of xROM is to perform a modal
decomposition from snapshot data and to carry out the Galerkin projection using the
Navier–Stokes equations. Moreover, xROM
(i) can analyze native CFD and PIV data;
(ii) can handle a large spectrum of 2D and 3D grids (Cartesian, structured, unstruc-
tured, etc.);
(iii) can create modal expansions and other modes (e.g., shift modes);
(iv) is user-friendly through a single configuration file;
(v) is parallelizable.
On the full capabilities of xROM, the reader is referred to the user manual, which is
included in the package.
For this tutorial, we shall employ direct numerical simulations (DNS) of a cylinder
flow at Re = 100 following Chapter 1 and duplicating the results presented in Noack
et al. (2003).
The tutorial is organized as follows. In Section 14.5.1, we shall get acquainted with
the numerical setup and the considered flow. General characteristics of xROM and how
to use it are introduced in Section 14.5.2. We shall deploy xROM to perform three tasks
in Section 14.5.3.
Figure 14.3 Unstructured grid for the FEM direct numerical simulation of cylinder wake.
The full version of xROM includes additional tools that extend its capabilities and
configurability. The main features that make the full version of xROM attractive are
presented in the following.
Preprocessing: Two tools are available for the preprocessing phase. First, a PIV
converter that directly converts PIV snapshots from their original .txt or
.dat format to snapshots readable by xROM. Second, a tool to reduce the
domain according to user-defined specifications, to compute the POD modes
only in a specific region of interest.
Formats: xROM currently reads/writes two formats: CFD General Notation System
CGNS, an efficient binary open-source format, and the ASCII format from
Tecplot, .dat, which is widely used in the research community.
Case types: xROM supports a wide variety of mesh types: Cartesian equidistant,
Cartesian, structured, and unstructured grids, in 2D and 3D.
Parallelization: The software is fully parallelized using the Message Passing
Interface (MPI), which offers several benefits.
Figure 14.5 (a) Cylinder wake folder incorrectly selected. (b) Cylinder wake folder correctly
selected.
Input folder from the command line: In that case, simply specify the correct path
to the case folder when running the program, such as:
$ ./xROM /absolute/path/to/folder
Please note that it has to be the absolute path, which can be found using:
$ pwd
Input folder from the pop-up: If no path is defined in the command line, a pop-up
window opens in which the user must select the right folder. In this case, the
correct command is:
$ ./ xROM
Be careful to correctly select the folder: it is valid only when the field
“Selection” of the window contains the full path, including the input folder.
In practice, the user has to click twice on the case folder (i.e., enter the
2 https://cloudstorage.tu-braunschweig.de/getlink/fi9HQ68borJL8dJ9t4Yxv2au/
folder) to get the path, unlike other common GUIs where the user only
has to click once. This is exemplified in Figure 14.5, which shows how
the cylinder wake folder is (a) wrongly and (b) correctly selected.
14.5.3 Exercises
As already mentioned in Section 14.5.1, we shall use DNS snapshots of a
cylinder flow, which are included in the package under CylinderWake/
InputData/Snapshots. Since all three exercises use the same data set, we
shall use the same case folder and simply generate (automatically) a different
run directory for each exercise. Before each run, we shall make the necessary
modifications to the configuration file.
As you can see in ConfigFile, only the first step GalerkinExpansion is enabled.
The rest are disabled as these options are not relevant for this first exercise. Besides
activating GalerkinExpansion, the only other option we need to concern ourselves
with is the BaseFlow selection, which we choose as MeanFlow.
After executing the program using ./xROM, a new folder called OutputData will
be generated. Inside, the POD results are found under GalerkinExpansion. Here
you should find the output of the entire Galerkin expansion, which includes:
• The base flow as a Tecplot-compatible .dat file.
• The correlation matrix.
• The eigenvalues.
• The POD mode coefficients as an array, and as .png images under
PODAmplitudesPlot.
• The POD modes as Tecplot-compatible .dat files under PODModes.
• The POD spectrum as a .png image.
If you wish to plot the eigenvalues and the mode coefficients yourself, you can simply
disable the option plotFigures. All generated field data are saved in ASCII format
and are Tecplot-compatible.
Examining the modes and the mode coefficients reveals the expected results for this
shedding flow:
• The turbulent kinetic energy is highly concentrated in the first modes and drops
quickly with higher mode numbers.
• A strong mode pairing between (at least) the first 8 modes.
• The mode pairs exhibit a phase shift between them.
• Modes and mode coefficients are each dominated by a single frequency that
increases with higher mode pair.
These coefficients are then used to generate the dynamical system, which is
integrated between time tStartDS and tEndDS. Inspecting the reference POD
mode coefficient against those of the dynamical system informs us of the model
accuracy. All eight reference and generated mode coefficients are found under
DynamicalSystem/DSAmplitudesPlot. Comparing both mode coefficients shows
a good match for the initial modes and a gradual degradation toward an unstable
behavior for the higher modes. Without calibration, this behavior is expected.
After rerunning the program, you can inspect the mode coefficients under
DynamicalSystem/DSAmplitudesPlot. The improvement should be clear. The
new model shows better accuracy and stability for all modes.
15 Advancing Reacting Flow Simulations with Data-Driven Models

K. Zdybał¹, G. D’Alessio², G. Aversano³, M. R. Malik⁴, A. Coussement, J. C. Sutherland, and A. Parente⁵
15.1 Introduction
The simulation of turbulent combustion is a very challenging task for a number of
aspects beyond turbulence. Combustion is intrinsically multiscale and multi-physics.
It is characterized by a variety of scales inherently coupled in space and time through
1 Kamila Zdybał acknowledges the support of the Fonds National de la Recherche Scientifique
(F.R.S.-FNRS) through the Aspirant Research Fellow grant.
2 Giuseppe D’Alessio acknowledges the support of the Fonds National de la Recherche Scientifique
(F.R.S.-FNRS) through the FRIA fellowship.
3 Gianmarco Aversano acknowledges the funding from the European Research Council (ERC) under the
European Union’s Horizon 2020 research and innovation programme under grant agreement No.
714605.
4 Mohammad Rafi Malik acknowledges the funding from the European Research Council (ERC) under
the European Union’s Horizon 2020 research and innovation programme under grant agreement No.
714605.
5 Alessandro Parente acknowledges the funding from the European Research Council (ERC) under the
European Union’s Horizon 2020 research and innovation programme under grant agreement No.
714605.
thermochemical and fluid dynamic interactions (Pope 2013). Typical chemical mecha-
nisms describing the evolution of fuels consist of hundreds of species involved in thou-
sands of reactions, spanning 12 orders of magnitude of temporal scales (Frassoldati
et al. 2003). The interaction of these scales with the fluid dynamic ones defines the
nature of the combustion regime as well as the limiting process in determining the
overall fuel oxidation rate (Kuo & Acharya 2012). When the characteristic chemical
scales are much smaller than the fluid dynamic ones, the combustion problem becomes
a mixing one (i.e., mixed is burnt (Magnussen 1981)): combustion and chemistry are
decoupled, and the problem is highly simplified. Likewise, for chemical timescales
much larger than the fluid dynamic ones, the system can be described taking into
account chemistry only, neglecting the role of fluid dynamics altogether.
The intensity of interactions between turbulent mixing and chemistry is measured
using the Damköhler number, defined as the ratio between the characteristic mixing,
τm , and chemical, τc , timescales:
Da = τ_m / τ_c. (15.1)
Still, these high-fidelity simulations are rich in information that could help decode the
complexity of turbulence–chemistry interactions and guide the development of filtered
and lower-fidelity modeling approaches for faster evaluations.
The objective of the present chapter is to demonstrate the potential of data-driven
modeling in the context of combustion simulations. In particular, we present:
• The application of principal component analysis (PCA) and other linear and
nonlinear techniques to identify low-dimensional manifolds in high-fidelity
combustion data sets, and to reveal the key features of complex nonequilibrium
phenomena. Different techniques are compared to PCA, including nonnegative
matrix factorization (NMF), autoencoders, and local PCA in Section 15.3.
• The development of reduced-order models (ROMs), to be used in conjunction with,
or to replace high-fidelity simulation tools, to reduce the burden associated
with the large number of species in detailed chemical mechanisms. First, the
use of transport models based on PCA is presented in Section 15.4. Finally,
the application of the data-driven adaptive-chemistry approach based on the
combination of classification and chemical mechanism reduction is discussed
in Section 15.5.
The structure of combustion data sets differs from the one seen in pure fluid mechanics
applications (presented in Chapter 6). The data set encountered in multicomponent
reactive flows is stored in the form of a matrix X ∈ ℝ^{N×Q}. Each column of X is tied
to one of the Q thermochemical state-space variables: temperature T, pressure p, and
Ns −1 chemical species6 mass (or mole) fractions, denoted by Yi for the ith species. For
open flames and atmospheric burners the pressure variable can be omitted. Each of the
N rows of X contains observations of all Q variables at a particular point in the physi-
cal space and/or time (and sometimes, a point in the space of other independent param-
eters, as briefly discussed later). This structure of the data matrix is presented below:
X = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots & & \vdots \\ T & p & Y_1 & Y_2 & \cdots & Y_{N_s-1} \\ \vdots & \vdots & \vdots & \vdots & & \vdots \end{bmatrix}. (15.2)
For such a data set, Q = N_s + 1. We denote the ith row (observation) in X as x_i ∈ ℝ^Q
and the jth column (variable) in X as X_j ∈ ℝ^N. When the data set is only resolved
in space and not resolved in time, N represents the number of points on a spatial grid,
and X can be thought of as a data snapshot (a notion much like the one discussed in
Chapters 6–9). Typically, we can expect N ≫ Q. However, the magnitude of Q will
strongly depend on the number of species, Ns , involved in the chemical reactions, and
can even reach the order of thousands for more complex fuels (Lu & Law 2009).
6 Since mass (or mole) fractions sum up to unity for every observation, out of Ns species only Ns − 1 are
independent.
Figure 15.1 Examples of common numerical combustion data sets schematically presented on
the axis of increasing complexity.
for CFDF are solved in the physical space. Flow is parameterized by the velocity
gradient parameter a, which is an equivalent of the local strain rate. In the CFDF,
each row of X is linked to a point in the physical space and time.
• The one-dimensional turbulence (ODT) model (Kerstein 1999, Echekki et al. 2011)
can additionally incorporate the effects of turbulence by introducing eddy events
on a one-dimensional domain. It allows for both spatial and temporal evolution
of the flow. ODT can be used as a standalone model but can also serve as a
subgrid model in large eddy simulations (LES).
• The Reynolds-averaged Navier–Stokes (RANS) models (Ferziger et al. 2002) solve
the time-averaged equations describing fluid dynamics and hence any sources of
unsteadiness coming from turbulence are averaged.
• The LES (Ferziger et al. 2002) resolves large scales associated with the fluid
dynamics processes but a subgrid model is required to account for processes
occurring at the smallest scales. The choice for the subgrid model becomes
particularly important in reactive flows, since combustion is inherently tied to
the smallest scales.
• The DNS (Ferziger et al. 2002) allows for the most accurate description of the
coupled interaction between fluid dynamics and the thermochemistry since all
subgrid processes are resolved directly. The scarcity of the DNS data sets is due
to the large computational cost of performing DNS simulations, especially for
more complex fuels.
Prior to analysis, the data matrix is typically centered and scaled,

X̃ = (X − X̄) D^{-1}, (15.4)

where X̄ contains the mean observations of each variable, D is the diagonal matrix of
scales, where the jth element d j from the diagonal is the scaling factor corresponding
to the jth state-space variable X j . A few of the common scaling criteria are collected
in Table 15.1. The result is a centered and scaled data matrix X̃. Other preprocessing
means can include outlier detection or data sampling. For the remainder of this chapter,
we will assume that X̃ represents the matrix that has been adequately preprocessed.
The principal components (PCs), collected in the matrix Z = X̃A, are linear combinations of the preprocessed state-space variables,

z_j = a_{1j} X̃_1 + a_{2j} X̃_2 + ⋯ + a_{Qj} X̃_Q, (15.8)

where a_{ij} are the elements (weights) from the jth column of A and tildes represent
the preprocessing applied to each state-space variable. Since the basis matrix is
orthonormal, a_{ij} ∈ [−1, 1] and A^{-1} = A^⊤.
By keeping a reduced number of the q < Q first PCs, we obtain the closest8 rank-q
approximation of the original matrix,
7 For the purpose of the feature extraction analysis presented here, all Ns species were included in the
data set.
8 in terms of L2 , Frobenius, or trace norms, which follows from the Eckart–Young–Mirsky theorem
(Eckart & Young 1936).
X̃ ≈ X̃_q = Z_q A_q^⊤, (15.9)

where the index q denotes the truncation from Q to q components. Note that the
reverse operation to the one defined in (15.4) has to be applied using matrices D and X̄.
We can assign physical meaning to the PCs by looking at the linear coefficients
(weights) ai j from the basis matrix A (Parente et al. 2011, Bellemans et al. 2018).
High absolute weight for a particular variable means that this variable is identified
by the PCA as important in the linear combination from (15.8). Moreover, since the
vector A j identifies the same span as −A j , only the relative sign of a particular weight
with respect to the signs of other weights is important. In addition, PCs are ordered
so that each PC captures more variance than the following one. Thus, we can expect
the most important features identified by PCA to be visible in the first few PCs. This
property of PCA can also guide the choice for the value q. Since PCs are decorrelated,
increasing q in the PCA reconstruction from (15.9) guarantees an improvement in the
reconstruction errors.
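The following sketch walks through this pipeline on synthetic stand-in data: centering and Range scaling as in (15.4), computation of the basis A from an SVD of the preprocessed matrix, truncation to q PCs and reconstruction as in (15.9), and the reverse operation with D and X̄.

import numpy as np

rng = np.random.default_rng(5)
N, Q, q = 10_000, 8, 3
X = rng.standard_normal((N, 3)) @ rng.standard_normal((3, Q))   # rank-3 test data

x_bar = X.mean(axis=0)
d = X.max(axis=0) - X.min(axis=0)          # Range scaling factors, cf. Table 15.1
X_tilde = (X - x_bar) / d                  # preprocessing (15.4)

_, _, Vt = np.linalg.svd(X_tilde, full_matrices=False)
A = Vt.T                                   # orthonormal basis matrix
Z_q = X_tilde @ A[:, :q]                   # first q principal components
X_tilde_q = Z_q @ A[:, :q].T               # rank-q approximation (15.9)

X_rec = X_tilde_q * d + x_bar              # reverse of (15.4)
print(np.abs(X - X_rec).max())             # near machine precision for rank-3 data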
Preprocessing the data set, prior to applying a dimensionality reduction technique,
can have a significant impact on the shape of the low-dimensional manifold and on the
types of features retrieved from the data set (Parente & Sutherland 2013, Peerenboom
et al. 2015). Figures 15.2 and 15.3 present the weights ai j associated with the first PC
resulting from Range and Pareto scaling (see Table 15.1) of the original data set. First,
we observe that the structure of the PC can change significantly with the choice of
the scaling technique. If we further consider the mixture fraction variable as defined
in (15.3) as a linear combination of fuel YF and oxidizer YO2 mass fractions, it can
be observed that the coefficients in front of YF and YO2 are of opposite signs in that
definition. Using Range scaling (Figure 15.2), the first PC can be attributed to the
mixture fraction variable, where the only high weights are for the fuel (H2 and CO) and
oxidizer components (O2 and N2 ), and the two have opposite signs. The correlation
between the mixture fraction variable and the first PC is in this case 99.96%. It is
worth noting that the mixture fraction was not among the variables in the original
data set X and PCA identifies it automatically. With Pareto scaling (Figure 15.3), the
first PC is almost entirely aligned with the temperature variable and carries almost no
information about the mass fractions of the chemical species.
In a previous study (Biglari & Sutherland 2012), the authors have demonstrated that
PCA can identify PCs that are independent of the filter width on a fully resolved jet
flame. PCA was performed on the state-space variables filtered using a top-hat filter
of varying widths. To test the capability of PCA to extract time-invariant features of
the data set, we can also use a fixed set of modes to reconstruct new, unseen data,
such as data from future time steps of the same temporally evolving system. If the
reconstruction process leads to errors that are comparable with the ones obtained
for the training data, we can expect that PCA captures the essence of the physical
processes underlying the system. We demonstrate this using 2D slices from the DNS
data set from four time intervals separated by ∆t = 5 ms. In Figures 15.4 and 15.5,
three future data snapshots (blue triangles) are reconstructed using q first eigenvectors
found on the initial snapshot (red circles). Figure 15.4 shows the coefficients of
determination, which can be computed for the jth variable X j in the data set as
R²_j = 1 − ( Σ_{i=1}^N (x_{ij} − x̂_{ij})² ) / ( Σ_{i=1}^N (x_{ij} − x̄_j)² ), (15.10)
where xi j is the ith observation of the jth variable, x̂i j is the PCA reconstruction of that
observation, and x̄ j is the average observation of X j . The coefficient of determination
R2 measures the goodness of the model fit with respect to fitting the data with the
mean value x̄ j . Values R2 ∈ (−∞, 1], where R2 = 1 means a perfect fit. The smaller
the R2 value gets, the worse the model fit. Figure 15.5 shows the normalized root-
mean-squared errors (NRMSE) on a logarithmic scale. NRMSE can be computed for
the jth variable X j in the data set as
NRMSE_j = (1/x̄_j) · √( (1/N) Σ_{i=1}^N (x_{ij} − x̂_{ij})² ). (15.11)
In Figures 15.4 and 15.5, the markers represent the mean values of R2 or NRMSE
averaged over all variables in the data set. The bars range from the minimum and
the maximum value achieved for any variable in the data set. It can be observed
that for all the reconstructed snapshots, the error metrics show comparable values.
This indicates the capability of PCA to capture generalized, time-invariant features of
temporally evolving systems. It can also be observed that the mean errors grow for
future snapshots, which indicates that there might be a limit on the time separation ∆t
for which we can extend the applicability of the features found.
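A direct transcription of the two metrics (15.10) and (15.11), assuming observations in rows and variables in columns, with means kept away from zero so that the normalization in (15.11) is well defined:

import numpy as np

def r2_per_variable(X, X_hat):
    """Coefficient of determination (15.10) for each variable (column)."""
    ss_res = ((X - X_hat) ** 2).sum(axis=0)
    ss_tot = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

def nrmse_per_variable(X, X_hat):
    """Normalized root-mean-squared error (15.11) for each variable (column)."""
    rmse = np.sqrt(((X - X_hat) ** 2).mean(axis=0))
    return rmse / X.mean(axis=0)

rng = np.random.default_rng(6)
X = rng.random((1000, 5)) + 1.0                  # synthetic data, nonzero means
X_hat = X + 0.01 * rng.standard_normal(X.shape)  # stand-in reconstruction
print(r2_per_variable(X, X_hat).min(), nrmse_per_variable(X, X_hat).max())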
where k is the index of the cluster to which the ith observation belongs, z_{q,i} is the
ith observation represented in the local, truncated basis A_q^{(k)} identified on the kth
Figure 15.6 Schematic distinction between global and local PCA on a synthetic
two-dimensional data set. The arrows represent the two global/local modes from the matrix A,
defining the directions of the largest variance in the global/local data.
Figure 15.7 Schematic of the VQPCA algorithm (panels: initialization; iterating).
cluster. Each cluster is centered separately using the centroid c^{(k)} of the kth cluster
and typically the global diagonal matrix of scales D is applied in each cluster.
Data clustering, prior to applying local PCA, can be performed with any algorithm
of choice. One of the techniques discussed in this chapter is the vector quantization
PCA (VQPCA) algorithm (Kambhatla & Leen 1997, Parente et al. 2009), presented
schematically in Figure 15.7. This is an iterative algorithm in which the observations
are assigned to the cluster for which the local PCA reconstruction error of that
observation is the smallest. The hyperparameters of the algorithm include the number
of clusters k to partition the data set, the number of PCs q used to approximate the
data set at each iteration, and the initial cluster partitioning. The latter is predefined
by setting the initial cluster centroids. The most straightforward way is to initialize
centroids randomly, but another viable option is to use partitioning resulting from a
different clustering technique such as K-means (MacQueen et al. 1967). The VQPCA
algorithm iterates until convergence of the centroids and of the reconstruction error is
reached. The reconstruction error is measured between the centered and scaled data set X̃ and its approximation X̃_{i,q} obtained using the q first PCs computed from the eigenvectors found on the ith cluster. More details on the VQPCA algorithm can be found in Parente et al. (2009) and Parente et al. (2011).
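A compact sketch of this iteration, under the reconstruction-error assignment rule just described, could look as follows (names hypothetical; no guards for empty clusters, for brevity):

```python
import numpy as np

def vqpca(X_tilde, k, q, n_iter=50, seed=0):
    """Iterative VQPCA sketch: assign each observation to the cluster whose
    local q-dimensional PCA basis reconstructs it best."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=X_tilde.shape[0])  # random initial partition
    for _ in range(n_iter):
        centroids, bases = [], []
        for j in range(k):
            Xj = X_tilde[labels == j]
            cj = Xj.mean(axis=0)
            # local PCA: q first right singular vectors of the centered cluster
            _, _, Vt = np.linalg.svd(Xj - cj, full_matrices=False)
            centroids.append(cj)
            bases.append(Vt[:q].T)                      # A_q^(j), shape (Q, q)
        # reassign to the cluster with the smallest reconstruction error
        errors = np.stack([
            np.sum(((X_tilde - c) - (X_tilde - c) @ A @ A.T) ** 2, axis=1)
            for c, A in zip(centroids, bases)
        ])
        new_labels = errors.argmin(axis=0)
        if np.array_equal(new_labels, labels):          # centroids/error converged
            break
        labels = new_labels
    return labels, centroids, bases
```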
Another possible way of clustering the combustion data sets is to use a conditioning
variable, such as a mixture fraction, and partition the observations based on bins of that
variable. If the mixture fraction is used, observations are first split into fuel-lean and
fuel-rich parts at the stoichiometric mixture fraction Zst . If more than two clusters are
requested, lean and rich sides can then be further divided. The approach of performing
local PCA on clusters identified through binning the mixture fraction vector is referred
to as FPCA.
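A minimal sketch of this mixture-fraction conditioning (Z and Z_st are hypothetical names for the mixture fraction vector and its stoichiometric value):

```python
import numpy as np

def fpca_partition(Z, Z_st, k=2):
    """Cluster labels from mixture-fraction bins: lean/rich split at Z_st,
    with each side subdivided into equal-width bins if k > 2."""
    if k == 2:
        return (Z >= Z_st).astype(int)       # 0 = fuel-lean, 1 = fuel-rich
    n_lean, n_rich = k // 2, k - k // 2
    lean_edges = np.linspace(Z.min(), Z_st, n_lean + 1)[1:-1]
    rich_edges = np.linspace(Z_st, Z.max(), n_rich + 1)[1:-1]
    return np.where(Z < Z_st,
                    np.digitize(Z, lean_edges),
                    n_lean + np.digitize(Z, rich_edges))
```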
Local PCA was investigated on the benchmark DNS data set (Section 15.3.1) using
two clustering algorithms. Figures 15.8 and 15.9 show a comparison between two
clustering results in the space of temperature and the mixture fraction variable. Figure
15.8 presents clustering into k = 2 clusters using bins of mixture fraction as the
conditioning variable. This partitioning can be thought of as hardcoded in a sense that
the split into two clusters will always be performed at Zst . The features retrieved on
local portions of data can thus only be attributed to the lean and rich zones. In contrast,
Figure 15.9 presents clustering into k = 4 clusters using the VQPCA algorithm.
VQPCA could distinguish between the oxidizer (cluster k1), the fuel (cluster k3), and the region where the two meet close to stoichiometric conditions (clusters k2 and k4).
Figure 15.10 Temperature profile of the DNS data set (left) and the result of partitioning the
data set into k = 4 clusters using the VQPCA algorithm (right).
This is even more apparent if we plot the result of the VQPCA clustering on a spatial
grid in Figure 15.10. The space is clearly divided into the inner fuel jet (k3 ), the outer
oxidizer layer (k1 ), and the two thin reactive layers (k2 , k4 ) for which the temperature
is the highest.
The success of local PCA in extracting features depends on the clustering technique
used. In a previous study (D’Alessio et al. 2020), VQPCA has been compared to other
clustering algorithms, and better results in terms of clustering quality and algorithm
speed have been obtained. An unsupervised clustering algorithm based on the VQPCA
partitioning has recently been proposed (D’Alessio et al. 2020b) to perform data
mining on a high-dimensional DNS data set. If an algorithm such as VQPCA is used,
the types of features found depend on data preprocessing and can additionally depend
on the hyperparameters. By changing the number of clusters or the number of PCs in
the approximation, the user can potentially retrieve different features. This has also been investigated in a previous study (D'Alessio et al. 2020a) for a more complex,
high-dimensional DNS data set. This is in contrast with clustering strategies based
on binning a single physical quantity, such as mixture fraction or heat release rate.
Taking the heat release rate as an example, we might anticipate that the
cluster formed from the high heat release rate bin will identify the chemically reacting
region.
9 Alternatively, one can also subtract minimum values from each variable, thus making the range of each
state-space variable start at 0.
X ≈ X_q = WF. (15.13)
The matrix of nonnegative factors F can be regarded as the one containing a basis
(analogous to the matrix A found by PCA). The matrix W represents the compressed
data, namely the NMF scores and is thus analogous to the PCs matrix Z. The
factorization to W and F is not unique, and various optimization algorithms exist
(Berry et al. 2007, Lin 2007). In this chapter, we use the MATLAB® routine nnmf, which minimizes the root-mean-squared residual (Berry et al. 2007),

D = (1/√(N · Q)) ‖X − WF‖_F, (15.14)
starting with random initial values for W and F, where the subscript F denotes the
Frobenius norm. This optimization might reach a local minimum, and thus repeating the algorithm can yield different factorizations.
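An analogous factorization can be sketched in Python with scikit-learn's NMF (a sketch only, assuming the data are made nonnegative by subtracting minimum values as in footnote 9; X and q are hypothetical names for the N × Q data matrix and the number of factors):

```python
import numpy as np
from sklearn.decomposition import NMF

def nmf_factorize(X, q):
    """NMF of the min-subtracted data matrix, cf. Eqs. (15.13)-(15.14)."""
    X_shift = X - X.min(axis=0)               # nonnegative shift (footnote 9)
    model = NMF(n_components=q, init="random", random_state=0, max_iter=500)
    W = model.fit_transform(X_shift)          # compressed data, analogous to Z
    F = model.components_                     # nonnegative factors, analogous to A
    D = np.linalg.norm(X_shift - W @ F, "fro") / np.sqrt(X_shift.size)
    return W, F, D
```

Because the optimization starts from random initial values, repeated calls with different random seeds may return different factorizations, consistent with the non-uniqueness noted above.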
Similarly to what was done in PCA, we can look at the nonnegative factor weights
(the elements of F) to assign physical meaning to the factors. With the nonnegative
constraint on F, only nonnegative weights can be found. Figures 15.11 and 15.12 show
the first two nonnegative factors that together represent the mixture fraction variable
(compare with Figure 15.2). The oxidizer components are included in the first factor
and the fuel components in the second. If NMF is compared with PCA, the latter can be considered more robust, since NMF required two modes to capture the same information (the mixture fraction variable) that was included in a single PCA mode.
Autoencoders
The autoencoder (Goodfellow et al. 2016, Wang et al. 2016) is a type of unsupervised artificial neural network (ANN) whose aim is to learn the q-dimensional
representation (embedding) of the Q-dimensional data set such that the reconstruction
error between the input and the output layer is minimized. The standard form of an
autoencoder is the feedforward neural network having an input layer and an output
layer with the same number of neurons, and one or more hidden layers. Given one
hidden layer, the encoding process takes as an input the preprocessed data matrix X̃ ∈ R^{N×Q} and maps it to H ∈ R^{N×q}, with q < Q:

H = f(X̃G + B), (15.15)
where the columns of H are referred to as the codes, f is the activation function, such as the sigmoid, rectified linear unit (ReLU), or scaled exponential linear unit (SELU), G ∈ R^{Q×q} is the matrix of weights, and B ∈ R^{N×q} is the matrix of biases. At the decoding stage, H is mapped to the reconstruction X̃_q,

X̃ ≈ X̃_q = f′(HG′ + B′), (15.16)

where f′, G′, and B′ are the corresponding activation function, weights, and biases of the decoding layer.
In this section, we use an autoencoder with five hidden layers and SELU activation
function and generate a two-dimensional embedding (q = 2) of the original data
set. Figure 15.13 shows the two-dimensional manifold obtained after the autoencoder
compression (represented by the matrix H). The manifold is colored by the previously
obtained result of partitioning via the VQPCA algorithm. From Figure 15.13, it can be
observed that the result of VQPCA partitioning still uniformly divides the autoencoder
manifold. Clusters k1 and k3, representing the oxidizer and fuel, respectively, are
located at the opposing ends of the manifold. Thus, it is possible to think of that
manifold as describing the progress of the combustion process, with the fuel and
oxidizer meeting in the center (k2 and k4 ) where they finally react.
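A minimal PyTorch sketch of such an encoder-decoder with SELU activations and a two-dimensional bottleneck follows. It is a simplified, single-hidden-layer-per-side analogue of the five-hidden-layer network used here, and all layer sizes and names are hypothetical:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_vars, n_latent=2, n_hidden=32):
        super().__init__()
        # encoder: X_tilde (N x Q) -> codes H (N x q), cf. Eq. (15.15)
        self.encoder = nn.Sequential(
            nn.Linear(n_vars, n_hidden), nn.SELU(),
            nn.Linear(n_hidden, n_latent))
        # decoder: H -> reconstruction X_tilde_q, cf. Eq. (15.16)
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, n_hidden), nn.SELU(),
            nn.Linear(n_hidden, n_vars))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train(model, X, n_epochs=2000, lr=1e-3):
    """Minimize the reconstruction error between input and output layers."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(n_epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), X)   # X: torch tensor of the scaled data
        loss.backward()
        opt.step()
    return model
```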
where w_ij is the ith weight on the jth factor and w̄_j² is the mean of the squared weights
(Abdi 2003). After rotation, those n factors explain the same total amount of variance
as they did together before the rotation, but variance is now redistributed differently
among the selected factors. Varimax-rotated factors typically have high weights on
fewer variables than the original factors that can aid in their physical interpretation.
The rotated factors can then be used as the new basis to represent the original data
set. Varimax and several other rotation methods are available within the MATLAB® routine rotatefactors.
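For illustration, the classical Kaiser varimax iteration can be sketched in a few lines of NumPy. This is a commonly used SVD-based formulation and is not necessarily identical to the algorithm inside rotatefactors:

```python
import numpy as np

def varimax(A, tol=1e-8, max_iter=100):
    """Rotate the factor matrix A (Q x n) to maximize the variance of the
    squared weights, redistributing (not changing) the explained variance."""
    Q, n = A.shape
    R = np.eye(n)            # accumulated rotation
    d = 0.0
    for _ in range(max_iter):
        L = A @ R
        u, s, vt = np.linalg.svd(
            A.T @ (L ** 3 - L @ np.diag(np.sum(L ** 2, axis=0)) / Q))
        R = u @ vt
        d_old, d = d, np.sum(s)
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return A @ R             # rotated factors, usable as a new basis
```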
Another interesting technique is the Procrustes analysis (Seber 2009), which
is a series of linear operations that allow translation, rotation, or scaling of the
low-dimensional manifold. This can be particularly useful when manifolds obtained
from two different dimensionality reduction techniques should be compared.
Figures 15.14 and 15.15 present the Procrustes transformation (using the MATLAB® routine procrustes).
In Section 15.2, we have seen that the number of thermochemical state-space variables
Q determines the original dimensionality of the data set. This number also reflects
how many transport equations for the state-space variables should be solved in
a numerical simulation. The general transport equation for the set Φ = X^T = [T, p, Y1, Y2, . . . , Y_{Ns−1}]^T of state-space variables is

ρ DΦ/Dt = −∇ · (jΦ) + SΦ, (15.19)
where jΦ is the mass-diffusive flux of Φ relative to the mass-averaged velocity, and
SΦ is the volumetric rate of production of Φ (also referred to as the source of Φ).
Performing detailed simulations with large chemical mechanisms involving a significant number of chemical species is still computationally prohibitive. Sutherland and
Parente (2009) proposed to use PCA to reduce the number of transport equations
that solve a combustion process. Instead of solving the original set of Q transport
equations, the original variables are first transformed to the new basis identified by
PCA on the training data set X. Next, the truncation from Q to q first PCs is performed.
Transport equations for the q first PCs can be formulated from (15.19) using the
truncated basis matrix Aq ,
ρ Dz/Dt = −∇ · (jz) + Sz, (15.20)

where z = Zq^T (with Zq = (X − X̄)D⁻¹Aq), jz = Aq jΦ, and Sz = Aq SΦ D⁻¹ (also referred to as the PC-sources). This is the discrete analogue of the Galerkin projection methods described in Chapters 6 and 14. Note that the source terms of the PCs, Sz, are scaled (but not centered) using the same scaling matrix D as applied on the data set X.
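In discrete form, the quantities entering (15.20) follow directly from the training data. A minimal sketch, written row-wise and with hypothetical names (X is the data matrix, S_phi the matching source-term matrix, D the diagonal scaling matrix):

```python
import numpy as np

def pc_transport_quantities(X, S_phi, D, q):
    """Discrete quantities for the PC-transport equation (15.20):
    truncated basis A_q, PCs Z_q, and PC-sources S_z (scaled, not centered)."""
    D_inv = np.linalg.inv(D)                  # D is diagonal, so this is cheap
    X_tilde = (X - X.mean(axis=0)) @ D_inv    # centered and scaled data
    _, _, Vt = np.linalg.svd(X_tilde, full_matrices=False)
    A_q = Vt[:q].T                            # Q x q truncated PCA basis
    Z_q = X_tilde @ A_q                       # principal components (transported)
    S_z = (S_phi @ D_inv) @ A_q               # PC-sources, row-wise form of (15.20)
    return A_q, Z_q, S_z
```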
The challenge associated with the resolution of the PC-transport equation is related to the PC-source terms. The latter are highly nonlinear functions (based on Arrhenius expressions) of the state-space variables. The nonlinearity of the chemical source
terms strongly impacts the degree of reduction attainable using the projection of
the species transport equations onto the PCA basis (Isaac et al. 2014, 2015). A
solution to this problem is to use PCA to identify the most appropriate basis to parameterize the empirical low-dimensional manifold (ELDM); both the thermochemical state-space variables and the PC-source terms can then be nonlinearly regressed onto the new basis. This allows us to overcome the shortcomings associated with the multilinear nature of PCA. In the reported simulations, the accessed thermochemical states remained within the training manifold during the simulation, indicating that the choice of an unsteady canonical reactor ensures that all the potential chemical states accessed during the simulation are spanned. The strength of the method resides in the fact that it does not require
any prior selection of variables. Instead, it automatically extracts the most relevant
variables to describe the system. From this perspective, the PC–GPR (PC transport with Gaussian process regression) method can
be regarded as a generalization of tabulated chemistry approaches (Pope 2013),
particularly for complex systems requiring the definition of a larger number of
progress variables.
In Section 15.4, PCA was used to derive the reduced number of transport equations for
the new set of variables, PCs. Regression was introduced to handle the nonlinearity
of chemical source terms. In this section, we investigate the potential of local PCA
(Section 15.3.2) to classify the thermochemical state-space into locally homogeneous
regions and apply chemical mechanism reduction locally.
where the indices a and b denote the separate contributions to Ψ. In particular, three sub-steps are adopted for the numerical integration (a schematic sketch follows the list):
1. Reaction step: The ODE system corresponding to the source term S(Ψa) is integrated over ∆t/2.
2. Transport step: The ODE system accounting for the convection and diffusion terms C(Ψb, t) and D(Ψb, t) is integrated over ∆t.
3. Reaction step: The source term S(Ψa) is again integrated over ∆t/2.
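This reaction-transport-reaction sequence is the classical Strang splitting; one time step can be sketched schematically as follows (the two integrator arguments are hypothetical stand-ins for the actual chemistry and transport solvers):

```python
def strang_step(psi, dt, integrate_reaction, integrate_transport):
    """One operator-splitting step: reaction over dt/2, transport over dt,
    reaction over dt/2. The reaction substeps act point-wise (no boundary
    conditions, no spatial coupling); transport couples neighboring cells."""
    psi = integrate_reaction(psi, 0.5 * dt)   # step 1: S(psi_a) over dt/2
    psi = integrate_transport(psi, dt)        # step 2: C + D terms over dt
    psi = integrate_reaction(psi, 0.5 * dt)   # step 3: S(psi_a) over dt/2
    return psi
```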
Unlike the transport step, the two reaction steps do not require boundary conditions
and they do not have spatial dependence. The system of N ODEs from steps 1 and 3
can be solved independently from the system of N ODEs from step 2: this makes the
adaptive-chemistry techniques very effective and easy to implement. The idea behind the adaptive-chemistry approach is that it is possible to locally consider only a subset of the chemical species implemented in the detailed mechanism, while the remaining subset consists of the species that locally have zero concentration or result to be negligible.
Figure 15.21 Temperature, O2, OH, and CO profiles obtained from the detailed and the adaptive-chemistry simulation of an axisymmetric, non-premixed laminar nitrogen-diluted methane coflow flame.
If multidimensional simulations of the same system are not available, the SPARC
model can be trained using lower-dimensional (0D or 1D) detailed simulations of
the same chemical system. Reduced mechanisms can then be generated based on such
training data sets and applied in the multidimensional (2D or 3D) adaptive simulations.
D’Alessio, Parente, Stagni and Cuoci (2020) also considered a data set consisting
15.7 Summary
way and can thus aid in the process of selecting the best set of variables to effectively
parameterize complex systems using fewer dimensions. Moreover, it has been shown
how the aforementioned algorithms can be effectively coupled with algorithms that
are widely used in the combustion community, such as DRGEP, to obtain physics-
informed reduced-order models. Two applications of reduced-order modeling were
presented in this chapter: reduction of the number of transport equations (Section 15.4)
and reduction of large chemical mechanisms (Section 15.5). The main power of data-
driven techniques is that the modeling can be informed by applying the technique to simple systems whose data are computationally cheap to obtain. Further improvement can be achieved based on feedback coming from validation experiments.
Reduced-order models (ROMs) for both steady and unsteady aerodynamic applica-
tions are presented. The focus is on compressible, turbulent flows with shocks. We
consider ROMs combining proper orthogonal decomposition (POD), Isomap, which
is a nonlinear manifold learning method, and autoencoder networks with interpolation
methods. Physics-based ROMs, where an approximate solution is found in the POD-
subspace by minimizing the corresponding steady or unsteady flow-solver residual,
are also discussed. The ROMs are used to predict unsteady gust loads for rigid
aircraft as well as static aeroelastic loads in the context of multidisciplinary design
optimization (MDO) where the structural model is to be sized for the (aerodynamic)
loads. They are also used in a process where an a priori identification of the critical
load cases is of interest and the sheer number of load cases to be considered does not
allow for using high-fidelity numerical simulations. The different ROM methods are
applied to 2D and 3D test cases at transonic flow conditions where shock waves occur
and in particular to a commercial full aircraft configuration.
1 The author would like to thank Airbus for providing the XRF-1 testcase as a mechanism for
demonstration of the approaches presented in this chapter.
computational fluid dynamics (CFD) in this context is on the horizon, but still too
costly and time-consuming to provide all the required aerodynamic data, that is, steady
and unsteady pressure and shear stress distributions on the aircraft surface, at any point
within this envelope. This motivates procedures and techniques aimed at reducing the
computational cost and complexity of high-fidelity simulations to provide accurate but
fast computations of, for example, the aerodynamic loads and aircraft performance.
A classical approach to reduce the numerical complexity is to simplify the physics.
An example of this is the common use of linear potential flow equations during
loads analysis. However, such physical model simplifications have the disadvantage
of neglecting significant effects such as transonic flow, stall, and friction drag in the
case of aerodynamics. This may be acceptable early on in the design process, while
more detailed analysis may be applied at a later stage when the design space has been
narrowed down sufficiently.
As an alternative to simplifying the physical model, reduced-order modeling
(ROM) provides another approach to reduce numerical complexity and computational
cost while providing answers accurate enough to support design decisions and to
perform quick trade studies. In general, the various ROM methods realize such a
goal by identifying a low-dimensional subspace or manifold based on an ensemble of high-fidelity solutions ("snapshots"), which sample a certain parametric domain of
interest. The number of degrees of freedom (DoF) is then reduced while retaining
the problem’s physical fidelity, thus allowing predictions of the required aerodynamic
data with lower evaluation time and storage than the original CFD model. ROMs
are anticipated to enable incorporating high-fidelity-based aerodynamic data earlier
into the design process. Methods for minimizing the number of expensive high-
fidelity simulations needed to extract a reduced-order model from simulation data
are sought after, including adaptive sampling strategies. Data classification methods
are of interest to gain more physical insight, for example, to identify (aerodynamic)
nonlinearities and to track how they evolve over the design space and flight envelope.
Finally, data-driven methods, including machine learning methods stemming from
other fields of research, that are able to extract relevant information from high-fidelity numerical and experimental data and to merge heterogeneous data can help to
arrive at a more consistent, homogeneous digital aircraft model. This chapter focuses
on reduced-order modeling approaches based on high-fidelity CFD in the context
of aerodynamic applications and multidisciplinary analysis and optimization. The
emphasis is on the basic ideas behind the methods, their accuracy and computational
efficiency, and their application to industrial aircraft configurations. The mathematical
formulation of the methods can be found in the corresponding references.
The basic ideas behind reduced-order modeling based on POD and Isomap are
explained briefly in the next section, followed by a description of their implemen-
tation. The last section is concerned with their application to different two- and three-
dimensional use cases.
partial differential equations (PDEs) onto the POD subspace to obtain a system of
ordinary differential equations (ODEs).
POD+I is a nonintrusive method as the interpolation technique does not require
any details on the underlying governing equations. ROM-predicted solutions are
determined by directly interpolating the coefficients of the POD modes, without
the need to solve any ODE system. It generally establishes a multidimensional
relationship between the modal coefficients or amplitudes and the parameter space,
for example, by fitting a radial basis function in the modal space to the set of snapshot
points in the parameter space. This requires simple interpolation of scalar-valued POD
basis coefficients to get a surrogate model of these coefficients as a function of the
parameters. To predict a flow solution at an untried combination of parameters, the
surrogate is evaluated for these parameters, and the predicted POD coefficients are
multiplied with the corresponding precomputed, invariant POD modes. This has the
advantage of simplicity of implementation and independence of the complexity of
the system and source of the modes being processed, which allows for application
to multidisciplinary problems and the combination of different data sources such
as CFD and experimental test results. The main disadvantage of nonintrusive POD
methods stems from their reliance on interpolation techniques to accurately reproduce
the possibly very nonlinear response surfaces of the modal coefficients.
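A minimal POD+I sketch with a radial-basis-function surrogate for the modal coefficients, using SciPy's RBFInterpolator, might look as follows (all names hypothetical; snapshots is an N_dof × m matrix, params the m × d matrix of parameter combinations):

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def build_pod_i(snapshots, params, q):
    """Nonintrusive POD+I: POD basis plus an RBF surrogate of the modal
    coefficients over the parameter space."""
    w_bar = snapshots.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(snapshots - w_bar, full_matrices=False)
    Phi = U[:, :q]                                 # first q POD modes
    coeffs = Phi.T @ (snapshots - w_bar)           # q x m modal coefficients
    surrogate = RBFInterpolator(params, coeffs.T)  # parameters -> coefficients

    def predict(p_new):
        a = surrogate(np.atleast_2d(p_new))[0]     # predicted POD coefficients
        return w_bar[:, 0] + Phi @ a               # reconstructed flow solution
    return predict
```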
Intrusive POD-based methods do better in this respect. This is done by solving
a nonlinear least-squares (LSQ) optimization problem for the modal coefficients
minimizing the steady (or unsteady) flow-solver residual of the governing equations
in POD subspace (Zimmermann & Görtz 2010). For the semi-discrete Navier–Stokes
equations,
∂w (t)
R̂ := R (w (t)) + = 0 ∈ RN , (16.1)
∂t
with a being the vector of the unknown POD coefficients a_i and w̄ the mean of the snapshot set. A greedy missing point estimation is used to select the subset. This approach is an alternative to POD-based ROMs based on Galerkin projection. In the following, such approaches will be referred to as POD+LSQ or LSQ-ROM.
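Schematically, the intrusive variant replaces interpolation by a nonlinear least-squares solve over the POD coefficients. A sketch for the steady case (the residual function is a hypothetical stand-in for the flow-solver residual R):

```python
import numpy as np
from scipy.optimize import least_squares

def lsq_rom_predict(Phi, w_bar, residual, a0):
    """Find POD coefficients a minimizing ||R(w_bar + Phi a)||_2."""
    def res(a):
        return residual(w_bar + Phi @ a)   # flow-solver residual in POD subspace
    sol = least_squares(res, a0)           # Gauss-Newton-type nonlinear LSQ
    return w_bar + Phi @ sol.x
```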
16.3 Implementation
Figure 16.1 Schematic of the ROM workflow: define training points in the parameter space (e.g., Mach number), compute training outputs with CFD, and perform order reduction via (a) POD or (b) Isomap; predictions at untried parameters are obtained via (a) interpolation, (b) residual minimization, or (c) a combined approach.
16.4 Applications
Figure 16.2 XRF-1 generic long-range transport aircraft (left) and NASA’s Common Research
Model (CRM) (right).
Figure 16.3 Steady Euler computation for the LANN wing compared to intrusive and
nonintrusive ROM predictions based on Isomap and POD, M = 0.81, α = 2.6◦ .
space. When combined with an interpolation model between the parameter space and
the embedding space, similar to POD with interpolation (POD+I), an Isomap-based
ROM (Franz et al. 2014), called Isomap+I, is obtained.
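The offline part of such an Isomap-based ROM can be sketched with scikit-learn (hypothetical names; the back-mapping from the embedding to full flow solutions, which Isomap does not provide out of the box, is the method-specific ingredient of Franz et al. (2014) and is omitted here):

```python
import numpy as np
from sklearn.manifold import Isomap
from scipy.interpolate import RBFInterpolator

def build_isomap_embedding(snapshots, params, n_neighbors=7, n_components=2):
    """Offline part of Isomap+I: nonlinear embedding of the snapshots and an
    interpolation model from the parameter space to the embedding space."""
    isomap = Isomap(n_neighbors=n_neighbors, n_components=n_components)
    Y = isomap.fit_transform(snapshots.T)      # one embedding point per snapshot
    to_embedding = RBFInterpolator(params, Y)  # parameters -> embedding coords

    def embed(p_new):
        return to_embedding(np.atleast_2d(p_new))[0]
    return embed
```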
Parametric ROMs for the steady transonic flow around the LANN wing were gen-
erated. The LANN wing is a well-known supercritical research wing model, built by the Lockheed-Georgia Company for the U.S. Air Force and tested at NLR and NASA Langley in 1983; the standard reference is the LANN (Lockheed, AFWAL, NASA-Langley, and NLR) Wing Test Program report, Acquisition and Application of Unsteady Transonic Data for Evaluation of Three-Dimensional Computational Methods (dtic.mil). Here, the ROMs were parametric with respect to the flow conditions. The parameter space, defined by variations of the angle of attack α and the Mach number Ma, was previously sampled in the range [α × Ma] = [1°, 5°] × [0.76, 0.82] using a "design of experiment" (DoE) based on a Latin hypercube sampling approach with 25 different α–Mach combinations. At these parameter combinations, the corresponding CFD
solutions were computed and used as input data for both ROMs. Figure 16.3 provides
a comparison of the predictions of Isomap+I and POD+I with the computed TAU
reference solution for the pressure coefficient at three wing sections for the LANN
wing in inviscid flow. The two different curves for the pressure coefficient distributions
correspond to the pressure and suction sides. The required parameters for Isomap+I
were determined automatically, yielding a two-dimensional embedding space and seven nearest neighbors as ideal for the embedding. Isomap+I yields better predictions than a POD-based interpolation method, in particular for the location and magnitude of the shock.
The structural sizing leads to a wing mass of 3,761 kg when sizing with TAU reference loads, 3,776 kg when sizing with the POD-based predictions, and 3,755 kg
when sizing the wing with Isomap-based predictions. The difference in wing mass is
therefore only 0.4% and 0.1%, showing that ROM-based predictions of aerodynamic
loads are of sufficient accuracy in the context of structural sizing.
Figure 16.5 Comparison of the results of optimizing the different regions of the wing structure of the rigid XRF-1 with the aerodynamic loads predicted by the POD-based ROM (red bars) and with the reference loads computed with the CFD code TAU (blue bars): thicknesses (m) at the wing spar optimization regions.
Since there are three wing spars, the thickness distribution plotted against the optimization region does not show a decreasing behavior as usual, but for each separate spar it does. However, close to optimization region 320 there is an outlier, which may be due to the fact that the inner spar, with corresponding optimization regions 288 to 303, ends and the loads are distributed to the two remaining spars. As can be seen, the Isomap- and POD-based ROMs provide good predictions of the thickness distributions at the wing spar optimization regions. In particular, the detailed views emphasize that there is almost no mismatch between the computed thicknesses of the predicted solutions and the computed thicknesses of the corresponding reference solutions. Thus, these ROMs are suitable in the context of MDO and should lead to a speed-up of the optimization process as mentioned above.

16.4.3 Reduced-Order Modeling for Static Aeroelastic Problems

Steady ROMs were used to predict the static aeroelastic loads for structural sizing within a multidisciplinary design and optimization context (Ripepi et al. 2018). They were also used in a process where an a priori identification of the critical load cases is of interest and the sheer number of load cases to be considered cannot be computed with high-fidelity CFD.
Within the framework of MDO, the CFD code TAU and a computational structural mechanics (CSM) solver are used to perform coupled fluid–structure interaction
(CFD–CSM) simulations to compute aerodynamic loads acting on the wing, which
are in turn used to design or optimize the structure (structural sizing). Since TAU
is repeatedly called for different structural models and different wing geometries to
evaluate the load cases during the optimization, the flow solver should be replaced by
a steady ROM for the elastic aircraft. Replacing the flow solver with a ROM has the
advantage that the snapshots necessary to build the ROM can be calculated offline, that
is, before the actual optimization begins. Once the snapshots have been computed, the
ROM itself can also be generated offline. During the optimization, the much-faster
ROM replaces the flow solver. This will help to speed up the overall design and
optimization process and allow more load cases to be considered in the sizing of the
structure. To set up the ROMs, snapshots (TAU solutions) for different aerodynamic
shapes and structural modes are required. Here, a ROM for the static aeroelastic load
prediction of the XRF-1 wing-fuselage configuration was created, that is, the shape
was not varied.
For this purpose, 21 coupled computations with a structured mesh suitable for
RANS simulations with one million mesh points (Figure 16.6, left) and an ANSYS
finite element model with 2,167 elements (Figure 16.6, right) for different altitudes
and Mach numbers at a load factor of 2.5g were performed to create snapshots for the
wing deformation and the aerodynamic load in terms of the pressure distribution. The
21 points in the parameter space are shown in Figure 16.7 as blue symbols.
Figure 16.6 Structured CFD mesh (left) and structural finite element model of the wing, showing the front, rear, and middle spars and the ribs (right). Reprinted from Ripepi et al. (2018) by permission from Springer Nature.
The coupled calculations were carried out in such a way that the lift counteracts the weight and inertial forces due to the load factor, which corresponds to the ratio of the lift of an aircraft to its weight and represents a global measure of the stress or "load" to which the structure of the aircraft is subjected. The snapshots were used to set up a ROM based on POD or Isomap and thin-plate spline (TPS) interpolation. The calculation of a snapshot with TAU and ANSYS took 60 minutes on 48 cores, while the generation of the ROM took one minute on one core, and the ROM prediction of the surface pressure distribution and deformation ran in real time on one core.
Figure 16.7 Sample locations for ROM generation in the parameter space (blue) spanned by Mach number and altitude and prediction points (red). Reprinted from Ripepi et al. (2018) by permission from Springer Nature.
Before computing the ROM predictions, a leave-one-out cross-validation strategy has been performed to understand if the set of sample points was enough to cover the parameter space. Following this strategy, alternately one of the high-fidelity snapshots of the DoE sample set has been left out from the ROM generation procedure, which is then built using the remaining DoE sample snapshots as the training set (retaining all the POD modes). In the corresponding flight condition of the left-out snapshot (i.e., the validation point), the ROM prediction
Figure: Comparison of the pressure distribution at the 25, 50, and 75% spanwise sections for a Mach number of 0.81 and 4,000 m altitude (visualization on jig-shape). Reprinted from Ripepi et al. (2018) by permission from Springer Nature.
Figure: Sizing load cases in the flight envelope (Mach number versus altitude, bounded by stall speed, dive speed, and load factors 2.5g and -1g): load cases identified as critical with high-fidelity CFD–structure coupling (Hi-Fi predicted sizing load cases) and with the ROMs (ROMs predicted sizing load cases), together with the snapshot locations.
First, the so-called synthetic modes, an established concept for unsteady aeroelastic investigations including low-fidelity aerodynamic models (Voss et al. 2011), are determined. Second, samples are computed with TAU for the different synthetic mode shapes based on a Halton sampling strategy (Kuipers & Niederreiter 2005), and a POD basis is computed, which contains the deformations corresponding to the mode shapes as well as the surface forces computed with TAU. Finally, surface forces for a given deformation of interest can then rapidly be obtained by solving a least-squares problem, minimizing the difference between the target surface deformation given by the current structural model of interest and the deformation entries within the POD modes to find a set of POD coefficients. The resulting POD coefficients can then be used to compute the surface force by a simple matrix–vector product. A more in-depth theoretical introduction, including synthetic mode generation and a formulation of the constraint least-squares optimization problem, can be found in Bekemeyer et al. (2019). Here results are presented for a long-range passenger aircraft with a semi-
tion in each iteration step to minimize weight while respecting stress constraints. The final deformation behavior of the wing for a transonic load at M = 0.784 is compared between the proposed reduced-order model approach and a fully coupled CFD–CSM analysis, with excellent agreement. The final skin thicknesses also show only minor deviations around the engine-pylon wing junction. However, the online cost of the ROM-based analysis is nearly a factor of 2,000 lower than that of its full-order equivalent.
Figure: Autoencoder-based order reduction, i.e., x ∈ R^n ↦ x̃ ∈ R^ñ, realized with a multilayer artificial neural network, and comparison of TAU reference solutions with POD, cluster POD (C-POD), autoencoder (AE), and Isomap predictions of the surface pressure coefficient for the NACA64A010 airfoil at α = 8.85, Mach = 0.79 (snapshots and prediction points in the α–Mach plane).
Figure 16.14 Workflow of the unsteady gust ROM: training inputs (gust lengths Lg and gust amplitudes vg), training outputs computed with CFD, POD model w(t) = Φa (Φ: POD modes), and residual minimization min_a ‖R(Φa + w̄)‖₂² to obtain the coefficients a.
to obtain a subspace representation using POD. Second, a subset of the grid points
is computed by combining the classical discrete empirical interpolation method (DEIM) with a greedy missing point estimation. This subset of grid points is used
to restrict the evaluation of the unsteady flow solver residual to a small number of
points during the online prediction phase for reasons of computational efficiency
(Bekemeyer et al. 2019) (step 5 in Figure 16.14). The resulting model can then be
used to simulate gust response behavior at varying parameter combinations such as
gust length and amplitude by minimizing the unsteady residual on the subspace at
each time step. ROM results are compared to a time-marching reference solution
for a 1-cos gust with length of 213 m and amplitude as defined by international
certification authorities at cruise conditions for NASA’s CRM, compare Figure 16.2.
In addition, the linearized frequency domain solver (LFD) based on the TAU code
(Thormann & Widhalm 2013) was adopted to efficiently predict the dynamic behavior
of the CRM assuming transonic and mildly separated flow and dynamically linear
response behavior. The LFD offers a significant reduction
in computational effort for small-perturbation problems while retaining the fidelity
of the RANS flow solutions. The lift coefficient, pitching moment coefficient, and
the differences in surface pressure distribution at the time step corresponding to
the maximum lift coefficient are shown in Figure 16.15. Throughout, the unsteady
ROM accurately predicts the full-order model (FOM) behavior, whereas the linearized
approach deviates once flow separation occurs. Significant discrepancies between the
three approaches are observed in the surface pressure distribution.
Figure 16.15 Comparison of gust response behavior of the CRM for the full-order model
(FOM), unsteady nonlinear ROM and linear frequency domain (LFD) solver. Lift and pitching
moment coefficient (left), surface pressure differences at maximum lift coefficient (right).
16.5 Conclusions
Parametric ROMs based on POD, cluster POD, and Isomap were developed and
successfully applied to a variety of two- and three-dimensional test cases, including
two industrial aircraft configurations, to predict aerodynamic loads over a range of
flow or operating conditions as well as for changing geometry parameters. Isomap
is a dimensionality reduction technique, which is based on a nonlinear “manifold
learning” method. In this method, it is assumed that the CFD solutions form a
nonlinear manifold, which in turn constitutes a sub-manifold of lower intrinsic
dimensionality. Isomap yields better predictions than a POD-based ROM method,
in particular for the location and magnitude of the shock wave at transonic flow
conditions. This is due to the fact that Isomap makes predictions based on a local
subspace, similar to cluster POD, while POD-based predictions make use of a global
subspace.
17.1 Introduction
Controlling turbulence for engineering goals is one of the oldest and most fruitful
academic and technological challenges. Engineering applications have epic proportions. The infrastructure of any industrial nation requires flawless pipe systems with
turbulent flows of drinking water, gas, oil, and sewage. Drag reduction in these pipe
flows spells energy saving for operation. Transport-related examples include drag
reduction of road vehicles, airborne transport, ships, and submarines, lift increase
of airfoils, and associated noise reduction. Energy and production-related examples
include efficiency increase of harnessing wind and water energy, heat transfer,
chemical, and combustion processes.
Closed-loop active control is increasingly investigated for performance improve-
ments in industrial applications. Several trends support this development. First,
aerodynamic shapes and passive control have been optimized for over 100 years.
Thus, further optimization efforts may yield diminishing returns. Second, in many
cases the engineering challenge is not the cruise or normal operating condition but
short-term and rare events. For wind turbines, the effect of gusts needs to be mitigated.
Aeroengines need to be safe during 60 to 90 seconds of takeoff, where they can
produce six times the thrust required during cruise. The large nacelle is only required
for this takeoff and may be replaced by a smaller version with active control. Third,
actuators and sensors become increasingly cheaper, more powerful, more reliable, and
thus more economical. Finally, modern control methods can harness increasingly more
complex dynamics, last but not least through the powerful augmentations by ML.
Goals, tools, and principles of turbulence control are reviewed in Section 17.2.
This chapter focuses on the control logic. The classical paradigm is from under-
standing to dynamic modeling to model-based control design to controller tuning in
the full plant. Key applications are first (relaxational) and second-order (oscillatory)
dynamics. The nonlinear dynamics need to be linearized or at least sufficiently
understood. In Section 17.3, we present simple human-interpretable POD-based
models showing the spectrum from linear to strongly nonlinear dynamics explaining
a large range of current experiments.
Building a control-oriented model is a challenge for shear turbulence with a
large range of temporal and spatial scales and their nonlinear interactions. The
challenge becomes even larger for spatially distributed control with many actuators
and many sensors. Here, powerful methods of ML may fully invert the classical
control paradigm. First, the near-optimal control law is automatically learned in the
full plant without any model. Then, simpler control laws with similar performance
are identified. The performance-winning mechanism is then dynamically modeled,
leading to a deeper understanding of the flow. In Section 17.4, different realizations
of this MLC are described. The chapter concludes with a MLC tutorial for the control
of nonlinearly coupled oscillators (Section 17.5). For more information, we refer to
our recent reviews (Brunton & Noack 2015, Brunton et al. 2020) and our textbooks
on model-based control (Noack et al. 2011) and MLC (Duriez et al. 2017) and the
literature cited therein.
17.2.1 Goals
Turbulent flows around cars, trucks, trains, airplanes, helicopters, ships, and sub-
marines affect their performance. An important engineering goal is drag reduction
or, more precisely, the reduction of net propulsion power and thus energy saving.
Airplanes during takeoff and landing require high lift at low velocities. Here, a
goal is to achieve lift increase with minimal means, for example, low extra weight
and small material fatigue. During cruise, the increase of lift-to-drag ratio leads to
reduced fuel consumption, and hence reduced environmental impact and increased
profitability. Ground and airborne traffic produce undesirable noise. Here, noise reduc-
tion is another important goal. Chemical production, combustion processes, and heat
exchangers profit from mixing increase. Another important goal is the stabilization
of flow oscillations. Examples include flutter and buffeting of airfoils, cavity noise
(Rossiter modes) associated with landing gears of airplanes and detrimental vortex
shedding around buildings, chimneys, bridges, the pillars of oil-producing platforms,
underwater pipelines, and heat-exchanging pipes in nuclear plants.
The goals of drag reduction, lift increase, mixing increase, noise reduction, and
oscillation mitigation may be served by separation prevention, skin-friction reduction,
and manipulation of free shear flows. As a rule of thumb, in most flow control strate-
gies the transverse mixing of free and wall-bounded shear flows is increased (drag
reduction, mixing increase) or decreased (skin friction reduction, noise reduction,
oscillation prevention). We refer to the excellent discussion of underlying flow physics
by Gad-el Hak (2007).
17.2.2 Tools
Flow control performance during cruise or standard operating conditions is well
served by shape optimization under given constraints. A ubiquitous example is cars,
which have become increasingly more aerodynamic over the past hundred years.
The performance may be further improved by passive devices, that is, small changes
that require no energy input. Feathers at the end of arrows stabilize the flight and
increase the range – perhaps the first example of man-made passive control. Another
example is vortex generators on the wing of a Boeing 737 to prevent early separation
with associated lift decrease and drag increase. The efficiency of wind turbines may
also profit by up to 7% from SmartBlade vortex generators. Riblets reduce skin-
friction drag by up to 11%. The chevron-shaped engine nozzles reduce jet noise by 1–3
dB. Spoilers at the trailing edge of a car stabilize aerodynamic forces and reduce drag.
Vanes in ducts and at airplanes redirect the flows in a performance increasing way.
Passive devices may come with parasitic drag during cruise or standard operating
conditions. In many cases, passive devices, like vortex generators, may be emulated by
active (energy-consuming) devices, like fluidic jets. Active control has the advantage
that it can be turned on or off and even operate at a large range of amplitudes and fre-
quencies. This dynamic range often yields performance benefits in contrast to steady
operation. Active control may be operated open-loop, that is, be blind to the flow
state. Yet there may be significant potential in combining unsteady active control with
sensor-based flow monitoring. In the sequel, we will focus on this closed-loop control.
The choice of the actuators, their kind, amplitude, frequency range, and placement
has a large effect on control efficiency. Actuators are often placed at high-receptivity
locations, for example, at the trailing edge of a bluff body, where the separating shear
layer can be easily manipulated. The coherent structures associated with boundary-
layer transition and skin friction can be directly opposed with wall motion or wall
actuation. The actuation may be based on blowing, suction, and zero-net mass flux,
each having its distinct advantages and disadvantages.
Yet, this choice is largely based on engineering experience with little guidance from
mathematical and physical methods. The closest to a first-principle based method is
the volume force optimization of linearized Navier–Stokes dynamics. Making a good
choice of actuation is a challenge. Scaling to full-scale engineering configurations is the next task, again with little or no guidance from available
theories.
Optimizing sensor placement is easier and may be guided by a direct numerical
simulation or particle image velocimetry measurements. Yet, the choice of the
optimization criterion is far from trivial. The chosen sensors may, for instance, be
placed at locations to optimize (1) linear flow estimation, (2) linear-quadratic flow
estimation, (3) Kalman filter-based flow estimation, (4) a dynamic observer based
on a reduced-order model, (5) the estimate of the observable subspace (balanced
truncation), or, in the most general case, (6) the feedback control performance for
all considered control laws. In the following, we assume a specified control goal and
a given actuator and sensor configuration.
17.2.3 Principles
The choice of the actuators and sensors is – lacking a computationally accessible
mathematical method – always guided by a conceptual idea of the control mechanism.
A long list of hypotheses can be produced. Skin-friction reduction relies on the
suppression of sweeps and ejections (opposition control). Separation can be delayed
by increasing the turbulence level (destabilizing control). Heat exchangers also rely
on mixing increase. Vortex shedding behind bluff bodies can be mitigated with phasor
control, or exciting larger or smaller frequencies and blowing the wake away (Coanda
blowing, base bleeding). A few general dynamics principles can be distilled.
Kill the monster while it is little! The ideal scenario of stabilizing control is that the
instability is successfully fought in statu nascendi. This requires vanishing
energy in a noise-free environment and a noise-dependent level otherwise.
Examples are mitigation of a Tollmien–Schlichting instability with local
opposition control, the reduction of skin-friction drag by opposing the wall-
normal velocity of sweeps and ejections and Coanda blowing / flow vectoring
to symmetrize the wake.
Support the enemy of your enemy! In some cases, the actuation may be too weak
or distant for a direct stabilization. We might exploit that it is much easier
to excite a coherent structure than to prevent an instability. Vortex shedding
behind a cylinder may be mitigated by exciting shear-layer vortices at high
frequency. In this case, the shear-layer structures decrease the gradient of the
mean flow and hence the growth rate of vortex shedding. The energized shear
layer is the enemy of the enemy (vortex shedding).
Blow the instability away! A not-so-subtle control is based on blowing the instabil-
ity away. The vortex shedding behind a D-shaped cylinder is successfully
reduced by Coanda forcing, that is, redirecting the flow into the near wake.
Support your friend! In some cases, naturally occurring coherent structures serve
the control goal, that is, they are your friends. Kelvin–Helmholtz vortices
over a backward-facing step reduce the recirculation zone. An amplification
of these vortices reduces the length of this zone even further.
Manipulate the sociology of coherent structures! Many aerodynamic goals
have a well-founded conceptual picture of the beneficial and detrimental
coherent structures to be augmented or mitigated. For mixing and noise of
broadband turbulence, such a picture is largely missing. Mixing and noise are based on a longer-term integration of many different structures. This is the
classic case for model-free exploration and exploitation of control.
To illustrate these principles, let us consider the following dynamical system coupling a self-amplified, amplitude-limited oscillator (a1, a2) and a stable linear oscillator at
10-fold frequency (a3, a4 ) inspired by several wake and airfoil models (Luchtenburg
et al. 2009, Semaan et al. 2015):
da1/dt = σa1 − a2,    da3/dt = −0.1a3 − 10a4,
da2/dt = σa2 + a1 + b1,    da4/dt = −0.1a4 + 10a3 + b2, (17.1)
σ = 0.1 − a1² − a2² − a3² − a4² − b3.
Without control, b1 = b2 = b3 = 0, the first oscillator converges to the limit cycle a1² + a2² = 0.1 with unit frequency, while the second one vanishes, a3 = a4 = 0. We will ignore the stable oscillator (a3, a4) unless it is excited with b2. The first oscillator can be stabilized with the phasor control b1 = −0.4 a2. This oscillator can also be indirectly stabilized by exciting the second oscillator with b2 = k a4 to a fluctuation level above 0.1 so that σ < 0. Constant blowing is mimicked by b3 = 0.2, leading to σ ≤ −0.1, that is, stabilization.
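These stabilization routes are easy to verify numerically. A minimal sketch of (17.1) under the phasor control b1 = −0.4 a2, using SciPy's solve_ivp (the other inputs switch on analogously):

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, a, phasor=True):
    a1, a2, a3, a4 = a
    b1 = -0.4 * a2 if phasor else 0.0   # phasor control on the first oscillator
    b2, b3 = 0.0, 0.0                   # high-frequency forcing / blowing off
    sigma = 0.1 - a1**2 - a2**2 - a3**2 - a4**2 - b3
    return [sigma * a1 - a2,
            sigma * a2 + a1 + b1,
            -0.1 * a3 - 10.0 * a4,
            -0.1 * a4 + 10.0 * a3 + b2]

a0 = [np.sqrt(0.1), 0.0, 0.0, 0.0]      # start on the unforced limit cycle
sol = solve_ivp(rhs, (0.0, 100.0), a0, max_step=0.01)
print("final fluctuation level:", np.sum(sol.y[:2, -1] ** 2))  # decays to ~0
```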
Despite the powerful tools for linear model reduction and control, the assumption
of linearity is often overly restrictive for real-world flows. Turbulent fluctuations are
inherently nonlinear, and often the goal is not to stabilize an unstable fixed point but
rather to change the nature of a turbulent attractor. Moreover, it may be the case that
the control input is either a bifurcation parameter itself, or closely related to one, such
as the control surfaces on an aircraft.
The degree of nonlinearity is most easily characterized in a Galerkin modeling
framework discussed in Chapters 1 and 15. We assume a Galerkin model with the
steady solution u s as basic mode and N expansion modes ui , i = 1, . . . , N,
u(x, t) ≈ û(x, t) = u_s(x) + Σ_{i=1}^{N} a_i(t) u_i(x). (17.2)
Thus, the state is approximated by the vector a = [a1, . . . , a N ]T comprising the modal
amplitudes. The Galerkin system reads
da_i/dt = f_i(a, b) = Σ_{j=1}^{N} l_ij a_j + Σ_{j,k=1}^{N} q_ijk a_j a_k + g_i b. (17.3)
Here, the constant term vanishes identically, because the steady Navier–Stokes solution
u s corresponds to the fixed point a s = 0 in the Galerkin framework. Following the
example of Chapter 1, a single linear control input gi b is added.
In the following sections, three prototypic examples are discussed: (1) an oscillation around the fixed point (Section 17.3.1); (2) a self-excited amplitude-limited oscillation (Section 17.3.2); and (3) frequency cross-talk between two different frequencies coupled over the base-flow deformation (Section 17.3.3). More details and the remaining irreducible cases are elaborated in Brunton and Noack (2015).
Here, ui , i = 1, 2 may correspond to the real and imaginary part of the unstable
complex eigenmode or are distilled from the oscillatory data. Higher harmonics are
neglected. By construction, the stable or unstable fixed point is a s = 0.
The linearized version of the dynamics (17.3) reads
(d/dt) a = A0 a + B b, where A0 = [σᵘ, −ωᵘ; ωᵘ, σᵘ], B = [0, g]^T, (17.5)
where λ = σᵘ ± iωᵘ denotes the complex conjugated eigenvalue pair of the linear system. Without loss of generality, the modes can be rotated so that the gain in the first component vanishes. A similar equation holds for the measurement equation. The matrices A0 and B depend on the fixed point a_s. The linear dynamics (17.5) may also be an acceptable approximation for turbulent flows with dominant oscillatory behavior, with small modifications (Brunton & Noack 2015).
Control design for the linear system (17.5) can be performed with one of many control theory methods (Åström & Murray 2010). Here, we illustrate the idea of
energy-based control, which is particularly suited for nonlinear dynamics. Moreover,
The fluctuation has the same representation as in Section 17.3.1. However, the base
flow is allowed to vary by another mode u3 , called the 0th, base deformation, or shift
mode (Noack et al. 2003). This mode can be derived from the (linearized) Reynolds
equation and is assumed to be slaved to the fluctuation level. The resulting velocity
field ansatz reads

û(x, t) = u_s(x) + a1(t) u1(x) + a2(t) u2(x) + a3(t) u3(x).

The corresponding dynamics read

(d/dt) [a1, a2]^T = A(a3) [a1, a2]^T + B b, A(a3) = A0 + a3 A3, (17.8a)
a3 = αᵘ (a1² + a2²), (17.8b)
where

A0 = [σᵘ, −ωᵘ; ωᵘ, σᵘ], A3 = [−βᵘ, −γᵘ; γᵘ, −βᵘ], B = [0, g]^T. (17.9)
Without loss of generality, αu > 0. Otherwise, the sign of the mode u3 must be
changed. A nonlinear amplitude saturation requires the constant βu > 0 to be positive.
For a3 ≡ 0, (17.8) is equivalent to (17.5). However, (17.8) has a globally stable limit cycle with radius r∞ = √(σᵘ/(αᵘβᵘ)) in the plane a3 = a3∞ = σᵘ/βᵘ, with center (0, 0, a3∞).
In the framework of weakly nonlinear stability theory, the growth rate is considered
a linear function of the Reynolds number σu = κ(Re − Rec ). Here, Rec corresponds to
its critical value where the steady solution with damped oscillations (σu < 0) becomes
unstable (σu > 0) in a Hopf bifurcation, giving rise to self-amplified oscillatory
fluctuations. The other parameters are considered to be constant. This yields the
famous Landau equation for the amplitude dr/dt = σ u r − βr 3 , β = αu βu and a
corresponding equation for the frequency. The Landau equation explains the famous
√
square root amplitude law r ∝ Re − Rec for supercritical Reynolds numbers
assuming a soft bifurcation. We refer to the literature for a discussion of the hard
subcritical bifurcation with quintic nonlinearity (Stuart 1971). Intriguingly, even
turbulent flows with dominant periodic coherent structures may be described by (17.8)
(Bourgeois et al. 2013).
The mean-field model and variants thereof have been successfully used for the
model-based stabilization of the cylinder wake at Re = 100 (Gerhard et al. 2003,
Tadmor et al. 2011). In these studies, an energy-based control design as discussed in
Section 17.3.1 has been used to prescribe a fixed decay rate of the model.
where

A0 = [σᵘ, −ωᵘ, 0, 0; ωᵘ, σᵘ, 0, 0; 0, 0, σᵃ, −ωᵃ; 0, 0, ωᵃ, σᵃ],
A5 = [−βᵘᵘ, −γᵘᵘ, 0, 0; γᵘᵘ, −βᵘᵘ, 0, 0; 0, 0, −βᵃᵘ, −γᵃᵘ; 0, 0, γᵃᵘ, −βᵃᵘ],
A6 = [−βᵘᵃ, −γᵘᵃ, 0, 0; γᵘᵃ, −βᵘᵃ, 0, 0; 0, 0, −βᵃᵃ, −γᵃᵃ; 0, 0, γᵃᵃ, −βᵃᵃ],
B = [0, 0, 0, g]^T.

1 J.-L. Aider, private communication.
The forcing stabilizes the natural instability if and only if βᵘᵃ > 0. Complete stabilization implies A11 ≤ 0. From (17.12), such complete stabilization is achieved with a threshold fluctuation level at the forcing frequency,

a3² + a4² ≥ σᵘ / (αᵃ βᵘᵃ).
Thus, increasing the forcing at a higher or lower frequency beyond this threshold can suppress the oscillation at the natural frequency.
Figure 17.1 Flow visualization of the experimental wake behind a D-shaped body without (a)
and with symmetric low-frequency actuation (b), reproduced with permission from Pastoor
(2008). The D-shaped body is indicated in gray; the red squares mark the location of the
pressure sensors, and the blue arrows indicate the employed ZNMF actuators.
The generalized mean-field model has been fitted to numerical URANS simulation
data of a high-lift configuration (Luchtenburg et al. 2009) with high-frequency forcing.
This model also accurately describes the experimental turbulent wake data with a
stabilizing low-frequency forcing (Aleksic et al. 2010) as shown in Figure 17.1. The
generalized mean-field model, as expressed by (17.10) and (17.11), may
guide in-time control (Duriez et al. 2017) and adaptive control design, providing the
minimum effective actuation energy (Luchtenburg et al. 2010). In principle, (17.11)
can be generalized for an arbitrary number of frequencies.
Model-based control has success stories in turbulence experiments for first- and second-order dynamics, as outlined in the previous section. Control-oriented models for broadband turbulence remain quite a challenge. Hence, most experimental turbulence control approaches are model-free, albeit typically restricted to tuning one or a few actuation parameters. This section outlines the recently developed MLC approaches, which solve a much more general problem: the optimization of general nonlinear multiple-input multiple-output (MIMO) control laws in an automated manner in the full plant without a model. First (Section 17.4.1), cluster-based control (CBC) is described, allowing the rapid learning of simple, smooth nonlinear control laws. Second (Section 17.4.2), linear genetic programming control is presented, allowing rather complex control laws to be optimized at the cost of more testing.
Figure 1 of Li et al. (2018) illustrates the approach. A new MIMO control law
(17.13) is tested for a given time in a fast control loop. The typical timescale from
sensing to actuation is of the order of milliseconds in experiments. The performance
J of the control is optimized in a slow outer loop. In aerodynamic experiments, the
testing of one control law typically takes 5 to 10 seconds. After a few hundred up to
one thousand control tests, the optimization is typically converged for O(10) sensors
and up to O(10) actuators.
The optimization is based on LGP, starting with a first generation of $N_i = 100$ random control laws $K_i^1$, $i = 1, \dots, N_i$. All these laws are tested in the experiment, yielding costs $J_i^1$, and sorted according to performance, $J_1^1 \le J_2^1 \le \cdots \le J_{N_i}^1$. Then, the $N_e = 1$ best control law $K_1^1$ is directly copied into the new generation. The other laws are obtained from genetic operations that preserve memory (replication), tend to breed better laws (crossover), and explore new minima (mutation). The new generation is again tested and sorted. The iteration stops after a prescribed convergence criterion is reached. Typically up to $N_g = 10$ generations are required. We refer to the exquisitely detailed description of LGP in Wahde (2008) and, for control applications, to Duriez et al. (2017).
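The generational loop just described can be summarized in a short Python sketch. This is a schematic, not the actual LGP implementation of Duriez et al. (2017): the operator probabilities and the selection scheme are illustrative assumptions, and evaluate, random_law, crossover, and mutate are placeholders for the plant evaluation and the genetic operators.

import random

def lgpc(evaluate, random_law, crossover, mutate,
         Ni=100, Ne=1, Ng=10, p_rep=0.1, p_mut=0.3):
    population = [random_law() for _ in range(Ni)]   # first generation
    for _ in range(Ng):
        population.sort(key=evaluate)      # test in the plant, best first
        next_gen = population[:Ne]         # elitism: copy the best law(s)
        while len(next_gen) < Ni:
            r = random.random()
            if r < p_rep:                             # replication (memory)
                next_gen.append(random.choice(population))
            elif r < p_rep + p_mut:                   # mutation (exploration)
                next_gen.append(mutate(random.choice(population)))
            else:                                     # crossover (breeding)
                a, b = random.sample(population, 2)
                next_gen.append(crossover(a, b))
        population = next_gen
    return min(population, key=evaluate)

In a real experiment, the cost of each tested law would of course be cached rather than re-evaluated at every sort.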
Figure 17.2 LGPC learning curve of jet mixing optimization. For details see Zhou et al.
(2020).
Figure 17.2 shows the learning of a distributed actuation at a nozzle exit for jet
mixing optimization. In 11 generations, the automated LGPC learning has discovered axisymmetric forcing, helical forcing, and flapping forcing before discovering a much better performing, hitherto unknown combined actuation. For details, the reader is
referred to Zhou et al. (2020).
In this section, MLC is exemplified for a dynamical system using the xMLC software.
The tutorial begins with a short description of the generalized mean-field model
(GMFM), a benchmark dynamical system of Duriez et al. (2017). Then, the download
and the installation of the xMLC software are explained. xMLC capabilities and
functions are illustrated by control design for the GMFM. Section 17.5.4 gives advice for user-defined problems.
$$J = J_a + \gamma J_b, \qquad
J_a = \frac{1}{T_{max}} \int_0^{T_{max}} \left( a_1^2 + a_2^2 \right) \mathrm{d}t, \qquad
J_b = \frac{1}{T_{max}} \int_0^{T_{max}} b^2 \, \mathrm{d}t. \tag{17.15}$$
The penalization parameter γ is taken to be 0.01. The cost function is computed for Tmax = 100 T1, where T1 = 2π is the period of the first oscillator. The initial condition is set to $[a_1, a_2, a_3, a_4]^T = [\sqrt{0.1}, 0, 0, 0]^T$, that is, on the unforced limit cycle (radius $\sqrt{0.1}$), with the second oscillator at the fixed point $[0, 0]^T$. The dynamics of the GMFM without control is illustrated in Figure 17.3.
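As a complement to the MATLAB implementation shipped with xMLC, the following Python sketch integrates a GMFM of the form used as a benchmark by Duriez et al. (2017) and evaluates the unforced cost Ja of (17.15). The coefficients (growth rate 0.1, oscillator frequencies 1 and 10) are assumptions based on that reference; the authoritative definition lives in the GMFM problem file of the xMLC distribution.

import numpy as np
from scipy.integrate import solve_ivp

def gmfm(t, a, control):
    # Unstable oscillator (a1, a2) and stable, actuated oscillator (a3, a4),
    # coupled through a common growth rate (coefficients are assumed values)
    a1, a2, a3, a4 = a
    sigma = 0.1 - a1**2 - a2**2 - a3**2 - a4**2
    b = control(t, a)
    return [sigma*a1 - a2, sigma*a2 + a1,
            sigma*a3 - 10.0*a4, sigma*a4 + 10.0*a3 + b]

T1 = 2.0*np.pi                              # period of the first oscillator
Tmax = 100.0*T1
b_off = lambda t, a: 0.0                    # unforced case
a0 = [np.sqrt(0.1), 0.0, 0.0, 0.0]          # on the unforced limit cycle
sol = solve_ivp(gmfm, (0.0, Tmax), a0, args=(b_off,),
                max_step=T1/100.0, dense_output=True)
t = np.linspace(0.0, Tmax, 10001)
a1, a2 = sol.sol(t)[:2]
Ja = np.trapz(a1**2 + a2**2, t)/Tmax
print(f"Unforced cost Ja = {Ja:.3f}")       # ~0.1, the squared cycle radius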
17.5.2 Installation
In this section, the necessary steps to install the xMLC software are presented.
Figure 17.3 Unforced dynamics of the GMFM: actuation command b (top left), growth rate σ (top right), phase space of the unstable oscillator (bottom left), and phase space of the stable oscillator (bottom right). The last period (T1) is depicted in red. The circle (◦) and the dot (•) represent, respectively, the initial condition and the final state of the oscillators; for the uncontrolled system, they overlap. This figure has been plotted with MATLAB with 100 points per period and 100 periods.
Requirements
The xMLC software has been coded in MATLAB version 9.2.0.556344, R2017a (The MathWorks Inc., Natick, Massachusetts, 2017) and Octave version 4.2.2, 2018 (John W. Eaton and others). Thus, any later version should be compatible with the
software. No particular MATLAB package is needed for the functioning of the
software. For Octave, the package communications is needed. It is freely available
on the official webpage https://octave.sourceforge.io. It can also be downloaded
directly from the graphic user interface with the pkg function.
Installation
The xMLC software can be downloaded from the following link:
https://BerndNoack.com/programs/xMLC_v0.9.2.tar.gz
Once downloaded, untar the xMLC_v0.9.2.tar.gz file and copy the xMLC_v0.9.2 folder where it is needed. Installation is then complete.
To ensure compatibility with MATLAB or Octave, some specific files must be used. Those files are stored in the Compatibility/ folder. To change from MATLAB to Octave, or conversely, execute the adequate bash file MatlabCompat.sh or OctaveCompat.sh in the Compatibility/ folder. If a non-Linux system is used, copy the files from Compatibility/MATLAB/ or Compatibility/Octave/ to the corresponding folders.
For further information about the content of the xMLC_v0.9.2 folder, please consult the README.md file.
17.5.3 Execution
To use the xMLC software, launch a MATLAB or Octave session in the xMLC_v0.9.2 folder. Then run the Initialization.m script, having replaced the files for compatibility as outlined in Section 17.5.2. This script loads all the necessary paths and creates an xMLC class object with the GMFM benchmark problem. This object is stored under the variable mlc by default. To create a new xMLC class object with the default parameters, use the following command: MyMLC = MLC;.
The following section describes some of the basic xMLC parameters and functions.
All the commands are compatible with MATLAB and Octave except the saving and
loading functions that are specific for each software.
xMLC parameters
Once the Initialization.m script is launched, a small description of the problem to be solved is printed. It contains information about the number of inputs (controllers), the number of outputs (sensors), the population size, and the strategy (genetic operator probabilities). This is shown each time a new instance of an xMLC class object is created. To show it again, use the show_problem method with the command mlc.show_problem;. The mlc object has four properties:
• population: contains all the information and database references for each generation of individuals;
• parameters: where all the parameters are defined, be it for the problem, the control law description, or the xMLC parameters;
• table: the database; it contains all the individuals explored, although not all of them are evaluated, thanks to screening options;
• generation: an integer referring to the current generation. It is set to 0 by default when the population is empty.
You can access and modify xMLC parameters with the following commands:
mlc.parameters.MutationProb = 0.5;
mlc.parameters.ReplicationProb = 0;
mlc.parameters.PopulationSize = 50;
Problem parameters and control law parameters should not be modified directly before starting a run; only the xMLC parameters above should be adjusted here. To change the problem and control law parameters, see Section 17.5.4.
My first run
Once the xMLC parameters are appropriately set, the optimization process can be launched. The evolution of the population of individuals is done with the go method. A number of generations can be given as an argument to perform several evolution iterations. With no argument, one evolution step is performed, leading to the evaluation of the newly generated individuals. Once the process is over, the xMLC object can be saved to be used later. All runs are saved in the save_runs/ folder under their name and with their compatibility (MATLAB or Octave). The folders are created automatically.
% To compute the next generation and evaluate the new individuals:
mlc.go;
% To compute the next 5 generations:
mlc.go(5);
% To continue with 5 more generations (up to generation 10):
mlc.go(10);
% Change the name of the run and save:
mlc.parameters.Name = 'MyRun';
mlc.save_matlab; % for MATLAB
mlc.save_octave; % for Octave
% Load the run:
mlc.load_matlab('MyRun'); % for MATLAB
mlc.load_octave('MyRun'); % for Octave
for loops in Octave are not as optimized as in MATLAB. Hence, the evaluation of the individuals may take much more time with Octave than with MATLAB. The reader is invited to optimize the evaluation of the individuals following one's needs by adjusting the evaluation parameters in mlc.parameters.ProblemParameters or in the GMFM_problem.m file; see Section 17.5.4.
For more information about the best control law, use the command mlc.table.individuals(IDN);, where IDN is the ID number of the best individual. Other features, such as the learning process or the Pareto front, can be extracted thanks to the command mlc.convergence;, see Figure 17.5. Figure 17.5(a) shows that the performance of the best control law improves throughout the generations, as does the overall performance of the population. From the first to the last generation, the cost function drops by one order of magnitude. In Figure 17.5(b), the latest generations are the ones pushing the Pareto front forward, optimizing Ja and Jb, even though the penalization parameter is only γ = 0.01.
Figure 17.4 Same as Figure 17.3 for the best control law computed after the evolution of 100
individuals through 10 generations.
Figure 17.5 Two figures obtained thanks to the convergence method for the 100 × 10
GMFM runs. Those plots show the distributions of the individuals following their performance
and cost throughout the generations. Plotted with MATLAB. (a) Cost distribution for each
generation. The color represents how each individual has been created. The red solid line
represents the best individual for each generation, and the green dashed line follows the
median performance. (b) Zoom-in on the Pareto front (black dashed line) for the 1000
individuals. Each dot represents one individual, the color codes the generation.
This chapter reviews the recent applications of deep reinforcement learning (DRL) to
the control of fluid mechanics systems. DRL algorithms attempt to solve complex nonlinear, high-dimensional control and optimization problems using a direct exploration approach, that is, trial and error. Here, we start with an in-depth review of current state-of-the-art DRL algorithms, and we provide the background necessary to understand how these are able to perform learning through direct closed-loop interaction with the system to be controlled. Then, we offer an overview of the applications
of DRL to active flow control (AFC) that can be found in the literature, and we
use this review to draw concrete guidelines for the application of DRL to fluid
mechanics. Finally, we discuss both outstanding challenges and directions for further
improvement in the use of DRL methods for AFC. The chapter is supported by several
examples of DRL application codes, and offers both a theoretical and a practical guide
to new practitioners.
Figure 18.1 Taxonomy of reinforcement learning within machine learning: model-free RL treats the environment as a black box and learns only a control policy, whereas model-based RL learns a model of the environment and uses it to find a policy.
While classical approaches such as proportional-integral-derivative (PID) controllers or linear quadratic regulators (LQR) rely on a local analysis of the system and assume properties such as linearity, RL aims at performing control without
imposing requirements on the underlying system. In particular, the RL algorithms
discussed in this chapter are able to effectively map nonlinear systems through direct
exploration, and to find complex nonlinear control strategies adapted to each system.
As a consequence of its generality, applications of the RL paradigm are many,
ranging from playing a wide range of games, such as Atari, Chess, Go, or Poker
(Silver et al. 2017, 2018), to providing high levels of dexterity to robots (Hwangbo
et al. 2019), to controlling complex industrial systems in order to optimize, for
example, power consumption (Knight 2018). All these applications illustrate that
properly designed RL systems are able to perform a wide range of decision-making
and control tasks involving complex systems. This makes RL methods a promising
candidate for solving AFC problems in the context of fluid mechanics. AFC is well
known to be very difficult due to the many challenges presented by the underlying
Navier–Stokes equations, and the inherent nonlinearity and high dimensionality of
their solutions. In this chapter, AFC strategies under consideration are of a closed-
loop nature, as an observation of the state of the flow is provided to the RL agent.
This chapter provides an overview of RL applications for fluid mechanics problems.
First, we introduce basic concepts of RL and briefly describe the most common classes
of modern RL algorithms. Next, we focus on how RL can be applied to fluid mechanics
and review recent work on this topic. Finally, we discuss best practices and caveats to keep in mind when applying DRL to fluid mechanics problems.
• The action a is the agent's means of influencing the environment. For instance, the set of possible actions in Go is the set of positions where the next stone can be placed at each time step, while an action in the cylinder control case could correspond to the discretized opening of valves, that is, the mass flow rate ejected, at some actuation slots on the surface of the cylinder.
• The reward r loosely reflects the performance of the agent and enables the agent to
learn an effective policy from trial-and-error interaction with the environment.
Importantly, the impact of a (sequence of) action(s) may not become apparent
immediately, resulting in delayed and/or sparse rewards, which is why they cannot be used as an independent supervised signal per time step. For instance, in
the previous examples, an agent may only receive a final reward at the end of
a round of Go depending on its success, and mass flow rate control decisions
have longer-term impact on stability and drag of the cylinder system. Therefore,
instead of optimizing the instantaneous reward directly, RL algorithms optimize the cumulative reward over an episode trajectory $\tau$, $R(\tau) = \sum_{t=0}^{\infty} \gamma^t\, r_t$, discounted by a factor $0 \le \gamma \le 1$ (usually $\gamma \in [0.95, 0.999]$); a short numerical sketch of this quantity is given after this list. On the one hand,
the discount factor encourages shorter-term over longer-term gains while on the
other hand, it ensures that the cumulative reward converges in the case of very
long or unbounded episodes.
• The following reward-related value abstractions are used by different algorithms to
make sense of the environment and its reward structure, and to provide guidance
for the learning process:
1. the state-value V π (s) = Eτ∼π [R(τ) | s0 = s] represents the “expected value”
of being in state s, that is, the expected return obtained through a sequence of
interactions following the policy π when starting in environment state s;
2. the state-action- or Q-value2 Q π (s, a) = Eτ∼π [R(τ) | s0 = s, a0 = a] that,
similarly, captures the “expected value” of taking action a in state s, that is, the
expected value of taking action a when starting in state s and continuing to act
according to π afterward;
3. the advantage Aπ (s, a) = Q π (s, a) − V π (s) that indicates how much “better”
(in terms of expected value) it is to take action a when in state s, compared to
the expected return when randomly sampling an action from π.
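As announced above, the discounted return is a one-liner to compute; accumulating it backward avoids evaluating powers of γ explicitly:

def discounted_return(rewards, gamma=0.95):
    # R(tau) = sum_t gamma**t * r_t, accumulated backward in time
    R = 0.0
    for r in reversed(rewards):
        R = r + gamma*R
    return R

# A sparse reward arriving after two empty steps is discounted twice:
print(discounted_return([0.0, 0.0, 1.0]))   # 0.95**2 = 0.9025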
RL is not a new field, and the framework represented in Figure 18.2 can be
traced back to at least the 1980s. However, the learning algorithms used for training,
and in particular the function approximators being trained, have undergone rapid
development since the popularization of deep learning. The key idea behind DRL
is to leverage the universal approximator properties of DNNs (Hornik et al. 1989),
together with effective training methods such as stochastic gradient descent and
backpropagation of errors (see Chapter 3), in order to make RL algorithms able to
cope with complex systems. In the following sections, we will introduce the two main
classes of RL algorithms – Q-learning and policy gradient methods – together with
recent examples of DRL algorithms for each class.
2 Q originally comes from “quality”, but nowadays is just known as Q-value and not used as an
abbreviation anymore.
Figure 18.2 Generic reinforcement learning setup. Iteratively, an agent reacts to the observed
environment state with an action, which in turn influences how the environment state evolves.
Moreover, the agent receives a reward signal, which it aims to optimize over the course of
multiple training episodes, that is, rollouts of agent–environment interactions.
18.2.2 Q-Learning
Q-learning comprises a range of RL algorithms, dating back to at least the work
of Watkins (1989), though it mainly relies on the Bellman equation, which was
formulated much earlier (Bellman 1957). A method belonging to this category, Deep
Q-Network (DQN) (Mnih et al. 2015), was among the early breakthroughs of DRL.
DQN combines DNNs and Q-learning, and its success in learning to play Atari games
solely from vision and game score feedback triggered a renewal of interest in DRL.
This success is made possible by the combination of RL for learning through trial and
error, with DNNs for their ability to extract features and identify patterns in data.
Q-learning is a value-based method, meaning that the algorithm focuses on
learning a (state-action) value function, Q(s, a), which represents the overall reward
obtained by taking action a in state s. Given this Q-function, Q-learning defines the
resulting policy as π(s) = argmaxa Q(s, a). Note that this definition crucially relies
on the fact that the action space is finite, as otherwise the computation of argmax
is not straightforwardly possible. The Q-value function is optimized via temporal
difference (TD) learning in which its own estimates are used as optimization targets.
More precisely, the TD target corresponding to an experience time step $(s_t, a_t, r_t, s_{t+1})$ can be recursively estimated as $Q^{target} = r_t + \gamma \cdot \max_{a'} Q(s_{t+1}, a')$. Note that the value of the next state is $\max_{a'} Q(s_{t+1}, a')$ since the policy always picks the max action. The Q-function is then updated by incrementally shifting its predictions in the direction of the TD residual $Q^{target} - Q$, with the update step size being controlled by a learning rate $\eta > 0$ ($n$ indicates the current optimization step):
$$Q_{n+1}(s_t, a_t) = Q_n(s_t, a_t) + \eta \cdot \big(Q^{target} - Q_n(s_t, a_t)\big).$$
Figure 18.3 Illustration of the learning process of an RL agent in a grid-world with one
positive-reward goal state G and two negative-reward failure states T. Starting with no prior
knowledge, the agent learns to estimate the value of states more accurately over the course of
training solely via interaction. Initially, the negative reward of the failure states propagates to
neighboring states, so the agent learns to avoid moving there. As a consequence, it
subsequently comes across the goal state whose positive reward propagates back to the starting
point over time, yielding a first successful strategy. Ultimately, the agent learns to trade off
avoiding failure states with quicker strategies of getting to the goal state.
When the Q-function is parametrized by a DNN with weights $\theta$, and the TD update is written as a gradient step on the squared-error loss $L$ between $Q^{target}$ and the $Q_\theta$ to be optimized, we arrive at the equivalent update rule for DNN-parametrized Q-functions (note that $Q^{target}$ is fixed with respect to the differentiation):
$$\begin{aligned}
\theta_{n+1} &= \theta_n + \eta \cdot \nabla_\theta L\big(Q^{target}_{\theta_n}, Q_\theta(s_t, a_t)\big) \\
&= \theta_n + \eta \cdot \nabla_\theta\, 0.5 \cdot \Big(r_t + \gamma \cdot \max_{a'} Q_{\theta_n}(s_{t+1}, a') - Q_\theta(s_t, a_t)\Big)^2 \\
&= \theta_n + \eta \cdot \Big(r_t + \gamma \cdot \max_{a'} Q_{\theta_n}(s_{t+1}, a') - Q_\theta(s_t, a_t)\Big) \cdot \nabla_\theta Q_\theta(s_t, a_t). \end{aligned} \tag{18.2}$$
$$\nabla_\theta\, \mathbb{E}_{\tau \sim \pi_\theta}[R(\tau)] = \int R(\tau) \cdot \nabla_\theta \pi_\theta(\tau)\, \mathrm{d}\tau = \int R(\tau) \cdot \pi_\theta(\tau) \cdot \nabla_\theta \log \pi_\theta(\tau)\, \mathrm{d}\tau.$$
Using the conditional (in-)dependence between time steps, the fact that the logarithm of the product probability $\pi_\theta(\tau)$ can be turned into a sum, and the fact that the transition probabilities of the environment do not depend on the policy, a simple Monte Carlo gradient estimator $\hat{g}$ of this expectation can be derived as
$$\begin{aligned}
\hat{g} &= \frac{1}{n} \sum_{i=0}^{n} R(\tau_i) \cdot \nabla_\theta \log \pi_\theta(\tau_i) \\
&= \frac{1}{n} \sum_{i=0}^{n} R(\tau_i) \cdot \nabla_\theta \log \left[ \prod_{t=0}^{\infty} \pi_\theta\big(a_t^{(i)} \,|\, s_t^{(i)}\big) \cdot W\big(s_{t+1}^{(i)} \,|\, s_t^{(i)}, a_t^{(i)}\big) \right] \\
&= \frac{1}{n} \sum_{i=0}^{n} R(\tau_i) \cdot \nabla_\theta \left[ \sum_{t=0}^{\infty} \log \pi_\theta\big(a_t^{(i)} \,|\, s_t^{(i)}\big) + \sum_{t=0}^{\infty} \log W\big(s_{t+1}^{(i)} \,|\, s_t^{(i)}, a_t^{(i)}\big) \right] \\
&= \frac{1}{n} \sum_{i=0}^{n} R(\tau_i) \cdot \left( \sum_{t=0}^{\infty} \nabla_\theta \log \pi_\theta\big(a_t^{(i)} \,|\, s_t^{(i)}\big) \right), \end{aligned} \tag{18.4}$$
where W(st+1 |st , at ) is the probability that, starting from state st and taking action
at , the environment evolves into state st+1 . To sum it up, policy gradient methods
work by sampling a set of trajectories at each training step, and use the collected
on-policy data to shift the weights of the policy network in order to maximize its
expected return, that is, to increase the likelihood of trajectories with high return and
discourage decisions that have led to low episode returns. Note that, in contrast to Q-
learning, exploration is intrinsic to policy gradient methods since the latter is based on
a stochastic policy. Consequently, balancing exploration and exploitation is part of the
training objective, which means that only after repeated positive feedback the action
distribution narrows to the point where it is virtually deterministic. When deploying
the trained algorithm, sampling is turned off and replaced by deterministically picking
the maximum likelihood action.
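In code, the estimator (18.4) is simply a weighted average of score functions over sampled on-policy trajectories. A minimal sketch, where grad_log_pi(s, a) is a placeholder for a routine returning the score function ∇θ log πθ(a | s) of the current policy:

def reinforce_gradient(episodes, grad_log_pi):
    # episodes: list of (states, actions, rewards) tuples from rollouts
    g_hat = 0.0
    for states, actions, rewards in episodes:
        R = sum(rewards)    # episode return R(tau), undiscounted here
        score = sum(grad_log_pi(s, a) for s, a in zip(states, actions))
        g_hat = g_hat + R*score
    return g_hat/len(episodes)   # ascent direction for the policy weights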
The gradient estimator in (18.4) is also known as the REINFORCE algorithm (Williams 1992). However, it is a high-variance estimate since it scales all time steps of an episode by the same episode return R(τ). This in turn may impact the convergence speed of the gradient descent optimization. Lower-variance estimators, which instead weight time steps according to the "reward to go," that is, the future return $R_t(\tau) = R(\tau[t, t+1, \dots])$ following the current time step $t$, can be derived similarly. Variance can be further reduced by observing that a state-dependent baseline function $b(s)$ can be subtracted from the return without affecting the expectation value in (18.3), yielding
$$\hat{g} = \frac{1}{n} \sum_{i=0}^{n} \sum_{t=0}^{\infty} \big( R_t(\tau_i) - b(s_t^{(i)}) \big) \cdot \nabla_\theta \log \pi_\theta\big(a_t^{(i)} \,|\, s_t^{(i)}\big). \tag{18.5}$$
It can be shown that $b(s) \approx V^\pi(s)$ is the optimal choice with respect to variance reduction.
Algorithms that focus on learning a baseline value function in addition to the actual
policy are also called actor-critic methods. In this case, the value function is typically
a second “critic” DNN besides the first “actor” policy DNN, trained for a regression
objective similarly to Q-learning in (18.2). The purpose of the critic is to modify the
return R(τ) in the original policy gradient equation (18.3) in order to improve algo-
rithm convergence. This can be performed in several ways, for example, by subtracting
a baseline as is done through using Rt (τ) − b(st ) in (18.5), or by replacing the return
altogether by the value output of a network trained through Q-learning, V(st ) ≈ Rt (τ).
A range of extensions and modifications of this approach to policy optimization
have been explored. ACER (Wang et al. 2017) is an off-policy actor-critic algorithm
that uses experience replay similar to DQN (see Section 18.2.2). Soft actor-critic
(SAC) (Haarnoja et al. 2018) incorporates an entropy-based loss component into the
policy optimization to encourage exploration. Deterministic policy gradient (DPG)
(Silver et al. 2014) derives a deterministic variant of the policy gradient theorem.
Trust-region policy optimization (TRPO) (Schulman et al. 2015) is a related class of
algorithms that additionally constrain the update steps to a trust region, preventing
the updated policy distribution from diverging too far from the old policy, thus
limiting the confidence in each single, potentially detrimental update. Proximal policy
optimization (PPO) (Schulman et al. 2017) is inspired by such trust-region methods
and loosely enforces conservative updates by clipping the objective accordingly.
IMPALA (Espeholt et al. 2018) pushed the number to around 200, and the evolution-strategy-based system of Salimans et al. (2017) leveraged more than 1 000 cores in parallel.
Such a parallelization approach is particularly interesting for the application of DRL
to flow control because of its time-consuming simulations, as illustrated by Rabault
and Kuhnle (2019).
Model-Based Influences
The successes of DRL have mostly used model-free RL algorithms, which attempt
to solve a task solely based on interaction and reward. However, trying to understand
how the environment works can provide an additional learning signal that ultimately
also benefits the agent’s task-solving ability, and there has been work to augment
model-free approaches with model-based features. For instance, Jaderberg et al.
(2017) used auxiliary tasks such as next-frame prediction to encourage the network
to learn representations that capture basic world dynamics. More recently, Ha and
Schmidhuber (2018) presented an approach in which a full world model is learned
from (random) interaction data and subsequently used by the agent for training, instead
of interacting with the real environment.
The agents had to cope with the large state and action spaces associated with multiple 3D-simulated fishes swimming together as
a group. In these studies, DRL is able to discover strategies that provide clear energy
benefits compared to fishes swimming individually, and complex strategies are found
that allow follower fishes to effectively extract energy from the vortex shedding alley
behind leading fishes. Such results open the way to experimental understanding of
fish schoolings and associated energy gains, as well as to applications in robotics. In a
similar fashion, applications to gliding have been proposed (Novati et al. 2019). This
category of applications can be seen as an effective way of finding optimal locomotion
strategies for active swimmers or gliders, in a direct analogy to the use of DRL for
legged robots locomotion.
These examples illustrate how DRL is used as a tool to experimentally study
complex phenomena where a purely theoretical approach faces challenges, since
describing a school of swimming fish is analytically intractable, and other experi-
mental or numerical approaches do not cope well with the high dimensionality and
analytical intractability of the problem.
2 The full code is released open-source, and the results can be reproduced following the instructions
available at https://github.com/jerabaul29/Cylinder2DFlowControlDRL.
3 The code is available here: https://github.com/jerabaul29/Cylinder2DFlowControlDRL
4 The code is available here:
https://github.com/jerabaul29/Cylinder2DFlowControlDRLParallel
Figure 18.4 Illustration of active flow control of the von Karman alley by a deep reinforcement learning agent (PPO algorithm). This figure is a slightly edited reproduction of the results presented in Rabault et al. (2019), reproduced with permission. The top panel shows a representative snapshot of the 2D flow around a cylinder, in a setting similar to the benchmark of Schäfer et al. (1996). The bottom panel shows the typical flow field obtained when a trained DRL agent is used to control the vortex shedding to reduce drag. The black dots indicate the location of a series of pressure probes, which form the state observation. The red dots at the top and bottom of the cylinder indicate the position of two small synthetic jets, whose mass flow rates are decided by the actions of the DRL agent. Here, the reward function encourages minimizing drag while keeping lift fluctuations low. A drag reduction of about 8% is obtained, close to the optimal drag reduction value predicted by the symmetric baseline flow analysis of Bergmann et al. (2005).
That work also highlighted the sensitivity of the resulting strategy to the exact choice of reward function, and more generally the importance of using physical insights when defining the reward.
Having established the potential of DRL for AFC, several questions remained to
enable the application of DRL to more realistic and sophisticated systems. First, DRL
needs to be able to cope with the large action space dimensionality of real-world
systems. This is challenging since the size of the control space to explore is much
larger when the number of actuation signals is increased (typically, the naive direct
exploration cost is a power function of the action space dimensionality, as it scales
with the combinatorial cost of exploring the effect of each combination of actuations).
Fortunately, most realistic systems exhibit invariants of some form, which can be
exploited to reuse the same policy across different parts of the system and, conse-
quently, reduce exploration and learning costs. This was demonstrated empirically
through the analysis of thin fluid film instability control (Belus et al. 2019),5 and
will be discussed in more details in Section 18.4.3, where we provide guidelines for
practitioners.
Second, robustness of the final control strategy is critical for real-world applications
employing DRL. Here again, recent results have been promising. Several robust
applications of DRL, even in complex environments, have been presented by the
Figure 18.5 Demonstration of the robust closed-loop AFC strategy found by a PPO agent, as
described in the work of Tang et al. (2020) (the figure is a slightly edited version of the results
presented there, used with the permission of AIP Publishing). A single PPO agent is trained on
simulations similar to the ones presented in Figure 18.4. However, in the present case, the
Reynolds number used for each episode is chosen at random, in the discrete set {100, 200,
300, 400}. After training, the PPO agent is found both to exhibit highly effective control
(comparable with the optimal control expected from the symmetric flow analysis suggested in
Bergmann et al. (2005)), and to be robust enough to control the flow at any Reynolds number
in the range [60, 400], including values that had not been seen during training. This illustrates
both the robustness and generalization ability of AFC strategies discovered through DRL.
ML community, for instance, for playing games or for robotic control (Hwangbo
et al. 2019). In the fluid mechanics community, the first investigation of the ability of
DRL to produce robust control strategies was presented by Tang et al. (2020).6 In this
work, the DRL agent was trained on a range of Reynolds numbers rather than a single
fixed number, as was done in previous studies. This means that the agent is trained on
a range of flow conditions that, while broadly exhibiting the same patterns, still differ
in the precise behavior and responsiveness of the wake. Experiments showed that the
DRL agent is able to learn a robust global control strategy, and to perform control over
a wide range of Reynolds numbers, including ones not seen during training. The main
findings of this work are visualized in Figure 18.5. Such findings are very promising
for applications in the real world, where one would expect a similar variation of flow
properties.
The examples presented so far still work with relatively low Reynolds numbers, that
is, moderate nonlinearity and complexity, owing to the computational cost associated
with increasing the number. More critically, all the AFC examples have been in
the laminar (though chaotic) regime. Therefore, a final milestone to achieve in
order to truly qualify DRL as a possible AFC method for realistic applications is to
show successful application in turbulent flow conditions. An important first step in
this direction was recently achieved in 2D simulations of the Karman wake behind
a cylinder, at an intermediate Reynolds number of 1000 (Ren et al. 2020). The
setup is very similar to the one introduced by Rabault et al. (2019), except for the
flow regime that is chaotic due to a much higher Reynolds number. The drag in
these conditions can be reduced by up to around 30% on average, and a transient
drag reduction of up to around 50% can be observed, which clearly demonstrates
the potential of DRL control even in the case of more complex, higher Reynolds
number flows.
In addition to the application of DRL for controlling instabilities in the wake
of a cylinder or at the surface of thin water films, at least three other applications
were presented recently. The first is concerned with the control of the Kuramoto–
Sivashinsky (KS) equation (Bucci et al. 2019), which is an example of a system
exhibiting spatiotemporal chaos. The authors find that a simple Q-learning method is
able to capture the dynamics of the system and control its behavior with only limited
actuation. The second example is focused on controlling spatio-temporal structures
arising during Rayleigh–Bénard convection (Beintema et al. 2020). Again, the
authors find that DRL is able to drastically reduce the development of instabilities
in the system by controlling the distribution of heating power at the lower side of
a convection cell. The strategy obtained through DRL is compared with classical
control algorithms and found to significantly outperform them. Finally, Xu et al.
(2020) studied the control of the wake behind a cylinder using small counter-rotating cylinders.
The work of Beintema et al. (2020) clearly illustrates the limitations of traditional
“optimal control” algorithms. While these methods are very attractive both from a
theoretical and mathematical point of view, in practice they rely on strong assumptions
about the underlying system for both convergence and stability. These assumptions
– for instance, some form of Lipschitz-continuity, convexity, or the existence of
extrema/poles – are often not satisfied by real-world, nonlinear, high-dimensional
systems. In such cases, DRL, which is based on learning to control through trial
and error and leverages the approximation capabilities of deep learning, is not only
more robust but it may be the only feasible approach to control systems that exhibit
strong nonlinearities. Indeed, if a system is highly nonlinear, any method based on lin-
earization or local-order reduction techniques will always have a very limited domain
in the phase space within which it is valid. Consequently, one needs to explore the
space and exploit the gathered knowledge, which is precisely the learning process for-
malized by the DRL framework, while relying on DNNs as general-purpose function
approximators.
In this section, we present a range of technical points to keep in mind when designing
a new application of DRL in fluid mechanics. This section is a synthesis of beneficial
practices found in papers referred to in the Sections 18.2 and 18.3. While the
discussion is focused on fluid mechanics problems, the techniques discussed here are
generally applicable to many DRL applications.
The flow of interaction between the agent and the environment closely follows that
of Figure 18.2. First, the environment is prepared for a new episode of interaction with
the agent and returns the initial state (reset()). Starting from here, alternately, the
agent reacts to the observed environment state, and the environment is advanced by
one time step based on the agent’s decision (execute(...)). In all this, the state
and action space specify the communication format, that is, what observations of
the environment look like (for instance, image size or number of sensor values) and
what type of actions the environment expects in return from the agent (for instance,
discrete or continuous decisions). Finally, the interaction either terminates once a
failure or success state is reached, or is aborted after a predefined number of time
steps, particularly if there are no natural terminal states (as is often the case for flow
control problems). Subsequently, a new training episode is initiated. This is illustrated
by the following code snippet:
# Train for 200 episodes
for _ in range(200):
    # Initialize environment
    states = environment.reset()
    terminal = False
    # Alternate agent decisions and environment steps until termination
    while not terminal:
        actions = agent.act(states=states)   # Tensorforce-style interface
        states, terminal, reward = environment.execute(actions=actions)
        agent.observe(terminal=terminal, reward=reward)
While it may seem more flexible to reimplement a DRL algorithm from scratch
instead of having to interface with an existing framework, we discourage fluid
mechanics practitioners, particularly if they are new to DRL, from doing so, and highly
recommend relying on well-established open-source implementations such as Tensor-
force, Stable-Baselines or RLlib instead. The main reason is that it is a challenging
task to make sure that a DRL implementation actually works correctly, and debugging
it if it does not. This is due to a combination of factors, including their intrinsic non-
determinism on various levels (action sampling, random weight initialization, etc) and
the fact that internal representations are non-interpretable. Moreover, one frequently
finds that DRL algorithms learn to “compensate” for implementation errors, meaning
that the agent may be able to optimize the reward function reasonably well even in the
context of critical bugs since, to the learning algorithm, an erroneous implementation
is just yet another complex optimization problem to solve.
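For completeness, the agent and environment objects of the earlier snippet could be created with Tensorforce roughly as follows (interface as of Tensorforce 0.6; the level and hyperparameter values are placeholders to be adapted to the CFD environment at hand):

from tensorforce import Agent, Environment

# A Gym benchmark stands in here for a CFD environment exposing
# the reset()/execute() interface described above
environment = Environment.create(
    environment='gym', level='CartPole-v1', max_episode_timesteps=500)
agent = Agent.create(
    agent='ppo', environment=environment,
    batch_size=10, learning_rate=1e-3)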
First, one needs to make sure that the choice of the state input to the agent is
consistent with the problem to solve, and provides a sufficient amount of information
for the agent to identify potential causality relations between the state of the system,
the action taken, the evolution in time of the system, and the reward observed. At this
stage, physical understanding of the controlled system is very beneficial. In addition,
the general setup of the control must take advantage of the structure of the problem in
order to perform learning effectively. In particular, properties of invariance by transla-
tion, rotation, or any other form of symmetry should be exploited. Rabault et al. (2019)
and Belus et al. (2019) investigated several possible methodologies for exploiting sys-
tem invariants when designing DRL applications. Using a simple example application
(controlling the development of instabilities in a thin fluid film over an extensive
region in space using multiple jets), they found that failure to take into account
invariants can lead to prohibitive exploration costs which make learning impossible
when several control signals are needed. In contrast, successful exploitation of these
invariants made it possible to control an arbitrary number of actuators. The best DRL
deployment technique, shown in Figure 18.6, relies on applying the same DRL agent
to control different parts of the system, which are modeled as separate environments
but reuse the same agent DNN weights – and thus the distilled knowledge about the
system. The same approach can be adopted to enforce other 2D, 3D, or axis-symmetric
policy invariants, depending on the problem under consideration. A challenge with
this approach is to formulate a suitable reward function. The reward function of each
environment should, on the one hand, be as “local” as possible, so that each agent
instance gets direct feedback on the consequence of its actions, while on the other
hand, the overall optimization objective is usually a global problem that does not have
a simple “local” solution. One approach to address this dilemma is to formulate the
reward function as a weighted sum of two components, one limited to the local effects
of each agent instance and the other reflecting the global state of the system.
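A minimal sketch of this weighted-sum idea, in which each agent instance receives a blend of its own local cost and the global cost of the full system (the weight w is a tunable assumption):

def instance_rewards(local_costs, global_cost, w=0.5):
    # One reward per agent instance: local feedback plus global objective
    return [-(1.0 - w)*c - w*global_cost for c in local_costs]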
Second, the choice, shaping, and physical meaning of the reward function is critical
for obtaining efficient control strategies. This is well known both in RL where it
is often referred to as “reward shaping” (Ng et al. 1999), as well as in a variety of
applications within mechanics (Allaire & Schoenauer 2007). Indeed, a bad (or naive)
choice of reward function may lead to unexpected and detrimental behavior despite the
fact that an agent manages to achieve consistently high rewards. An example of this
problem is described in Rabault et al. (2019) and Rabault and Kuhnle (2019): when
using DRL to control the wake of a cylinder with the aim of reducing drag, a reward
function focusing purely on drag reduction leads to degenerate strategies with a strong
biased blowing. While this effectively reduces drag, it also creates a large lift bias. To
prevent this problem, one has to modify the reward function so as to encourage drag-reducing strategies while simultaneously discouraging the use of detrimental tricks to get there. As is the case with specifying good state and action spaces, designing
appropriate reward functions benefits greatly from engineering knowledge about the
system to control.
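In the cylinder example, such a shaped reward can be as simple as a drag-reduction term with a penalty on the lift bias; a sketch in the spirit of Rabault et al. (2019), where the penalty weight alpha is an assumption:

def shaped_reward(drag, lift, drag_baseline, alpha=0.2):
    # Reward drag reduction, penalize the degenerate biased-blowing trick
    return (drag_baseline - drag) - alpha*abs(lift)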
Another important point to consider when deploying DRL in fluid mechanics
applications is the parallelization of learning. Since most fluid mechanics simulations
Figure 18.6 Illustration of the environment-splitting method for taking advantage of physical system invariants when performing AFC with DRL. The figures are reproduced from Belus et al. (2019). Left: the general setup of the environment-splitting method. The neighborhood of each actuator constitutes an individual environment, and all the environments are controlled by clones of the same DRL agent. This way, invariance of the policy is enforced. Right: applying this method translates into a training cost that is independent of the number of jets. This is in stark contrast to methods that do not take advantage of these invariance properties. Indeed, as described in Belus et al. (2019), failing to represent invariants inside the structure of the DRL problem formulation leads to prohibitive exploration costs as the control space dimension increases, and training fails completely without such an invariant setup when as few as five jets are involved.
are expensive, they are frequently the time bottleneck in the overall training process.
Fortunately, the DRL side of the training process can be parallelized simply by running
several copies of the environment simulation (e.g., on different cores or machines) and
enabling the agent’s learning algorithm to incorporate data from parallel experience
streams. Consequently, one can leverage parallelized CFD simulations – plus, where
applicable, multiple experience streams per simulation when implementing invariant
control, as described earlier – to effectively sample a large amount of training data,
given enough computational resources.
Figure 18.7 Illustration of the importance of choosing consistent timescales across the DRL
setup. The figure is reproduced with minor editing from Rabault and Kuhnle (2019), with the
permission of AIP Publishing. Depending on the choice of action frequency fa relative to the
typical timescale of the underlying system fs , the learning curves of a PPO agent attempting to
reduce drag show substantial variation. On the one hand, if the action frequency fa is too low
( fa / fs = 1), the DRL agent cannot control the system, as it is unable to react fast enough to
affect the vortex shedding. On the other hand, if the action frequency fa is too high
( fa / fs = 100), learning is not successful either, since the agent’s random behavior at the
beginning of training simply adds high-frequency white noise to the system, which is
effectively smoothed out at the larger timescale of vortex shedding. However, between these
two extremes, there is a range of fa / fs ratios for which the agent successfully learns to control
the system.
Moreover, the choice of the interpolation function between successive actions may be important for the quality of control, as discussed in Tang et al. (2020).
Another technique that can improve learning quality and stability is transfer
learning, as illustrated by recent work. First, Ren et al. (2020) initially trained a DRL
agent in a situation of reduced complexity – in this case, a flow at moderate Reynolds
number – before moving to the full problem complexity, a flow at a higher Reynolds
number. This approach stabilized learning and yielded better final results. Second,
Tang et al. (2020) trained a DRL agent to control a range of Reynolds numbers at once,
where the Reynolds number was randomly chosen at the beginning of each episode.
Here again, the learning process improved and the final strategies were more efficient
than comparable strategies trained to control flow just at a single Reynolds number.
This indicates that training agents on a broader range of similar control problems and
environment conditions helps to find better and more robust strategies.
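Both recipes are straightforward to express in code. A sketch, where make_env and train_one_stage are hypothetical helpers standing in for the CFD setup and the DRL training loop, and the Reynolds numbers are those quoted above:

import random

def curriculum_training(agent, make_env, stages=(100, 1000)):
    # Ren et al. (2020): pretrain at a moderate Re, then at the target Re
    for Re in stages:
        train_one_stage(agent, make_env(Re))      # hypothetical helper

def randomized_training(agent, make_env, n_episodes=1000):
    # Tang et al. (2020): draw the Reynolds number anew for each episode
    for _ in range(n_episodes):
        Re = random.choice([100, 200, 300, 400])
        train_one_stage(agent, make_env(Re))      # hypothetical helper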
One promising option is a residual control approach (Johannink et al. 2019), meaning that DRL learns how best to modify (within a safe margin) the control given by the underlying system for further gains.
As a final thought, we want to emphasize the importance of reproducibility of
experimental results obtained by applying DRL, in particular with respect to open-
sourcing code and providing containerized environments. While it may be sufficient to
describe the algorithm and theoretical foundation in the case of traditional approaches,
the control strategies found by DRL cannot be described in the same way, since
they are encoded in the precise weights of the neural network. Consequently, it is
important to publish environment and training code together with the trained DRL
agent to be able to reproduce, analyze, and leverage the reported results. Moreover,
it is well known in the DRL community that reproducibility is an important problem
(Henderson et al. 2018), and that even seemingly irrelevant technical implementation
choices can play a significant role for the quality of the learned behavior (Engstrom
et al. 2020). We hope that the fluid mechanics community will follow the example
set by the ML community in this aspect, and will increasingly adopt the policy of
releasing code open source and enabling reproducibility by providing containerized
environments.
Part VI
Perspectives
19.1 Introduction
We discussed in Chapter 2 the different meanings of the word “Science,” and how they
are related to the scientific method and to the idea of causality. We also mentioned
two approaches to determining causes: the observational one, which uses correlations
between past and future events to deduce rules that describe the future behavior
of the system, and the interventional one, in which the researcher monitors the
consequences of changing the initial conditions in the (often aspirational) hope of
eventually controlling it.
Most of the discussion in Chapter 2 was observational, because experiments in our
chosen field of turbulence are expensive. In this chapter, we explore the interventional
determination of causality in physical systems, and discuss whether recent technolog-
ical developments provide us with new ways of doing it. To fix ideas, we illustrate our
argument with the study of a particular flow, two-dimensional decaying homogeneous
turbulence, about which the general feeling is that most things are understood. As
such, we will be able to test our conclusions against accepted wisdom, although this
This work was supported by the European Research Council under the Coturb grant
ERC-2014.AdG-669505.
may lead some readers to think that the discussion is uninteresting, because nothing
new is likely to be discovered. We will see that this is not quite true. Some new
aspects can still be unearthed, or at least reemphasized, and the method for doing
so is interesting because it would have been impractical a decade ago.
We mentioned in Chapter 2 that fluid mechanics has only recently accumulated
enough data to even consider a data-driven approach, but the same has not been true
in other fields. Most medieval science was data-driven. Kepler’s laws (1618) were the
empirical results of observations, with no plausible explanation until the publication
of Newton’s Principia, 70 years later (Newton 1687). Engineering and medicine have
always been at least partly empirical and data-driven, and biochemistry has probably
been the first subject to reach the data-driven level in modern times, made possible
by the fast methods of DNA synthesis that became available in the 1980s. Much of
the modern discussion on raw empiricism relates to biochemistry (Voit 2019), and
so did the first article claiming to describe a “robot scientist” (King et al. 2009). In
fluid mechanics, the availability of enough data was made possible by the numerical
simulations of the 1990s, which appeared to promise that any question that could be
posed to a computer would eventually be answered (Brenner et al. 2019, Jiménez
2020a), but this promise was tempered by the limitation that the questions first had to
be put to the computer by a researcher.
In this chapter, we discuss to what extent this last roadblock can be removed. The
new enabling technology is the increased speed and memory of computers, which can
now do in minutes what used to take days a decade ago. We will argue that this enables
a new way of asking questions, not based on a plausible hypothesis, but randomly, in
the hope that some of them might turn out to be interesting. This procedure does not
avoid the necessity of answering the questions, although this can presumably be done
quickly by the computer, nor of evaluating how “interesting” the answers are, which
computers most probably cannot yet do. This “Monte Carlo” procedure does not point
towards a future of “human-less” research, but to a new level of partnership between
humans and computers. Similar steps have been taken before: we no longer dig canals
or throw spears by hand, except as a sport, nor do we integrate ordinary differential
equations using special functions. Most of these human–machine symbioses are
usually considered beneficial, although many created their own disruptions when they
were introduced. There is no reason to believe that this time will be different, but it is
fair to question what the new advantages and difficulties will be.
Relinquishing control of the questions to be asked is not an altogether new
experience to anybody who has trained graduate students or mentored postdocs, not
to speak of managers of large industrial or academic research groups. Any such
person knows the feeling that, at some point, the research is no longer theirs, although
most of us console ourselves by arguing that we are training our peers and that, in
the end, the overall direction is set by us. Monte Carlo science lacks both of these
(probably spurious) consolations. This may be its main advantage. It is inevitable
that our students or subordinates share some of our ideas and, most probably, some
of our prejudices. It is often argued that researcher prejudice is the main roadblock
to qualitative scientific advances, and that “paradigm shifts,” always hard to come
by, are delayed because of it (Kuhn 1970). The main advantage to be gained from a
Monte Carlo questioning algorithm is probably its lack of prejudice. If we can avoid
transferring our biases to it, such an algorithm would act as an unprejudiced, although
probably not yet very smart, scientific assistant.
The rest of this chapter is structured as follows. Section 19.2 describes how the
Monte Carlo ideas just described can be applied to the particular problem of two-
dimensional turbulence, and Section 19.3 briefly discusses the physics that can be
learned from them. Section 19.4 closes with a discussion of the connections that can
be established between the scientific method and modern data analysis or artificial
intelligence, in the light of the experience of Sections 19.2 and 19.3.
Table 19.1 Initial perturbations for the two experiments discussed in the text. In both cases, the mean velocity and vorticity are zeroed over the full computational box after the perturbation is applied. An extra "pressure" step is applied to enforce continuity after modifying u in case B, and may substantially modify the intended perturbation.

A: ω ⇒ −ω
B: u ⇒ −u
Consider, for example, an atmosphere in which simulations tell us that a hurricane is likely to form after a certain time, and assume that we have no limitations on what can be done: what should we do? Target existing storms? Target the jet stream?
Thus posed, the problem is to clarify causality, and our method is interventional:
that is, we change the initial conditions and classify the results (see the discussion
in Section 2.1). A number (Nexp) of experimental flow fields ("flows" from
now on) are prepared, and each of them is perturbed to create a variety of initial
conditions (“experiments” from now on). Each experiment is run for a prescribed
time, T, of the order of ω′0 T ≈ 10 turnovers, where ω′0 = ⟨ω²⟩^{1/2} is the root-mean-square (r.m.s.) vorticity of the initial unperturbed flow, and the time-dependent average ⟨·⟩ is taken over the full computational box. As the evolution of the perturbed flow diverges from that of the unperturbed initial condition, the magnitude of the evolving perturbation is defined as the norm, ε(t), of the difference between the perturbed and unperturbed flow fields, evaluated over the entire computational box. The experiments for which this magnitude is largest at some predetermined time, Tref, are defined as most "significant." For each experiment and test time, the nkeep perturbations with the largest and smallest deviations ε(Tref) are classified as most or least significant, respectively, because, in common with many complex systems, it is empirically found that the first few most- and least-significant experiments result in fairly similar perturbation intensities (LeCun et al. 2015). Because significance is
defined as the effect on the flow at some future time, the most significant experiments
are also defined as being most “causally important” for the flow evolution. In our study,
the initial perturbation is applied by dividing each flow into a regular grid of Nc × Nc
square cells, each of which is in turn modified in a number of different ways. The two
experiments discussed here are listed in Table 19.1; several others are presented in
Jiménez (2020b). In most cases, the results are averaged over Nexp = 768 flows, with
nkeep = 5, on a grid with Nc = 10.
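A minimal sketch of this interventional loop is given below, in Python. The integrator `step_flow` and the cell operator `perturb_cell` are hypothetical placeholders for the Navier–Stokes solver and for the modifications of Table 19.1; the names and data layout are illustrative, not the code used in the study.

```python
import numpy as np

def run_experiments(flows, step_flow, perturb_cell, T_ref, Nc=10, n_keep=5):
    """Monte Carlo interventional loop: perturb every cell of every flow,
    advance to T_ref, and rank cells by how far the perturbed evolution
    diverges from the unperturbed one (l2 norm over the full box)."""
    records = []
    for omega0 in flows:                        # N_exp unperturbed flows
        base = step_flow(omega0, T_ref)         # unperturbed reference at T_ref
        eps = np.empty((Nc, Nc))
        for i in range(Nc):                     # one experiment per cell
            for j in range(Nc):
                pert = perturb_cell(omega0, i, j, Nc)  # e.g. invert omega in cell (i, j)
                eps[i, j] = np.linalg.norm(step_flow(pert, T_ref) - base)
        order = np.argsort(eps, axis=None)      # flattened cell indices, sorted by eps(T_ref)
        records.append({"most": order[-n_keep:],    # most significant cells
                        "least": order[:n_keep]})   # least significant cells
    return records
```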
Simulations are performed in a doubly periodic square box of side L = 2π, using a
standard spectral Fourier code dealiased by the 2/3 rule. Time advance is third-order
Runge–Kutta. The flow is defined by its velocity field u = (u, v) as a function of the
spatial coordinates x = (x, y), and time. The scalar vorticity is ω = ∂_x v − ∂_y u, and
the rate-of-strain tensor is s_ij = (∂_i u_j + ∂_j u_i)/2, where the subindices of the partial
derivatives range over (x, y), and those of the velocity components over (u, v). The
rate-of-strain magnitude is S² = 2 s_ij s_ij, where repeated indices imply summation.
Time and velocity are respectively scaled with ω′₀, and with q′₀ = (u′² + v′²)^{1/2}, both
measured at the unperturbed initial time, t = 0. All the cases discussed here have
Fourier resolution 256², with Re = q′₀L/ν = 2500, where ν is the kinematic viscosity.
Further details can be found in Jiménez (2018b, 2020b).
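The diagnostics above follow directly from spectral derivatives of the velocity field. The sketch below, assuming a NumPy environment and a doubly periodic square grid, evaluates ω and the rate-of-strain magnitude S; it illustrates the definitions, and is not the solver used for the simulations.

```python
import numpy as np

def vorticity_and_strain(u, v, L=2 * np.pi):
    """Scalar vorticity w = d_x v - d_y u and rate-of-strain magnitude
    S = sqrt(2 s_ij s_ij), evaluated spectrally on a periodic grid."""
    n = u.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)   # angular wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")    # x along axis 0, y along axis 1
    ddx = lambda f: np.real(np.fft.ifft2(1j * kx * np.fft.fft2(f)))
    ddy = lambda f: np.real(np.fft.ifft2(1j * ky * np.fft.fft2(f)))
    omega = ddx(v) - ddy(u)
    sxx, syy = ddx(u), ddy(v)                    # diagonal strain components
    sxy = 0.5 * (ddy(u) + ddx(v))                # off-diagonal component
    S = np.sqrt(2.0 * (sxx**2 + syy**2 + 2.0 * sxy**2))
    return omega, S
```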
Figure 19.1(a,b) shows the typical initial vorticity and kinetic energy fields, with
the 10 × 10 grid overlaid, and Figure 19.1(c) shows the corresponding enstrophy
and energy spectra. This figure also includes the transfer function for a window
corresponding to the isolation of an individual cell. Because such a window acts as a
multiplicative factor in physical space, it acts as a convolution kernel for the spectra,
smoothing details narrower in wavenumber (or wider in wavelength) than the width
of the transfer function. The cell outlined in red in Figure 19.1(a,b) is one of the most
significant ones, and the one outlined in black is one of the less significant. Both are
modified as in case A in Table 19.1 (i.e., inverting the vorticity in the cell). These two
cells have been chosen so that they have similar initial perturbation intensities, but
different intensities at the classification time, ω′₀Tref = 4.5. The evolution in physical
space of their perturbations is shown in Figure 19.1(d,e). It is clear that the more
significant perturbation rearranges the flow in its vicinity, and its effect eventually
spreads to the full field, while the less significant one stays localized and soon decays.
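The spectra of Figure 19.1(c) can be reproduced, up to binning and normalization conventions, by radially averaging the two-dimensional power spectrum. A minimal sketch, in the same assumed NumPy setting:

```python
import numpy as np

def premultiplied_spectrum(field, L=2 * np.pi, nbins=64):
    """Radially averaged premultiplied spectrum k*E(k) of a periodic 2-D
    scalar field (vorticity gives the enstrophy spectrum; summing over the
    velocity components gives the energy spectrum). Normalization
    constants are omitted."""
    n = field.shape[0]
    power = np.abs(np.fft.fft2(field) / n**2) ** 2
    k1d = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    kmag = np.hypot(kx, ky)
    bins = np.linspace(0.0, kmag.max(), nbins + 1)
    idx = np.digitize(kmag.ravel(), bins) - 1          # bin index of each mode
    Ek = np.bincount(idx, weights=power.ravel(), minlength=nbins + 1)[:nbins]
    kc = 0.5 * (bins[:-1] + bins[1:])                  # bin centers
    return kc, kc * Ek                                 # premultiplied: k * E(k)
```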
The geometric size of the perturbations can be estimated from the ratio between
their integrated square and their point-wise maximum. It is shown in Jiménez (2020b)
that all perturbations start with sizes of the order of the cell size, Lc, and either
decay or spread to the size of the computational box, 10Lc, over times of the order
of ω′₀t ≈ 5 turnovers. This can be taken as the time over which perturbations lose their
individuality, and over which causality can be usefully studied.
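One plausible reading of this size estimate, assuming a uniform grid spacing dx, is that the ratio of the integrated square of the perturbation field to its point-wise maximum has units of area, so its square root is a length:

```python
import numpy as np

def perturbation_size(p, dx):
    """Length scale of a 2-D perturbation field p: the ratio of its
    integrated square to its point-wise maximum has units of area,
    and the square root of that area is a size estimate."""
    area = (p**2).sum() * dx**2 / (p**2).max()
    return np.sqrt(area)
```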
The purpose of our study is to determine which properties of the unperturbed
initial cells best correlate with the effect of modifying them. Several factors are
important, such as the modification method (Table 19.1), the norm used to measure
the perturbation intensity, and the time Tref at which the classification is done. The
following analysis uses the kinetic energy norm ‖u‖₂ as a measure of intensity,
and a reference time ω′₀Tref = 4.5 for the classification, but similar experiments
were repeated using ‖u‖∞, ‖ω‖₂, and ‖ω‖∞, with little difference in the results.
The choice of Tref is discussed in Jiménez (2020b), and its effect on the results
in Section 19.2.2. The cell properties used as diagnostic variables are the average
cell enstrophy, ω_c² = ⟨ω²⟩_c, defined over individual cells, the average kinetic
energy, q_c² = ⟨u² + v²⟩_c, and the average magnitude of the rate-of-strain tensor, S_c².
Jiménez (2018b, 2020a) diagnosed significance using an optimal threshold computed
for each of these variables in isolation. Each modification of an individual cell was
classified as significant or not according to the flow behavior at Tref , and the resulting
labeled set was used to train the threshold in such a way that the number of
misclassified cells was minimized. Jiménez (2020b) and the present notes use a version
of the same idea with the capacity for multivariable classification: support vector
machines (Cristianini & Shawe-Taylor 2000), implemented by the fitcsvm MATLAB
routine, which determines a separating hyperplane instead of a scalar threshold
(see Figure 19.2).
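In Python, an equivalent linear classifier can be trained with scikit-learn; the sketch below uses synthetic stand-in data in place of the measured cell diagnostics, so only the workflow, not the numbers, mirrors the study.

```python
import numpy as np
from sklearn.svm import SVC

# Stand-in diagnostics: rows are perturbed cells, columns are
# (cell enstrophy, cell kinetic energy), as in Figure 19.2.
rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # 1 = significant, 0 = not

clf = SVC(kernel="linear").fit(X, y)        # separating hyperplane, cf. fitcsvm
efficiency = clf.score(X, y)                # fraction of correctly classified cells
w, b = clf.coef_[0], clf.intercept_[0]      # hyperplane w . x + b = 0
print(f"efficiency = {efficiency:.2f}, normal = {w}, offset = {b:.2f}")
```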
Figure 19.1 (a) Initial vorticity field, ω/ω′₀, used for the evolutions in (d,e). Case A in Table
19.1. The cells outlined in black (less significant) and in red (more significant) have relatively
similar initial perturbation intensities, but very different later evolutions. (b) As in (a), for the
velocity magnitude, |u|/q′₀. (c) Premultiplied spectra of the initial flow fields used in the
experiments, in terms of the wavenumber magnitude, k: enstrophy spectrum; energy spectrum;
and transfer functions of a box window corresponding to the cell size of the experiments,
vertically scaled to fit the plot. ◦, Nc = 4; △, Nc = 6; ▽, Nc = 10. (d) Evolution of the
perturbation vorticity for the less significant cell, marked in black in (a,b). From left to right:
ω′₀t = 0, 1.3, 2.6, 4.5. (e) As in (d) for the more significant cell.
The efficiency of the classifier, defined as the fraction of correctly classified events,
is collected in the tables in Figure 19.3 for various experimental perturbations and
combinations of diagnostic variables. It is clear that some linear combination of
enstrophy and kinetic energy is always able to separate the data almost perfectly,
but that S_c rarely helps. The best combination of vorticity and velocity used by the
classifier depends on the initial perturbation method, as shown in Figure 19.2. In
general, the enstrophy is the best diagnostic variable for perturbations that manipulate
the vorticity (case A in Figure 19.2(a)), while the kinetic energy is the dominant
variable for the cases that manipulate the velocity (case B in Figure 19.2(b–d)). The fifth
column in Figure 19.3 shows that some improvement can be achieved by including
contributions from both the enstrophy and the energy, but that the effect is marginal.
The classification efficiency depends on the size of the experimental cells. In
general, the efficiency degrades for larger cells, as shown in Figure 19.3 for Nc = 4–10.
The details of the degradation in case B can be followed by the increasing overlap
of the joint p.d.f.s in Figure 19.2(b–d). This dependence of the effectiveness of the
Figure 19.2 Optimum classification lines for different initial perturbations, in terms of the
kinetic energy and of the enstrophy. (a) Case A, Nc = 10. (b) Case B, Nc = 10. (c) Case B,
Nc = 6. (d) Case B, Nc = 4. In all cases, 768 flows and nkeep = 5. The contour lines contain
50%, 70%, and 90% of the joint probability density functions (p.d.f.) of the diagnostic
variables for the most significant and for the least significant cases.
             ω_c    S_c    q_c    ω_c–S_c  ω_c–q_c  S_c–q_c
Nc = 10   A  1.00   0.64   0.72   1.00     1.00     0.72
          B  0.69   0.65   1.00   0.70     1.00     1.00
Nc = 6    A  0.89   0.58   0.72   0.92     0.89     0.72
          B  0.68   0.63   0.98   0.68     0.98     0.98
Nc = 4    A  0.72   0.56   0.72   0.77     0.76     0.72
          B  0.64   0.60   0.89   0.64     0.90     0.89
Figure 19.3 Efficiency of support vector machine classifiers. Unit efficiency is perfect
classification, and 0.5 is random choice. Rows are the cases in Table 19.1, and columns are the
different combinations of diagnostic variables.
enstrophy is consistent with the spectrum in Figure 19.1(c), which shows that the
cell size when Nc = 10 preserves details of the spectrum of the order of the vortex
size. It is somewhat surprising that small cells work so well for the kinetic energy,
whose spectrum peaks at the scale of the box; this suggests that, at least at this low
Reynolds number, the causality of the kinetic energy remains concentrated at the
scale of individual vortex pairs. In fact, the effectiveness of the enstrophy and of the
energy behave differently with the cell size. While the mean effectiveness of ω_c² as a
diagnostic variable in case A decays from 1 to 0.72 as Nc = 10 → 4, that of q_c² in case
B only decays from 1 to 0.89 in the same range.
The difference among perturbation methods is best displayed by constructing for
each case a conditional “template” for the immediate neighborhood of the significant
Figure 19.4 Conditional vorticity and velocity distributions, normalized with their
unconditional r.m.s., in the neighborhood of the most significant cells. The test cell is outlined
in blue and, because of the invariances of the problem, the orientation and the vorticity sign are
immaterial. The vorticity contours are spaced by ω/ω′₀ = 0.3. Solid lines are positive, and
dashed ones are negative. (a) Case A in Table 19.1. (b) Case B. Compiled from 768 flows,
nkeep = 1.
cells of the initial unperturbed flow. Figure 19.4(a,b) includes templates for cases A
and B, built from the 3×3-cell neighborhood of the most significant cell in each
experiment. To take into account the reflection and rotational symmetries of the equations
of motion, the template is computed by averaging these flow patches after rotating and
reflecting them so that they mutually agree as much as possible. To compensate for the
effect of the magnitude of the templates, their intensity is scaled to match the global
intensity of the flow before comparing them to individual neighborhoods.
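The mutual alignment can be sketched as a greedy pass over the eight rotations and reflections of each square patch (the sign inversion allowed by the symmetries of the problem could be added in the same way). This is an illustrative reconstruction of the averaging described above, not the original code:

```python
import numpy as np

def dihedral_variants(patch):
    """The eight rotations/reflections of a square patch, i.e. the
    discrete symmetries of the equations of motion on the cell grid."""
    out, p = [], patch
    for _ in range(4):
        out.extend([p, p[::-1, :]])   # each rotation and its reflection
        p = np.rot90(p)
    return out

def aligned_template(patches):
    """Average patches after rotating/reflecting each one so that it
    correlates best with the running template (greedy alignment)."""
    template = patches[0].astype(float).copy()
    for p in patches[1:]:
        best = max(dihedral_variants(p), key=lambda q: float(np.sum(q * template)))
        template += best
    return template / len(patches)
```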
Figure 19.4(a), which represents the conditional structure in the vicinity of cells
which are most sensitive to inverting their vorticity, is an isolated vortex. This would
appear to support the classical view that two-dimensional turbulence is a system con-
trolled by the interactions among individual vortices (McWilliams 1990b, Carnevale
et al. 1991). But the template in Figure 19.4(b), which corresponds to cells that have
been perturbed by inverting their velocity and that are thus best diagnosed by the
magnitude of their kinetic energy, is a vortex dipole. This is a less expected result,
but a reasonable one, because the velocity field of a dipole contains a local jet, and
it makes sense that blocking it has a strong effect. In general, the templates for the
most significant structures in experiments that manipulate the vorticity are isolated
vortices, while the manipulation of the velocity results in dipoles (Jiménez 2020b).
The templates obtained from coarser experimental grids are similar to those for
Nc = 10, but they become progressively less well defined as Nc decreases.
Figure 19.5 (a) Joint p.d.f. of the relative approximation error, defined as the difference
between the templates and the unperturbed initial flow, ‖u − u_T‖₂/‖u‖₂, versus the measured
divergence at ω′₀Tref = 4.5. Case B in Table 19.1, plotted against the dipole template in
Figure 19.4(b). The black lines use all the cells in 384 experiments, and the red ones only use
the most significant ones in each experiment. nkeep = 5 and Nc = 10. Contours contain 50%
and 95% of the probability. Results are shown both for the training set and for an independent
test set. (b) As in (a), but plotted versus q_c². The vertical dashed line is the discrimination
threshold in Figure 19.2(b).
Figure 19.6 Sample segmented vorticity field with pair identification. Blue vortices are
counterclockwise, and red ones are clockwise. Black connectors are corotating pairs, magenta
are counterrotating dipoles, and yellow markers are unpaired vortices. Note that, because of
their irregular shape, some centers of gravity fall outside their vortex. Vorticity threshold,
|ω| = 0.9ω′₀.
choice for the particular experiment in the figure. Both quantities are also correlated,
especially for badly fitting neighborhoods, whose kinetic energy is also low. As with
the perturbation magnitude, the significant cells are just the low-approximation-error
end of the joint p.d.f.
to move for long distances before they are destroyed by collisions with other objects,
and they stir the flow in the process.
To gain a better sense of the prevalence of dipoles in two-dimensional turbulence,
Figure 19.6 shows a segmentation of a sample flow field into individual positive and
negative vortices. Vortices are defined as connected regions in which |ω| ≥ Hω′₀, with
H = 0.9, chosen to maximize the number of individual vortices (Moisy & Jiménez
2004). The vortices in Figure 19.6 have been grouped, whenever possible, into co- and
counterrotating pairs. Two vortices are considered a potential pair if their areas differ
by less than a factor of m2 , which is an adjustable parameter. The figure uses m = 2,
but statistics compiled with m = 1.5 and m = 3 show no substantial differences.
Vortices are paired to the closest unpaired neighbor within their area class, and no
vortex can have more than one partner. Some vortices find no suitable partner, and are
left unpaired. Statistics compiled over approximately 8,500 independent flow fields
show that, out of approximately 5 × 10⁵ vortices, 48% are paired to form dipoles, 24%
are in corotating pairs, and 28% are isolated. Most vortices in the flow are thus in the
form of pairs, mostly dipoles. The difference between corotating and counterrotating
pairs is interesting, but the reason is probably that corotating pairs tend to merge into
single cores (Meunier et al. 2005), while modons are long-lasting (Flierl et al. 1980,
McWilliams 1980).
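A sketch of this segmentation-and-pairing procedure, assuming SciPy's ndimage for the connected-component labeling (the distances below ignore the periodic wrapping of the box, and the greedy visiting order by decreasing area is an assumption):

```python
import numpy as np
from scipy import ndimage

def segment_and_pair(omega, H=0.9, m=2.0):
    """Segment vortices as connected regions with |omega| >= H * omega_rms,
    then greedily pair each vortex with its closest unpaired neighbour of
    comparable area (areas within a factor m**2), as in Figure 19.6."""
    omega_rms = np.sqrt(np.mean(omega**2))
    labels, n = ndimage.label(np.abs(omega) >= H * omega_rms)
    index = np.arange(1, n + 1)
    areas = ndimage.sum(np.ones_like(omega), labels, index)
    centers = np.array(ndimage.center_of_mass(np.abs(omega), labels, index))
    signs = np.sign(ndimage.sum(omega, labels, index))
    paired, pairs = set(), []
    for i in np.argsort(-areas):                 # visit by decreasing area
        if i in paired:
            continue
        cand = [j for j in range(n) if j != i and j not in paired and
                max(areas[i], areas[j]) / min(areas[i], areas[j]) < m**2]
        if not cand:
            continue                             # vortex stays unpaired
        j = min(cand, key=lambda c: np.hypot(*(centers[i] - centers[c])))
        paired |= {i, int(j)}
        pairs.append((int(i), int(j),
                      "corotating" if signs[i] == signs[j] else "dipole"))
    return pairs
```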
This chapter, together with Chapter 2, has explored two related but independent ideas
in the context of scientific discovery in turbulence research.
The first idea is the well-known distinction between correlation and causation. We
have noted that, because of the difficulty of doing experiments in turbulence
(computational or otherwise), research has tended to center on correlations. Chapter 2 gave
examples of what can be learned from this approach, and noted that the wealth of data
generated by direct simulations in the 1990s probably contributed to that trend by
creating the illusion that “we know everything,” and that all questions can be answered.
There are two problems with this illusion. The first one is that, although large data
sets contain many answers, the probability of finding them without active
experimentation is very small. Consider how difficult it would be to study large asteroid
strikes on Earth by observing the configuration of the solar system before spontaneous
strikes happen. These are rare, and it is unclear whether the precursor to one of them
applies to all the others. It is much more efficient to perturb the system (hopefully
computationally) and observe the result. The same is true of many intermittent but
significant processes in turbulence, especially if we are interested in controlling the
flow by introducing local out-of-equilibrium perturbations. Studying them requires
experiments that separate present causes from later effects. We noted in Chapter 2 that
correlation is the tool of prediction, while causation is the tool of control.
The second problem is how to choose which experiments to perform. The classical
search for scientific causation is hypothesis-driven. The entry point to the optimization
cycle of scientific research is that the researcher thinks of a model, and designs
experiments to test it. While this “hypothesis-driven search” has obvious theoretical
appeal, it risks limiting creativity. A model has to be conceived before it is tested, and
new ideas depend on the imagination of the researchers. However, we have noted at
the beginning of this chapter that faster experimental and computational methods open
the possibility of what we have called Monte Carlo searches (perhaps describable
as “search-driven hypotheses”) in which experiments are performed randomly, and
evaluated a posteriori in the hope that some of them may be interesting. This is more
expensive than the classical procedure, but may be our best hope of avoiding ingrained
prejudice.
We have illustrated these ideas by the application to two-dimensional turbulence
in Sections 19.2 and 19.3, but the subject of these lectures is not turbulence itself
but the method, and we now briefly discuss how far the hopes expressed earlier have
been realized in the exercise. The first consideration is cost, which is always a pacing
item for turbulence research. The whole program in this chapter took about a month
of computer time in a medium-sized department cluster, and was programmed by the
author during his spare time over a year. Considerably more time was spent discussing
with colleagues what should be done than in actually doing it. Since the effort was
conceived as a proof of principle, the problem was chosen small on purpose, but more
interesting problems are within reach. The basic simulation that had to be repeated
many times for the experiments in Section 19.2 runs in about 10 core-seconds, but
even a 256³ simulation of three-dimensional isotropic turbulence can be run in two
minutes on a modern GPU (Jiménez 2020a). A program to address causality in the
three-dimensional turbulence cascade, about which considerably less is known than in
two dimensions, could thus be performed on a modest GPU cluster in a few months.
Again, the main roadblock would be to decide what to do.
The second question is whether something of physical interest has been learned.
As mentioned at the beginning of Section 19.2, little was expected from a problem
that is usually considered to be essentially understood, but the actual results were
interesting. This chapter broadly follows the discussion in Jiménez (2020b), but two
early versions (Jiménez 2018b, Jiménez 2020a) missed the dipoles completely, and
concluded that the experiments confirmed the classical vortex-gas model of two-
dimensional turbulence. The dipole template in Figure 19.4(b) was a mildly surprising
result of further postprocessing and, if we admit that surprise is one of the defining
ingredients of discovery (Schaffer 1994), it was a minor discovery. Unfortunately, we
saw in Section 19.3 that dipoles were not completely unknown components of two-
dimensional flows, and that finding them was rather an instance of recalling something
that had been forgotten than of discovering something truly new. It is clear that the
present chapter needs a lot of extra work before it becomes an article of independent
interest in fluid mechanics, rather than in methodology, but the fact that something
unexpected by the author was found without “prompting” is encouraging for the future
of the Monte Carlo method in problems in which something is genuinely unknown.
The third question is what we have learned about the process of data exploitation.
The scientific method can be seen as an optimization loop to search for the “best”
Figure 19.7 The scientific method. Reproduced from Chapter 2 for ease of reference.
hypothesis, and it is natural to ask which parts of it can be automated. Our case
study in Sections 19.2 and 19.3 was conceived as an experimental test of how far
the automation process could be pushed. Several things can be concluded. The first
one is that step S1 in Figure 19.7 (observations) can be largely outsourced to the
computer, including the reference to “asking questions.” The experiments in Table
19.1 were decided (on purpose) with little thought about their significance, although
it is difficult to say how much they were influenced by the previous experience of the
author. The same can be said about the parts of step S3 that refer to testing predictions
against observations. Simulations of two-dimensional turbulence are by now trivial,
and the classification of the results was outsourced to library programs (Figures
19.2 and 19.3). An interesting, although not completely unexpected, outcome of the
experience has been the importance of verification and validation, and a rereading of
Section 19.2.2 and, up to a point, of Section 19.3, shows pitfalls that were avoided this
way. Such problems are expected in any project involving data analysis, but they are
especially dangerous in cases like the present one, in which the goal is to probe the
unknown.
The last point to consider is the model generation and evaluation step (S2) in
Figure 19.7, which is the core of the discovery process. Even here, something was
automated. The templates in Figure 19.4 are flow models, and they were obtained
automatically. Note that it is at this point that the transition from subsymbolic to
symbolic AI took place in this example. Figures 19.2 and 19.3 are primarily useful
for computer classification, but the templates in Figure 19.4 are intended for humans.
On the other hand, the interpretation in Section 19.3 of the templates was entirely
manual, and it is difficult to see at the moment how it could be outsourced to a
computer. This is not only because of the need for a level of intelligence somewhat
above present computer software, but because what we are trying to do is not well
defined. If the goal of a scientific project is to find a “good” model, we need to define
precisely what we consider a good hypothesis. We mentioned in Chapter 2 that one
of the goals of the scientific method was to produce “beautiful” theories, but we do
not know very well how to define scientific beauty. The ultimate models for two-
dimensional turbulence are the Navier–Stokes equations, and the ultimate causes of
any observation are some notional initial conditions. We have restricted ourselves
here to a time horizon shorter than the memory loss due to chaos, because of some
generalized interest in flow control. This is a choice that influences the results and
makes them less general, but it defines the metric that allows us to define templates.
Human supervision will probably still be required for some time to refine
hypotheses, but Monte Carlo science can contribute something even now. Consider again
the interpretation of the scientific method as an optimization loop to maximize
“understanding.” As such, it could conceivably be implemented as an automatic
optimization procedure (e.g., a neural network). However, most classical optimizers,
including humans, assume local convexity of the cost function, which is at the root of
our misgivings about researcher originality and prejudice. Monte Carlo is a different
way of looking for a maximum, which, in principle, bypasses the convexity constraint
and escapes local maxima by injecting noise in the process (Deb & Gupta 2006). The
best use of Monte Carlo science is probably as a partially randomized search and
classification step, followed by repeated human fine-tuning. The encouraging news in
fluid mechanics is that doing this is now becoming possible.
References
Abdi, H. (2003), ‘Factor rotations in factor analyses’, Encyclopedia for Research Methods for
the Social Sciences, Sage, Thousand Oaks, CA, pp. 792–795.
Abraham, R. and Marsden, J. E. (1978), Foundations of Mechanics, Benjamin/Cummings
Publishing Company, Reading, MA.
Abraham, R., Marsden, J. E., and Ratiu, T. (1988), Manifolds, Tensor Analysis, and Applica-
tions, Vol. 75 of Applied Mathematical Sciences, Springer-Verlag, New York.
Abu-Mostafa, Y. S., Magdon-Ismail, M., and Lin, H.-T. (2012), Learning from Data: A Short
Course, https://amlbook.com/.
Abu-Zurayk, M., Ilic, C., Schuster, A., and Liepelt, R. (2017), Effect of gradient approximations
on aero-structural gradient-based wing optimization, in ‘EUROGEN 2017’, September 13–
15, 2017, Madrid, Spain.
Adrian, R. J. (1975), On the role of conditional averages in turbulence theory, in ‘Proceedings of
the 4th Biennial Symposium on Turbulence in Liquids, University of Missouri-Rolla, 1975’,
Science Press, Princeton, NJ.
Adrian, R. J. (1991), ‘Particle-imaging techniques for experimental fluid mechanics’, Annual
Review of Fluid Mechanics 23(1), 261–304.
Adrian, R. J. (2007), ‘Hairpin vortex organization in wall turbulence’, Physics of Fluids
19, 041301.
Adrian, R. J., Jones, B., Chung, M., Hassan, Y., Nithianandan, C., and Tung, A.-C. (1989),
‘Approximation of turbulent conditional averages by stochastic estimation’, Physics of
Fluids A: Fluid Dynamics 1(6), 992–998.
Ahuja, S. and Rowley, C. W. (2010), ‘Feedback control of unstable steady states of flow past a
flat plate using reduced-order estimators’, Journal of Fluid Mechanics 645, 447–478.
Albers, M., Meysonnat, P. S., Fernex, D., Semaan, R., Noack, B. R., and Schröder, W. (2020),
‘Drag reduction and energy saving by spanwise traveling transversal surface waves for flat
plate flow’, Flow, Turbulence and Combustion 105, 125–157.
Aleksic, K., Luchtenburg, D. M., King, R., Noack, B. R., and Pfeiffer, J. (2010), Robust
nonlinear control versus linear model predictive control of a bluff body wake, in ‘5th AIAA
Flow Control Conference’, June 28–July 1, 2010, Chicago, AIAA-Paper 2010-4833, pp.
1–18.
Allaire, G. and Schoenauer, M. (2007), Conception optimale de structures, Vol. 58 of
Mathématiques et Applications, Springer, Berlin.
Alsalman, M., Colvert, B., and Kanso, E. (2018), ‘Training bioinspired sensors to classify
flows’, Bioinspiration & Biomimetics 14(1), 016009.
Amsallem, D., Zahr, M. J., and Farhat, C. (2012), ‘Nonlinear model order reduction based on
local reduced-order bases’, International Journal for Numerical Methods in Engineering
92(10), 891–916.
Antoranz, A., Gonzalo, A., Flores, O., and Garcia-Villalba, M. (2015), ‘Numerical simulation of
heat transfer in a pipe with non-homogeneous thermal boundary conditions’, International
Journal of Heat and Fluid Flow 55, 45–51.
Antoranz, A., Ianiro, A., Flores, O., and García-Villalba, M. (2018), ‘Extended proper
orthogonal decomposition of non-homogeneous thermal fields in a turbulent pipe flow’,
International Journal of Heat and Mass Transfer 118, 1264–1275.
Antoulas, A. C. (2005), Approximation of Large-Scale Dynamical Systems, Society for Indus-
trial and Applied Mathematics, Philadelphia.
Arbabi, H. and Mezić, I. (2017), ‘Ergodic theory, dynamic mode decomposition, and computa-
tion of spectral properties of the Koopman operator’, SIAM Journal on Applied Dynamical
Systems 16(4), 2096–2126.
Armstrong, E. and Sutherland, J. C. (2021), ‘A technique for characterizing feature size and
quality of manifolds’, Combustion Theory and Modelling, 25, 646–668.
Arthur, D. and Vassilvitskii, S. (2007), k-means++: The advantages of careful seeding, in
‘Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms’, Society
for Industrial and Applied Mathematics, Philadelphia, pp. 1027–1035.
Ashurst, W. T. and Meiburg, E. (1988), ‘Three-dimensional shear layers via vortex dynamics’,
Journal of Fluid Mechanics 189, 87–116.
Aström, K. J. and Murray, R. M. (2010), Feedback Systems: An Introduction for Scientists and
Engineers, Princeton University Press, Princeton, NJ.
Aubry, N., Holmes, P., Lumley, J. L., and Stone, E. (1988), ‘The dynamics of coherent structures
in the wall region of a turbulent boundary layer’, Journal of Fluid Mechanics 192(1), 115–
173.
Baars, W. J. and Tinney, C. E. (2014), ‘Proper orthogonal decomposition-based spectral higher-
order stochastic estimation’, Physics of Fluids 26(5), 055112.
Babaee, H. and Sapsis, T. P. (2016), ‘A variational principle for the description of time-
dependent modes associated with transient instabilities’, Philosophical Transactions of the
Royal Society of London 472, 20150779.
Bagheri, S., Brandt, L., and Henningson, D. (2009), ‘Input–output analysis, model reduction
and control of the flat-plate boundary layer’, Journal of Fluid Mechanics 620, 263–298.
Bagheri, S., Hoepffner, J., Schmid, P. J., and Henningson, D. S. (2009), ‘Input–output analysis
and control design applied to a linear model of spatially developing flows’, Applied
Mechanics Reviews 62(2), 020803-1-020803-27.
Balajewicz, M., Dowell, E. H., and Noack, B. R. (2013), ‘Low-dimensional modelling of high-
Reynolds-number shear flows incorporating constraints from the Navier-Stokes equation’,
Journal of Fluid Mechanics 729, 285–308.
Baldi, P. and Hornik, K. (1989), ‘Neural networks and principal component analysis: Learning
from examples without local minima’, Neural Networks 2(1), 53–58.
Banerjee, I. and Ierapetritou, M. (2006), ‘An adaptive reduction scheme to model reactive flow’,
Combustion and Flame 144(3), 619–633.
Baraniuk, R. G. (2007), ‘Compressive sensing’, IEEE Signal Processing Magazine 24(4), 118–
120.
Barbagallo, A., Sipp, D., and Schmid, P. J. (2009), ‘Closed-loop control of an open cavity flow
using reduced-order models’, Journal of Fluid Mechanics 641, 1–50.
Barkley, D. and Henderson, R. (1996), ‘Three-dimensional Floquet stability analysis of the
wake of a circular cylinder’, Journal of Fluid Mechanics 322, 215–241.
Barlow, R. S. and Frank, J. H. (1998), ‘Effects of turbulence on species mass fractions in
methane/air jet flames’, Symposium (International) on Combustion 27, 1087–1095.
Batchelor, G. K. (1969), ‘Computation of the energy spectrum in homogeneous two dimensional
turbulence’, Physics of Fluids 12 (Suppl. II), 233–239.
Battaglia, P. W., Hamrick, J. B., Bapst, V., Sanchez-Gonzalez, A., Zambaldi, V., Malinowski,
M., Tacchetti, A., Raposo, D., Santoro, A., Faulkner, R., Gulcehre, C., Song, F., Ballard,
A., Gilmer, J., Dahl, G., Vaswani, A., Allen, K., Nash, C., Langston, V., Dyer, C., Heess, N.,
Wierstra, D., Kohli, P., Botvinick, M., Vinyals, O., Li, Y., and Pascanu, R. (2018), ‘Relational
inductive biases, deep learning, and graph networks’, arXiv preprint arXiv:1806.01261.
Beerends, R. J., ter Morsche, H. G., van den Berg, J. C., and van de Vrie, E. M. (2003), Fourier
and Laplace Transforms, Cambridge University Press, New York.
Beetham, S. and Capecelatro, J. (2020), ‘Formulating turbulence closures using sparse regres-
sion with embedded form invariance’, arXiv preprint arXiv:2003.12884.
Beintema, G., Corbetta, A., Biferale, L., and Toschi, F. (2020), ‘Controlling Rayleigh–Bénard
convection via reinforcement learning’, Journal of Turbulence 21(9–10), 585–605.
Bekemeyer, P., Ripepi, M., Heinrich, R., and Görtz, S. (2019), ‘Nonlinear unsteady reduced-
order modeling for gust-load predictions’, AIAA Journal 57(5), 1839–1850.
Bekemeyer, P., Wunderlich, T., Görtz, S., and Dähne, D. (2019), Effect of gradient approx-
imations on aero-structural gradient-based wing optimization, in ‘International Forum on
Aeroelasticity and Structural Dynamics (IFASD-2019)’, June 9–13, 2019, Savannah, GA.
Bekemeyer, P., Bertram, A., Chaves, D. A. H., Ribeiro, M. D., Garbo, A., Kiener, A., Sabater,
C., Stradtner, M., Wassing, S., Widhalm, M., Goertz, S., Jaeckel, F., Hoppe, R., and Hoffmann,
N. (2022), ‘Data-driven aerodynamic modeling using the DLR SMARTy toolbox’.
Bellemans, A., Aversano, G., Coussement, A., and Parente, A. (2018), ‘Feature extraction and
reduced-order modelling of nitrogen plasma models using principal component analysis’,
Computers & Chemical Engineering 115, 504–514.
Bellemare, M. G., Dabney, W., and Munos, R. (2017), A distributional perspective on
reinforcement learning, in ‘Proceedings of the 34th International Conference on Machine
Learning – Volume 70’, ICML 17, JMLR.org, pp. 449–458.
Bellman, R. (1957), ‘A Markovian decision process’, Journal of Mathematics and Mechanics
6(5), 679–684.
Belson, B. A., Semeraro, O., Rowley, C. W., and Henningson, D. S. (2013), ‘Feedback control
of instabilities in the two-dimensional Blasius boundary layer: The role of sensors and
actuators’, Physics of Fluids (1994–present) 25(5), 054106.
Belson, B. A., Tu, J. H., and Rowley, C. W. (2014), ‘Algorithm 945: Modred – A parallelized
model reduction library’, ACM Transactions on Mathematical Software (TOMS) 40(4), 30.
Belus, V., Rabault, J., Viquerat, J., Che, Z., Hachem, E., and Reglade, U. (2019), ‘Exploiting
locality and translational invariance to design effective deep reinforcement learning control
of the 1-dimensional unstable falling liquid film’, AIP Advances 9(12), 125014.
Bence, J. R. (1995), ‘Analysis of short time series: Correcting for autocorrelation’, Ecology
76(2), 628–639.
Benner, P., Gugercin, S., and Willcox, K. (2015), ‘A survey of projection-based model reduction
methods for parametric dynamical systems’, SIAM Review 57(4), 483–531.
Bergmann, M., Cordier, L., and Brancher, J.-P. (2005), ‘Optimal rotary control of the cylinder
wake using proper orthogonal decomposition reduced-order model’, Physics of Fluids
(1994–present) 17(9), 097101.
Berkooz, G., Holmes, P., and Lumley, J. L. (1993), ‘The proper orthogonal decomposition in
the analysis of turbulent flows’, Annual Review of Fluid Mechanics 23, 539–575.
Bernal, L. P. (1981), The coherent structure of turbulent mixing layers, PhD thesis, California
Institute of Technology, Pasadena.
Berry, M. W., Browne, M., Langville, A. N., Pauca, V. P., and Plemmons, R. J. (2007), ‘Algo-
rithms and applications for approximate nonnegative matrix factorization’, Computational
Statistics & Data Analysis 52(1), 155–173.
Bertolotti, F., Herbert, T., and Spalart, P. (1992), ‘Linear and nonlinear stability of the Blasius
boundary layer’, Journal of Fluid Mechanics 242, 441–474.
Betchov, R. (1956), ‘An inequality concerning the production of vorticity in isotropic turbu-
lence’, Journal of Fluid Mechanics 1, 497–504.
Bieker, K., Peitz, S., Brunton, S. L., Kutz, J. N., and Dellnitz, M. (2019), ‘Deep model
predictive control with online learning for complex physical systems’, arXiv preprint
arXiv:1905.10094.
Biglari, A. and Sutherland, J. C. (2012), ‘A filter-independent model identification technique
for turbulent combustion modeling’, Combustion and Flame 159(5), 1960–1970.
Biglari, A. and Sutherland, J. C. (2015), ‘An a-posteriori evaluation of principal component
analysis-based models for turbulent combustion simulations’, Combustion and Flame
162(10), 4025–4035.
Bilger, R., Starner, S., and Kee, R. (1990), ‘On reduced mechanisms for methane-air combustion
in nonpremixed flames’, Combustion and Flame 80, 135–149.
Billings, S. A. (2013), Nonlinear System Identification: NARMAX Methods in the Time,
Frequency, and Spatio-Temporal Domains, John Wiley & Sons, Hoboken, NJ.
Bishop, C. M. (2016), Pattern Recognition and Machine Learning, Springer, New York.
Bistrian, D. and Navon, I. (2016), ‘Randomized dynamic mode decomposition for non-intrusive
reduced order modelling’, International Journal for Numerical Methods in Engineering 112,
3–25.
Black, F., Schulze, P., and Unger, B. (2019), ‘Nonlinear Galerkin model reduction for systems
with multiple transport velocities’, arXiv preprint arXiv:1912.11138.
Boffetta, G. and Ecke, R. E. (2012), ‘Two-dimensional turbulence’, Annual Review of Fluid
Mechanics 44, 427–451.
Bohn, E., Coates, E. M., Moe, S., and Johansen, T. A. (2019), Deep reinforcement learning
attitude control of fixed-wing UAVs using proximal policy optimization, in ‘2019 Interna-
tional Conference on Unmanned Aircraft Systems (ICUAS)’, June 11–14, 2019, Atlanta,
GA, IEEE, pp. 523–533.
Bongard, J. and Lipson, H. (2007), ‘Automated reverse engineering of nonlinear dynamical
systems’, Proceedings of the National Academy of Sciences 104(24), 9943–9948.
Borée, J. (2003), ‘Extended proper orthogonal decomposition: A tool to analyse correlated
events in turbulent flows’, Experiments in Fluids 35(2), 188–192.
Bourgeois, J. A., Martinuzzi, R. J., and Noack, B. R. (2013), ‘Generalised phase average with
applications to sensor-based flow estimation of the wall-mounted square cylinder wake’,
Journal of Fluid Mechanics 736, 316–350.
Box, G. E., Jenkins, G. M., Reinsel, G. C., and Ljung, G. M. (2015), Time Series Analysis:
Forecasting and Control, John Wiley & Sons, Hoboken, NJ.
Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011), ‘Distributed optimization
and statistical learning via the alternating direction method of multipliers’, Foundations and
Trends in Machine Learning 3, 1–122.
Brackston, R., De La Cruz, J. G., Wynn, A., Rigas, G., and Morrison, J. (2016), ‘Stochastic
modelling and feedback control of bistability in a turbulent bluff body wake’, Journal of
Fluid Mechanics 802, 726–749.
Breiman, L. (2001), ‘Random forests’, Machine Learning 45(1), 5–32.
Brenner, M., Eldredge, J., and Freund, J. (2019a), ‘Perspective on machine learning for
advancing fluid mechanics’, Physical Review Fluids 4(10), 100501.
Bright, I., Lin, G., and Kutz, J. N. (2013), ‘Compressive sensing and machine learning strategies
for characterizing the flow around a cylinder with limited pressure measurements’, Physics
of Fluids 25(127102), 1–15.
Brockwell, P. J. and Davis, R. A. (2010), Introduction to Time Series and Forecasting (Springer
Texts in Statistics), Springer, New York.
Brooks, R. (1990), ‘Elephants don’t play chess’, Robotics and Autonomous Systems 6, 3–15.
Brown, G. L. and Roshko, A. (1974), ‘On the density effects and large structure in turbulent
mixing layers’, Journal of Fluid Mechanics 64, 775–816.
Brunton, B. W., Brunton, S. L., Proctor, J. L., and Kutz, J. N. (2016a), ‘Sparse sensor
placement optimization for classification’, SIAM Journal on Applied Mathematics 76(5),
2099–2122.
Brunton, S., Brunton, B., Proctor, J., Kaiser, E., and Kutz, J. (2017), ‘Chaos as an intermittently
forced linear system’, Nature Communication 8(19), 1–9.
Brunton, S. L., Brunton, B. W., Proctor, J. L., and Kutz, J. N. (2016b), ‘Koopman invariant
subspaces and finite linear representations of nonlinear dynamical systems for control’, PLoS
ONE 11(2), e0150171.
Brunton, S. L., Dawson, S. T. M., and Rowley, C. W. (2014), ‘State-space model identification
and feedback control of unsteady aerodynamic forces’, Journal of Fluids and Structures
50, 253–270.
Brunton, S. L., Hemati, M. S., and Taira, K. (2020), ‘Special issue on machine learning and
data-driven methods in fluid dynamics’, Theoretical and Computational Fluid Dynamics 34,
333–337.
Brunton, S. L. and Kutz, J. N. (2019), Data-Driven Science and Engineering: Machine
Learning, Dynamical Systems, and Control, Cambridge University Press, Cambridge.
Brunton, S. L. and Noack, B. R. (2015), ‘Closed-loop turbulence control: Progress and
challenges’, Applied Mechanics Reviews 67(5), 050801:01–48.
Brunton, S. L., Noack, B. R., and Koumoutsakos, P. (2020), ‘Machine learning for fluid
mechanics’, Annual Review of Fluid Mechanics 52, 477–508.
Brunton, S. L., Proctor, J. L., and Kutz, J. N. (2016a), ‘Discovering governing equations from
data by sparse identification of nonlinear dynamical systems’, Proceedings of the National
Academy of Sciences 113(15), 3932–3937.
Brunton, S. L., Proctor, J. L., and Kutz, J. N. (2016b), ‘Sparse identification of nonlinear
dynamics with control (SINDYc)’, IFAC NOLCOS 49(18), 710–715.
Bucci, M. A., Semeraro, O., Allauzen, A., Wisniewski, G., Cordier, L., and Mathelin, L. (2019),
‘Control of chaotic systems by deep reinforcement learning’, Proceedings of the Royal
Society A: Mathematical, Physical and Engineering Sciences 475(2231), 20190351.
Budišić, M. and Mezić, I. (2009), An approximate parametrization of the ergodic partition using
time averaged observables, in ‘Proceedings of the 48th IEEE Conference on Decision and
Control, 2009 held jointly with the 2009 28th Chinese Control Conference (CDC/CCC 2009)’,
December 15–18, 2009, Shanghai, China, IEEE, Piscataway, NJ, pp. 3162–3168.
Budišić, M. and Mezić, I. (2012), ‘Geometry of the ergodic quotient reveals coherent structures
in flows’, Physica D: Nonlinear Phenomena 241(15), 1255–1269.
Budišić, M., Mohr, R. and Mezić, I. (2012), ‘Applied Koopmanism’, Chaos: An Interdisci-
plinary Journal of Nonlinear Science 22(4), 047510.
Burda, Y., Edwards, H., Storkey, A. J., and Klimov, O. (2019), Exploration by random network
distillation, in ‘7th International Conference on Learning Representations, ICLR 2019, New
Orleans, LA, USA, May 6–9, 2019’, OpenReview.net.
Burkardt, J., Gunzburger, M., and Lee, H.-C. (2006), ‘POD and CVT-based reduced-order mod-
eling of Navier–Stokes flows’, Computer Methods in Applied Mechanics and Engineering
196(1–3), 337–355.
Burkov, A. (2019), The Hundred-Page Machine Learning Book, Andriy Burkov, Quebec City.
Busse, F. (1991), ‘Numerical analysis of secondary and tertiary states of fluid flow and their
stability properties’, Applied Scientific Research 48(3), 341–351.
Butler, K. and Farrell, B. (1992), ‘Three-dimensional optimal perturbations in viscous shear
flow’, Physics of Fluids 4, 1637–1650.
Butler, K. M. and Farrell, B. F. (1993), ‘Optimal perturbations and streak spacing in wall-
bounded shear flow’, Physics of Fluids A 5, 774–777.
Callaham, J., Maeda, K., and Brunton, S. L. (2019), ‘Robust reconstruction of flow fields from
limited measurements’, Physical Review Fluids 4, 103907.
Cammilleri, A., Gueniat, F., Carlier, J., Pastur, L., Memin, E., Lusseyran, F., and Artana,
G. (2013), ‘POD-spectral decomposition for fluid flow analysis and model reduction’,
Theoretical and Computational Fluid Dynamics 27(6), 787–815.
Candès, E. J. (2006), Compressive sensing, in ‘Proceedings of the International Congress of
Mathematics, August 22–30, 2006’, European Mathematical Society, Zurich.
Candès, E. J., Romberg, J., and Tao, T. (2006a), ‘Robust uncertainty principles: Exact signal
reconstruction from highly incomplete frequency information’, IEEE Transactions on
Information Theory 52(2), 489–509.
Candès, E. J., Romberg, J., and Tao, T. (2006b), ‘Stable signal recovery from incomplete and
inaccurate measurements’, Communications in Pure and Applied Mathematics 8, 1207–
1223.
Candès, E. J. and Tao, T. (2006), ‘Near optimal signal recovery from random projections:
Universal encoding strategies?’, IEEE Transactions on Information Theory 52(12), 5406–
5425.
Cardesa, J. I., Vela-Martín, A., and Jiménez, J. (2017), ‘The turbulent cascade in five
dimensions’, Science 357, 782–784.
Carlberg, K., Barone, M., and Antil, H. (2017), ‘Galerkin v. least-squares Petrov–Galerkin
projection in nonlinear model reduction’, Journal of Computational Physics 330, 693–734.
Carlberg, K., Bou-Mosleh, C., and Farhat, C. (2011), ‘Efficient non-linear model reduction
via a least-squares Petrov–Galerkin projection and compressive tensor approximations’,
International Journal for Numerical Methods in Engineering 86(2), 155–181.
Carlberg, K., Tuminaro, R., and Boggs, P. (2015), ‘Preserving Lagrangian structure in nonlinear
model reduction with application to structural dynamics’, SIAM Journal on Scientific
Computing 37(2), B153–B184.
Carleman, T. (1932), ‘Application de la théorie des équations intégrales linéaires aux systémes
d’équations différentielles non linéaires’, Acta Mathematica 59(1), 63–87.
Carnevale, G. F., McWilliams, J. C., Pomeau, Y., Weiss, J. B., and Young, W. R. (1991),
‘Evolution of vortex statistics in two-dimensional turbulence’, Physical Review Letters
66, 2735–2737.
Cassel, K. (2013), Variational Methods with Applications in Science and Engineering,
Cambridge University Press, Cambridge.
Cattafesta III, L. N. and Sheplak, M. (2011), ‘Actuators for active flow control’, Annual Review
of Fluid Mechanics 43, 247–272.
Cattell, R. B. (1966), ‘The scree test for the number of factors’, Multivariate Behavioral
Research 1(2), 245–276.
Cavaliere, A. and de Joannon, M. (2004), ‘Mild combustion’, Progress in Energy and
Combustion Science 30, 329–366.
Champion, K., Lusch, B., Kutz, J. N., and Brunton, S. L. (2019), ‘Data-driven discovery of
coordinates and governing equations’, Proceedings of the National Academy of Sciences
116(45), 22445–22451.
Champion, K., Zheng, P., Aravkin, A. Y., Brunton, S. L., and Kutz, J. N. (2020), ‘A unified
sparse optimization framework to learn parsimonious physics-informed models from data’,
IEEE Access 8, 169259–169271.
Chang, H., Yeung, D.-Y., and Xiong, Y. (2004), Super-resolution through neighbor embedding,
in ‘Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and
Pattern Recognition, 2004. CVPR 2004.’, Vol. 1, IEEE, pp. I–I.
Chartrand, R. (2011), ‘Numerical differentiation of noisy, nonsmooth data’, ISRN Applied
Mathematics 2011, article ID 164564.
Chen, K. K. and Rowley, C. W. (2011), ‘H2 optimal actuator and sensor placement in the
linearised complex Ginzburg-Landau system’, Journal of Fluid Mechanics 681, 241–260.
Chen, K. K., Tu, J. H., and Rowley, C. W. (2012), ‘Variants of dynamic mode decomposi-
tion: Boundary condition, Koopman, and Fourier analyses’, Journal of Nonlinear Science
22(6), 887–915.
Choi, H., Moin, P., and Kim, J. (1994), ‘Active turbulence control for drag reduction in wall-
bounded flows’, Journal of Fluid Mechanics 262, 75–110.
Chui, C. K. (1992), An Introduction to Wavelets, Academic Press, San Diego, CA.
Cimbala, J. M., Nagib, H. M., and Roshko, A. (1988), ‘Large structure in the far wakes of
two-dimensional bluff bodies’, Journal of Fluid Mechanics 190, 265–298.
Citriniti, J. H. and George, W. K. (2000), ‘Reconstruction of the global velocity field in the
axisymmetric mixing layer utilizing the proper orthogonal decomposition’, Journal of Fluid
Mechanics 418, 137–166.
Cohen, L. (1995), Time-Frequency Analysis, Prentice Hall, Englewood Cliffs, NJ.
Colabrese, S., Gustavsson, K., Celani, A., and Biferale, L. (2017), ‘Flow navigation by smart
microswimmers via reinforcement learning’, Physical Review Letters 118(15), 158004.
Colvert, B., Alsalman, M., and Kanso, E. (2018), ‘Classifying vortex wakes using neural
networks’, Bioinspiration & Biomimetics 13(2), 025003.
Cooley, J. W. and Tukey, J. W. (1965), ‘An algorithm for the machine calculation of complex
Fourier series’, Mathematics of Computation 19(90), 297–301.
Cordier, L., El Majd, B. A., and Favier, J. (2010), ‘Calibration of POD reduced-order models
using Tikhonov regularization’, International Journal for Numerical Methods in Fluids
63(2), 269–296.
Cordier, L., Noack, B. R., Daviller, G., Delvile, J., Lehnasch, G., Tissot, G., Balajewicz, M., and
Niven, R. (2013), ‘Control-oriented model identification strategy’, Experiments in Fluids
54, Article 1580.
Corino, E. R. and Brodkey, R. S. (1969), ‘A visual investigation of the wall region in turbulent
flow’, Journal of Fluid Mechanics 37, 1.
Cornejo Maceda, G. Y., Li, Y., Lusseyran, F., Morzyński, M., and Noack, B. R. (2021),
‘Stabilization of the fluidic pinball with gradient-based machine learning control’, Journal
of Fluid Mechanics 917, A42, 1–43.
Cornejo Maceda, G., Noack, B. R., Lusseyran, F., Deng, N., Pastur, L., and Morzyński,
M. (2019), ‘Artificial intelligence control applied to drag reduction of the fluidic pinball’,
Proceedings in Applied Mathematics and Mechanics 19(1), e201900268:1–2.
Corrsin, S. (1958), Local isotropy in turbulent shear flow, Res. Memo 58B11, National Advisory
Committee for Aeronautics, Washington, DC.
Coussement, A., Gicquel, O., and Parente, A. (2013), ‘MG-local-PCA method for reduced order
combustion modeling’, Proceedings of the Combustion Institute 34(1), 1117–1123.
Coussement, A., Isaac, B. J., Gicquel, O., and Parente, A. (2016), ‘Assessment of different
chemistry reduction methods based on principal component analysis: Comparison of the
MG-PCA and score-PCA approaches’, Combustion and Flame 168, 83–97.
Coveney, P. V., Dougherty, E. R., and Highfield, R. R. (2016), ‘Big data need big theory too’,
Philosophical Transactions of the Royal Society A 374, 20160153.
Cox, T. F. and Cox, M. A. A. (2000), Multidimensional Scaling, Vol. 88 of Monographs on
Statistics and Applied Probability, 2nd ed., Chapman & Hall, London.
Cranmer, M. D., Xu, R., Battaglia, P., and Ho, S. (2019), ‘Learning symbolic physics with graph
networks’, arXiv preprint arXiv:1909.05862.
Cranmer, M., Greydanus, S., Hoyer, S., Battaglia, P., Spergel, D., and Ho, S. (2020),
‘Lagrangian neural networks’, arXiv preprint arXiv:2003.04630.
Cristianini, N. and Shawe-Taylor, J. (2000), An Introduction to Support Vector Machines and
Other Kernel-based Learning Methods, Cambridge University Press, Cambridge.
Cuoci, A., Frassoldati, A., Faravelli, T., and Ranzi, E. (2013), ‘Numerical modeling of laminar
flames with detailed kinetics based on the operator-splitting method’, Energy & Fuels
27(12), 7730–7753.
Cuoci, A., Frassoldati, A., Faravelli, T., and Ranzi, E. (2015), ‘OpenSMOKE++: An object-
oriented frame-work for the numerical modeling of reactive systems with detailed kinetic
mechanisms’, Computer Physics Communications 192, 237–264.
Dalakoti, D. K., Wehrfritz, A., Savard, B., Day, M. S., Bell, J. B., and Hawkes, E. R. (2020), ‘An
a priori evaluation of a principal component and artificial neural network based combustion
model in diesel engine conditions’, Proceedings of the Combustion Institute 38, 2701–2709.
D’Alessio, G., Attili, A., Cuoci, A., Pitsch, H., and Parente, A. (2020a), Analysis of turbulent
reacting jets via principal component analysis, in ‘Data Analysis for Direct Numerical
Simulations of Turbulent Combustion’, Springer, Cham, pp. 233–251.
D’Alessio, G., Attili, A., Cuoci, A., Pitsch, H., and Parente, A. (2020b), Unsupervised data
analysis of direct numerical simulation of a turbulent flame via local principal component
analysis and procustes analysis, in ‘15th International Workshop on Soft Computing Models
in Industrial and Environmental Applications’, September 16–18, 2020, Burgos, Spain,
Springer Nature, Switzerland, pp. 460–469.
D’Alessio, G., Cuoci, A., Aversano, G., Bracconi, M., Stagni, A., and Parente, A. (2020),
‘Impact of the partitioning method on multidimensional adaptive-chemistry simulations’,
Energies 13(10), 2567.
D’Alessio, G., Cuoci, A., and Parente, A. (2020), ‘OpenMORe: A Python framework for
reduction, clustering and analysis of reacting flow data’, SoftwareX 12, 100630.
D’Alessio, G., Parente, A., Stagni, A., and Cuoci, A. (2020), ‘Adaptive chemistry via pre-
partitioning of composition space and mechanism reduction’, Combustion and Flame
211, 68–82.
Dam, M., Brøns, M., Juul Rasmussen, J., Naulin, V., and Hesthaven, J. S. (2017), ‘Sparse
identification of a predator-prey system from simulation data of a convection model’, Physics
of Plasmas 24(2), 022310.
Daubechies, I. (1990), ‘The wavelet transform, time-frequency localization and signal analysis’,
IEEE Transactions on Information Theory 36(5), 961–1005.
Daubechies, I. (1992), Ten Lectures on Wavelets, Vol. 61, Society for Industrial and Applied
Mathematics, Philadelphia.
Daubechies, I., Devore, R., Fornasier, M., and Güntürk, C. (2010), ‘Iteratively reweighted
least squares minimization for sparse recovery’, Communications on Pure and Applied
Mathematics 63, 1–38.
de Silva, B., Higdon, D. M., Brunton, S. L., and Kutz, J. N. (2019), ‘Discovery of physics from
data: Universal laws and discrepancy models’, arXiv preprint arXiv:1906.07906.
Deane, A. E., Kevrekidis, I. G., Karniadakis, G. E., and Orszag, S. A. (1991), ‘Low-
dimensional models for complex geometry flows: Application to grooved channels and
circular cylinders’, Physics of Fluids A 3, 2337–2354.
Deb, K. and Gupta, H. (2006), ‘Introducing robustness in multi-objective optimization’,
Evolutionary Computation 14, 463–494.
Debien, A., von Krbek, K. A., Mazellier, N., Duriez, T., Cordier, L., Noack, B. R., Abel, M. W.,
and Kourta, A. (2016), ‘Closed-loop separation control over a sharp edge ramp using genetic
programming’, Experiments in Fluids 57(3), 40.
del Álamo, J. C., Jiménez, J., Zandonade, P., and Moser, R. D. (2006), ‘Self-similar vortex
clusters in the logarithmic region’, Journal of Fluid Mechanics 561, 329–358.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009), Imagenet: A large-scale
hierarchical image database, in ‘2009 IEEE Conference on Computer Vision and Pattern
Recognition’, June 20–25, 2009, Miami, FL, IEEE, pp. 248–255.
Deng, N., Noack, B. R., Morzyński, M., and Pastur, L. R. (2020), ‘Low-order model for
successive bifurcations of the fluidic pinball’, Journal of Fluid Mechanics 884, A37.
Discetti, S., Bellani, G., Örlü, R., Serpieri, J., Vila, C. S., Raiola, M., Zheng, X., Mascotelli,
L., Talamelli, A., and Ianiro, A. (2019), ‘Characterization of very-large-scale motions in
high-Re pipe flows’, Experimental Thermal and Fluid Science 104, 1–8.
Discetti, S. and Ianiro, A. (2017), Experimental Aerodynamics, CRC Press, Boca Raton, FL.
Discetti, S., Raiola, M., and Ianiro, A. (2018), ‘Estimation of time-resolved turbulent fields
through correlation of non-time-resolved field measurements and time-resolved point
measurements’, Experimental Thermal and Fluid Science 93, 119–130.
Distefano, J. (2013), Schaum’s Outline of Feedback and Control Systems, McGraw-Hill
Education, New York.
Dong, C., Loy, C. C., He, K., and Tang, X. (2014), Learning a deep convolutional network for
image super-resolution, in ‘European Conference on Computer Vision’, September 6–12,
2014, Zurich, Switzerland, Springer, Cham, pp. 184–199.
Dong, S., Lozano-Durán, A., Sekimoto, A., and Jiménez, J. (2017), ‘Coherent structures
in statistically stationary homogeneous shear turbulence’, Journal of Fluid Mechanics
816, 167–208.
Donoho, D. L. (1995), ‘De-noising by soft-thresholding’, IEEE Transactions on Information
Theory 41(3), 613–627.
Donoho, D. L. (2006), ‘Compressed sensing’, IEEE Transactions on Information Theory
52(4), 1289–1306.
Donoho, D. L. and Johnstone, J. M. (1994), ‘Ideal spatial adaptation by wavelet shrinkage’,
Biometrika 81(3), 425–455.
Dracopoulos, D. C. (1997), Evolutionary Learning Algorithms for Neural Adaptive Control,
Perspectives in Neural Computing, Springer-Verlag, London.
Drazin, P. and Reid, W. (1981), Hydrodynamic Stability, Cambridge University Press,
Cambridge, UK.
Dullerud, G. E. and Paganini, F. (2000), A Course in Robust Control Theory: A Convex
Approach, Texts in Applied Mathematics, Springer, Berlin, Heidelberg.
Duraisamy, K., Iaccarino, G., and Xiao, H. (2019), ‘Turbulence modeling in the age of data’,
Annual Reviews of Fluid Mechanics 51, 357–377.
Duriez, T., Brunton, S. L., and Noack, B. R. (2017), Machine Learning Control: Taming
Nonlinear Dynamics and Turbulence, Springer, Cham.
Duwig, C. and Iudiciani, P. (2010), ‘Extended proper orthogonal decomposition for analysis of
unsteady flames’, Flow, Turbulence and Combustion 84(1), 25.
Echekki, T., Kerstein, A. R., and Sutherland, J. C. (2011), The one-dimensional-turbulence
model, in T. Echekki & E. Mastorakos, eds., ‘Turbulent Combustion Modeling’, Springer,
Dordrecht, pp. 249–276.
Echekki, T. and Mirgolbabaei, H. (2015), ‘Principal component transport in turbulent combus-
tion: A posteriori analysis’, Combustion and Flame 162(5), 1919–1933.
Eckart, C. and Young, G. (1936), ‘The approximation of one matrix by another of lower rank’,
Psychometrika 1(3), 211–218.
Ehlert, A., Nayeri, C. N., Morzyński, M., and Noack, B. R. (2020), ‘Locally linear embedding
for transient cylinder wakes’, arXiv preprint arXiv:1906.07822 [physics.flu-dyn].
El Sayed M., Y., Semaan, R., and Radespiel, R. (2018), Sparse modeling of the lift gains of a
high-lift configuration with periodic Coanda blowing, in ‘2018 AIAA Aerospace Sciences
Meeting’, January 8–12, 2018, Kissimmee, FL, p. 1054.
Elman, J. L. (1990), ‘Finding structure in time’, Cognitive Science 14(2), 179–211.
Encinar, M. P. and Jiménez, J. (2020), ‘Momentum transfer by linearised eddies in turbulent
channel flows’, arXiv preprint arXiv:1911.06096.
Engstrom, L., Ilyas, A., Santurkar, S., Tsipras, D., Janoos, F., Rudolph, L., and Madry,
A. (2020), Implementation matters in deep RL: A case study on PPO and TRPO, in
‘International Conference on Learning Representations’ April 26–30, 2020, Addis Ababa,
Ethiopia.
Epps, B. P. and Krivitzky, E. M. (2019), ‘Singular value decomposition of noisy data: Noise
filtering’, Experiments in Fluids 60(8), 126.
Erichson, N. B., Mathelin, L., Kutz, J. N., and Brunton, S. L. (2019), ‘Randomized dynamic
mode decomposition’, SIAM Journal on Applied Dynamical Systems 18(4), 1867–1891.
Erichson, N. B., Mathelin, L., Yao, Z., Brunton, S. L., Mahoney, M. W., and Kutz, J. N. (2020),
‘Shallow neural networks for fluid flow reconstruction with limited sensors’, Proceedings of
the Royal Society A 476(2238), 20200097.
Espeholt, L., Soyer, H., Munos, R., Simonyan, K., Mnih, V., Ward, T., Doron, Y., Firoiu,
V., Harley, T., Dunning, I., Legg, S., and Kavukcuoglu, K. (2018), IMPALA: Scalable
distributed deep-RL with importance weighted actor-learner architectures, in J. Dy and
A. Krause, eds., ‘Proceedings of the 35th International Conference on Machine Learning’,
Vol. 80 of Proceedings of Machine Learning Research, PMLR, Stockholmsmässan, Stock-
holm, Sweden, pp. 1407–1416.
Esposito, C., Mendez, M., Gouriet, J., Steelant, J., and Vetrano, M. (2021), ‘Spectral and modal
analysis of a cavitating flow through an orifice’, Experimental Thermal and Fluid Science
121, 110251.
Evans, W. R. (1948), ‘Graphical analysis of control systems’, Transactions of the American
Institute of Electrical Engineers 67(1), 547–551.
Everitt, B. and Skrondal, A. (2002), The Cambridge Dictionary of Statistics, Vol. 106,
Cambridge University Press, Cambridge.
Ewing, D. and Citriniti, J. H. (1999), Examination of a LSE/POD complementary technique using single and multi-time information in the axisymmetric shear layer, in ‘IUTAM Symposium on Simulation and Identification of Organized Structures in Flows’, Kluwer Academic Publishers, Dordrecht.
Franz, T. and Held, M. (2017), Data fusion of CFD solutions and experimental aerodynamic
data, in ‘Proceedings ONERA-DLR Aerospace Symposium (ODAS)’, June 7–9, 2017, Aussois,
France.
Franz, T., Zimmermann, R., and Görtz, S. (2017), Adaptive sampling for nonlinear dimension-
ality reduction based on manifold learning, in P. Benner, M. Ohlberger, A. Patera, G. Rozza,
and K. Urban, eds., Model Reduction of Parametrized Systems, Springer, Cham, pp. 255–
269.
Franz, T., Zimmermann, R., Görtz, S., and Karcher, N. (2014), ‘Interpolation-based reduced-
order modelling for steady transonic flows via manifold learning’, International Journal of
Computational Fluid Dynamics 28(3–4), 106–121.
Frassoldati, A., Faravelli, T., and Ranzi, E. (2003), ‘Kinetic modeling of the interactions
between NO and hydrocarbons at high temperature’, Combustion and Flame 135, 97–112.
Freeman, W. T., Jones, T. R., and Pasztor, E. C. (2002), ‘Example-based super-resolution’, IEEE
Computer Graphics and Applications 22(2), 56–65.
Freymuth, P. (1966), ‘On transition in a separated laminar boundary layer’, Journal of Fluid
Mechanics 25, 683–704.
Friedman, J. (1991), ‘Multivariate adaptive regression splines’, The Annals of Statistics
19(1), 1–67.
Fu, J., Luo, K., and Levine, S. (2018), Learning robust rewards with adversarial inverse
reinforcement learning, in ‘6th International Conference on Learning Representations, ICLR
2018, Vancouver, BC, Canada, April 30–May 3, 2018, Conference Track Proceedings’,
OpenReview.net.
Fukami, K., Fukagata, K., and Taira, K. (2018), ‘Super-resolution reconstruction of turbulent
flows with machine learning’, arXiv preprint arXiv:1811.11328.
Gabor, D. (1946), ‘Theory of communication. Part 1: The analysis of information’, Journal
of the Institution of Electrical Engineers-Part III: Radio and Communication Engineering
93(26), 429–441.
Gad-el-Hak, M. (2007), Flow Control: Passive, Active, and Reactive Flow Management,
Cambridge University Press, Cambridge.
Galerkin, B. G. (1915), ‘Rods and plates: Series occurring in various questions regarding the elastic equilibrium of rods and plates (translated)’, Vestn. Inzhen. 19, 897–908.
Galletti, G., Bruneau, C. H., Zannetti, L., and Iollo, A. (2004), ‘Low-order modelling of laminar
flow regimes past a confined square cylinder’, Journal of Fluid Mechanics 503, 161–170.
Gardiner, W. C., Lissianski, V. V., Qin, Z., Smith, G. P., Golden, D. M., Frenklach, M., Moriarty,
N. W., Eiteneer, B., Goldenberg, M., Bowman, C. T., Hanson, R. K., and Song Jr., S. (2012),
‘GRI-Mech 3.0’.
Gardner, E. (1988), ‘The space of interactions in neural network models’, Journal of Physics A:
Mathematical and General 21(1), 257.
Garnier, P., Viquerat, J., Rabault, J., Larcher, A., Kuhnle, A., and Hachem, E. (2019), ‘A review
on deep reinforcement learning for fluid mechanics’, arXiv preprint arXiv:1908.04127.
Gaster, M., Kit, E., and Wygnanski, I. (1985), ‘Large-scale structures in a forced turbulent
mixing layer’, Journal of Fluid Mechanics 150, 23–39.
Gautier, N., Aider, J.-L., Duriez, T., Noack, B., Segond, M., and Abel, M. (2015), ‘Closed-loop
separation control using machine learning’, Journal of Fluid Mechanics 770, 442–457.
Gazzola, M., Hejazialhosseini, B., and Koumoutsakos, P. (2014), ‘Reinforcement learning and
wavelet adapted vortex methods for simulations of self-propelled swimmers’, SIAM Journal
on Scientific Computing 36(3), B622–B639.
Gazzola, M., Tchieu, A. A., Alexeev, D., de Brauer, A., and Koumoutsakos, P. (2016), ‘Learning
to school in the presence of hydrodynamic interactions’, Journal of Fluid Mechanics
789, 726–749.
Gerhard, J., Pastoor, M., King, R., Noack, B. R., Dillmann, A., Morzyński, M., and Tadmor, G.
(2003), Model-based control of vortex shedding using low-dimensional Galerkin models, in
‘33rd AIAA Fluids Conference and Exhibit’, Orlando, FL, June 23–26, 2003. Paper 2003-
4262.
Germano, M., Piomelli, U., Moin, P., and Cabot, W. H. (1991), ‘A dynamic subgrid-scale eddy
viscosity model’, Physics of Fluids A: Fluid Dynamics 3(7), 1760–1765.
Giannetti, F. and Luchini, P. (2007), ‘Structural sensitivity of the first instability of the cylinder
wake’, Journal of Fluid Mechanics 581, 167–197.
Glaz, B., Liu, L., and Friedmann, P. P. (2010), ‘Reduced-order nonlinear unsteady aerodynamic
modeling using a surrogate-based recurrence framework’, AIAA Journal 48(10), 2418–2429.
Goldstein, T. and Osher, S. (2009), ‘The split Bregman method for L1-regularized problems’,
SIAM Journal on Imaging Sciences 2(2), 323–343.
Gonzalez-Garcia, R., Rico-Martinez, R., and Kevrekidis, I. (1998), ‘Identification of distributed
parameter systems: A neural net based approach’, Computers & Chemical Engineering
22, S965–S968.
Gonzalez, R. C. and Woods, R. E. (2017), Digital Image Processing, 4th ed., Pearson, New
York.
Goodfellow, I., Bengio, Y., and Courville, A. (2016), Deep Learning, MIT Press, Cambridge,
MA.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville,
A., and Bengio, Y. (2014), Generative adversarial nets, in ‘Advances in Neural Information
Processing Systems’, December 8–13, 2014, Montreal, Canada, pp. 2672–2680.
Görtz, S., Ilic, C., Abu-Zurayk, M., Liepelt, R., Jepsen, J., Führer, T., Becker, R., Scherer,
J., Kier, T., and Siggel, M. (2016), Collaborative multi-level MDO process development
and application to long-range transport aircraft, in ‘Proceedings of the 30th Congress of
the International Council of the Aeronautical Sciences (ICAS)’, September 25–30, 2016,
Daejeon, Korea.
Gray, R. M. (2005), ‘Toeplitz and circulant matrices: A review’, Foundations and Trends in Communications and Information Theory 2(3), 155–239.
Greydanus, S., Dzamba, M., and Yosinski, J. (2019), ‘Hamiltonian neural networks’, Advances
in Neural Information Processing Systems 32, 15379–15389.
Griewank, A. and Walther, A. (2000), ‘Algorithm 799: Revolve: An implementation of
checkpointing for the reverse or adjoint mode of computational differentiation’, ACM
Transactions on Mathematical Software 26(1), 19–45.
Grossmann, A. and Morlet, J. (1984), ‘Decomposition of Hardy functions into square integrable
wavelets of constant shape’, SIAM Journal on Mathematical Analysis 15(4), 723–736.
Gu, S., Lillicrap, T., Sutskever, I., and Levine, S. (2016), Continuous deep Q-learning with model-based acceleration, in ‘Proceedings of the 33rd International Conference on Machine Learning’, Vol. 48 of Proceedings of Machine Learning Research, PMLR, New York, pp. 2829–2838.
Guan, Y., Brunton, S. L., and Novosselov, I. (2020), ‘Sparse nonlinear models of chaotic
electroconvection’, arXiv preprint arXiv:2009.11862.
Guastoni, L., Güemes, A., Ianiro, A., Discetti, S., Schlatter, P., Azizpour, H., and Vinuesa,
R. (2020), ‘Convolutional-network models to predict wall-bounded turbulence from wall
quantities’, arXiv preprint arXiv:2006.12483.
Hasselt, H. v., Guez, A., and Silver, D. (2016), Deep reinforcement learning with double
Q-learning, in ‘Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence’,
AAAI 16, February 12–17, 2016, Phoenix, AZ, AAAI Press, Palo Alto, CA, pp. 2094–2100.
Hastie, T., Tibshirani, R., and Friedman, J. (2009), The Elements of Statistical Learning, Vol. 2,
Springer, New York.
Hayes, M. (2011), Schaum’s Outline of Digital Signal Processing, McGraw-Hill Education, New
York.
Hemati, M., Rowley, C., Deem, E., and Cattafesta, L. (2017), ‘De-biasing the dynamic mode
decomposition for applied Koopman spectral analysis’, Theoretical and Computational
Fluid Dynamics 31(4), 349–368.
Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., and Meger, D. (2018), Deep
reinforcement learning that matters, in S. A. McIlraith and K. Q. Weinberger, eds.,
‘Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18),
the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI
Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans,
Louisiana, USA, February 2–7, 2018’, AAAI Press, Palo Alto, CA, pp. 3207–3214.
Henningson, D. (2010), ‘Description of complex flow behaviour using global and dynamic
modes’, Journal of Fluid Mechanics 656, 1–4.
Herbert, T. (1988), ‘Secondary instability of boundary layers’, Annual Review of Fluid
Mechanics 20, 487–526.
Hernán, M. A. and Jiménez, J. (1982), ‘Computer analysis of a high-speed film of the plane
turbulent mixing layer’, Journal of Fluid Mechanics 119, 323–345.
Hervé, A., Sipp, D., Schmid, P. J., and Samuelides, M. (2012), ‘A physics-based approach to
flow control using system identification’, Journal of Fluid Mechanics 702, 26–58.
Hill, D. (1995), ‘Adjoint systems and their role in the receptivity problem for boundary layers’,
Journal of Fluid Mechanics 292, 183–204.
Hinton, G. E. and Salakhutdinov, R. R. (2006), ‘Reducing the dimensionality of data with neural networks’, Science
313(5786), 504–507.
Hinton, G. E. and Sejnowski, T. J. (1986), ‘Learning and relearning in Boltzmann machines’, in D. E. Rumelhart and J. L. McClelland, eds., ‘Parallel Distributed Processing: Explorations in the Microstructure of Cognition’, Vol. 1, MIT Press, Cambridge, MA, pp. 282–317.
Hinze, M. and Kunisch, K. (2000), ‘Three control methods for time-dependent fluid flow’, Flow,
Turbulence and Combustion 65, 273–298.
Ho, B. L. and Kalman, R. E. (1965), Effective construction of linear state-variable models from
input/output data, in ‘Proceedings of the 3rd Annual Allerton Conference on Circuit and
System Theory’, October 20–22, 1965, Monticello, IL, pp. 449–459.
Ho, J. and Ermon, S. (2016), Generative adversarial imitation learning, in D. D. Lee,
M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, eds., ‘Advances in Neural
Information Processing Systems 29’, December 5–10, 2016, Barcelona, Spain, Curran
Associates, Red Hook, NY, pp. 4565–4573.
Hochbruck, M. and Ostermann, A. (2010), ‘Exponential integrators’, Acta Numerica 19, 209–
286.
Hochreiter, S. and Schmidhuber, J. (1997), ‘Long short-term memory’, Neural Computation
9(8), 1735–1780.
Hof, B., Van Doorne, C. W., Westerweel, J., Nieuwstadt, F. T., Faisst, H., Eckhardt, B., Wedin,
H., Kerswell, R. R., and Waleffe, F. (2004), ‘Experimental observation of nonlinear traveling
waves in turbulent pipe flow’, Science 305(5690), 1594–1598.
Hoffmann, M., Fröhner, C., and Noé, F. (2019), ‘Reactive SINDy: Discovering governing
reactions from concentration data’, Journal of Chemical Physics 150, 025101.
Holland, J. H. (1975), Adaptation in Natural and Artificial Systems: An Introductory Analysis
with Applications to Biology, Control, and Artificial Intelligence, University of Michigan
Press, Ann Arbor.
Holmes, P., Lumley, J. L., Berkooz, G., and Rowley, C. W. (2012), Turbulence, Coherent
Structures, Dynamical Systems and Symmetry, 2nd paperback ed., Cambridge University
Press, Cambridge.
Hopf, E. (1948), ‘A mathematical example displaying features of turbulence’, Communications
on Pure and Applied Mathematics 1(4), 303–322.
Hopfield, J. J. (1982), ‘Neural networks and physical systems with emergent collective
computational abilities’, Proceedings of the National Academy of Sciences 79(8), 2554–
2558.
Horn, R. A. and Johnson, C. R. (2012), Matrix Analysis, Cambridge University Press,
Cambridge.
Hornik, K., Stinchcombe, M., and White, H. (1989), ‘Multilayer feedforward networks are
universal approximators’, Neural Networks 2(5), 359–366.
Hosseini, Z., Martinuzzi, R. J., and Noack, B. R. (2015), ‘Sensor-based estimation of the
velocity in the wake of a low-aspect-ratio pyramid’, Experiments in Fluids 56(1), 13.
Hou, W., Darakananda, D., and Eldredge, J. (2019), Machine learning based detection of flow
disturbances using surface pressure measurements, in ‘AIAA Scitech 2019 Forum’, January
7–11, 2019, San Diego, CA, p. 1148.
Hoyas, S. and Jiménez, J. (2006), ‘Scaling of the velocity fluctuations in turbulent channels up
to Reτ = 2003’, Physics of Fluids 18, 011702.
Hsu, H. (2013), Schaum’s Outline of Signals and Systems, 3rd ed. (Schaum’s Outlines),
McGraw-Hill Education, New York.
Huang, Z., Du, W., and Chen, B. (2005), Deriving private information from randomized data,
in ‘Proceedings of the 2005 ACM SIGMOD International Conference on Management of
Data’, June 14–16, 2005, Baltimore, MD, pp. 37–48.
Hwangbo, J., Lee, J., Dosovitskiy, A., Bellicoso, D., Tsounis, V., Koltun, V., and Hutter,
M. (2019), ‘Learning agile and dynamic motor skills for legged robots’, arXiv preprint
arXiv:1901.08652.
Illingworth, S. J. (2016), ‘Model-based control of vortex shedding at low Reynolds numbers’,
Theoretical and Computational Fluid Dynamics 30(5), 1–20.
Illingworth, S. J., Morgans, A. S., and Rowley, C. W. (2010), ‘Feedback control of flow reso-
nances using balanced reduced-order models’, Journal of Sound and Vibration 330(8), 1567–
1581.
Illingworth, S. J., Morgans, A. S., and Rowley, C. W. (2012), ‘Feedback control of cavity flow
oscillations using simple linear models’, Journal of Fluid Mechanics 709, 223–248.
Illingworth, S. J., Naito, H., and Fukagata, K. (2014), ‘Active control of vortex shedding: An
explanation of the gain window’, Physical Review E 90(4), 043014.
Ingle, V. K. and Proakis, J. G. (2011), Digital Signal Processing Using MATLAB, Cengage
Learning, Boston.
Isaac, B. J., Coussement, A., Gicquel, O., Smith, P. J., and Parente, A. (2014), ‘Reduced-order
PCA models for chemical reacting flows’, Combustion and Flame 161(11), 2785–2800.
Isaac, B. J., Thornock, J. N., Sutherland, J., Smith, P. J., and Parente, A. (2015), ‘Advanced
regression methods for combustion modelling using principal components’, Combustion and
Flame 162(6), 2592–2601.
Jackson, C. P. (1987), ‘A finite-element study of the onset of vortex shedding in flow past
variously shaped bodies’, Journal of Fluid Mechanics 182, 23–45.
Jaderberg, M., Mnih, V., Czarnecki, W. M., Schaul, T., Leibo, J. Z., Silver, D., and Kavukcuoglu,
K. (2017), Reinforcement learning with unsupervised auxiliary tasks, in ‘5th International
Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017,
Conference Track Proceedings’, OpenReview.net.
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013), An Introduction to Statistical
Learning, Springer, New York.
Jho, H. (2018), ‘Beautiful physics: Re-vision of aesthetic features of science through the
literature review’, The Journal of the Korean Physical Society 73, 401–413.
Jiménez, J. (2002), Turbulence, in ‘Perspectives in Fluid Dynamics’, Cambridge University
Press, Cambridge.
Jiménez, J. (2013), ‘How linear is wall-bounded turbulence?’, Physics of Fluids 25, 110814.
Jiménez, J. (2018a), ‘Coherent structures in wall-bounded turbulence’, Journal of Fluid
Mechanics 842, P1.
Jiménez, J. (2018b), ‘Machine-aided turbulence theory’, Journal of Fluid Mechanics 854, R1.
Jiménez, J. (2020a), ‘Computers and turbulence’, The European Journal of Mechanics -
B/Fluids 79, 1–11.
Jiménez, J. (2020b), ‘Monte Carlo science’, Journal of Turbulence 21(9–10), 544–566.
Jiménez, J., Kawahara, G., Simens, M. P., Nagata, M., and Shiba, M. (2005), ‘Characterization
of near-wall turbulence in terms of equilibrium and “bursting” solutions’, Physics of Fluids
17, 015105.
Jiménez, J. and Moin, P. (1991), ‘The minimal flow unit in near-wall turbulence’, Journal of
Fluid Mechanics 225, 213–240.
Jiménez, J. and Pinelli, A. (1999), ‘The autonomous cycle of near-wall turbulence’, Journal of
Fluid Mechanics 389, 335–359.
Jiménez, J. and Simens, M. P. (2001), ‘Low-dimensional dynamics of a turbulent wall flow’,
Journal of Fluid Mechanics 435, 81–91.
Jiménez, J., Wray, A. A., Saffman, P. G., and Rogallo, R. S. (1993), ‘The structure of intense
vorticity in isotropic turbulence’, Journal of Fluid Mechanics 255, 65–90.
Johannink, T., Bahl, S., Nair, A., Luo, J., Kumar, A., Loskyll, M., Ojea, J. A., Solowjow, E., and
Levine, S. (2019), Residual reinforcement learning for robot control, in ‘2019 International
Conference on Robotics and Automation (ICRA)’, May 20–24, 2019, Montreal, Canada,
IEEE, pp. 6023–6029.
Jolliffe, I. (2002), Principal Component Analysis, Springer-Verlag, New York.
Jordan, D. and Smith, P. (1988), Nonlinear Ordinary Differential Equations, Clarendon Press,
Oxford.
Joseph, D. (1976), Stability of Fluid Motions I, Springer Verlag, New York.
Joseph, D. and Carmi, S. (1969), ‘Stability of Poiseuille flow in pipes, annuli and channels’,
Quarterly of Applied Mathematics 26, 575–599.
Jovanović, M. R., Schmid, P. J., and Nichols, J. W. (2014), ‘Sparsity-promoting dynamic mode
decomposition’, Physics of Fluids 26(2), 024103.
Juang, J. N. (1994), Applied System Identification, Prentice Hall PTR, Upper Saddle River, NJ.
Juang, J. N. and Pappa, R. S. (1985), ‘An eigensystem realization algorithm for modal parameter
identification and model reduction’, Journal of Guidance, Control, and Dynamics 8(5), 620–
627.
Juang, J. N., Phan, M., Horta, L. G., and Longman, R. W. (1991), Identification of
observer/Kalman filter Markov parameters: Theory and experiments, Technical Memoran-
dum 104069, NASA.
Kaiser, E., Kutz, J. N., and Brunton, S. L. (2018), ‘Sparse identification of nonlinear dynamics
for model predictive control in the low-data limit’, Proceedings of the Royal Society of
London A 474(2219), 20180335.
Kaiser, E., Li, R., and Noack, B. R. (2017), On the control landscape topology, in ‘The 20th
World Congress of the International Federation of Automatic Control (IFAC)’, Toulouse,
France, pp. 1–4.
Kaiser, E., Noack, B. R., Cordier, L., Spohn, A., Segond, M., Abel, M., Daviller, G., Osth, J.,
Krajnovic, S., and Niven, R. K. (2014), ‘Cluster-based reduced-order modelling of a mixing
layer’, Journal of Fluid Mechanics 754, 365–414.
Kaiser, E., Noack, B. R., Spohn, A., Cattafesta, L. N., and Morzyński, M. (2017a), ‘Cluster-
based control of a separating flow over a smoothly contoured ramp’, Theoretical and
Computational Fluid Dynamics 31(5–6), 579–593.
Kaiser, E., Noack, B. R., Spohn, A., Cattafesta, L. N., and Morzyński, M. (2017b), ‘Cluster-
based control of nonlinear dynamics’, Theoretical and Computational Fluid Dynamics 31(5–
6), 579–593.
Kaiser, G. (2010), A Friendly Guide to Wavelets, Springer Science & Business Media, Boston.
Kaiser, H. F. (1958), ‘The varimax criterion for analytic rotation in factor analysis’, Psychome-
trika 23(3), 187–200.
Kambhatla, N. and Leen, T. K. (1997), ‘Dimension reduction by local principal component
analysis’, Neural Computation 9(7), 1493–1516.
Kaptanoglu, A. A., Morgan, K. D., Hansen, C. J., and Brunton, S. L. (2020),
‘Physics-constrained, low-dimensional models for MHD: First-principles and data-driven
approaches’, arXiv preprint arXiv:2004.10389.
Kawahara, G. (2005), ‘Laminarization of minimal plane Couette flow: Going beyond the basin
of attraction of turbulence’, Physics of Fluids 17, 041702.
Kawahara, G., Uhlmann, M., and van Veen, L. (2012), ‘The significance of simple invariant
solutions in turbulent flows’, Annual Review of Fluid Mechanics 44, 203–225.
Kee, R. J., Coltrin, M. E., and Glarborg, P. (2005), Chemically Reacting Flow: Theory and
Practice, John Wiley & Sons, Hoboken, NJ.
Keefe, L., Moin, P., and Kim, J. (1992), ‘The dimension of attractors underlying periodic
turbulent Poiseuille flow’, Journal of Fluid Mechanics 242, 1–29.
Kenney, J. and Keeping, E. (1951), Mathematics of Statistics, Vol. II, D. Van Nostrand Co., New
York.
Kerhervé, F., Roux, S., and Mathis, R. (2017), ‘Combining time-resolved multi-point and
spatially-resolved measurements for the recovering of very-large-scale motions in high
Reynolds number turbulent boundary layer’, Experimental Thermal and Fluid Science
82, 102–115.
Kerstein, A. R. (1999), ‘One-dimensional turbulence: Model formulation and application
to homogeneous turbulence, shear flows, and buoyant stratified flows’, Journal of Fluid
Mechanics 392, 277–334.
Kerstens, W., Pfeiffer, J., Williams, D., King, R., and Colonius, T. (2011), ‘Closed-loop
control of lift for longitudinal gust suppression at low Reynolds numbers’, AIAA Journal
49(8), 1721–1728.
Khalil, H. K. (2002), Nonlinear Systems, 3rd ed., Prentice Hall, Upper Saddle River, NJ.
Khalil, H. K. and Grizzle, J. W. (2002), Nonlinear Systems, Vol. 3, Prentice Hall, Upper Saddle
River, NJ.
Kim, H. J., Jordan, M. I., Sastry, S., and Ng, A. Y. (2004), Autonomous helicopter flight via
reinforcement learning, in ‘Advances in Neural Information Processing Systems’, December
8–13, 2003, Vancouver and Whistler, British Columbia, Canada, pp. 799–806.
Kim, H. T., Kline, S. J., and Reynolds, W. C. (1971), ‘The production of turbulence near a
smooth wall in a turbulent boundary layer’, Journal of Fluid Mechanics 50, 133–160.
Kim, J. (2011), ‘Physics and control of wall turbulence for drag reduction’, Philosophical
Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
369(1940), 1396–1411.
Kim, J. and Bewley, T. (2007), ‘A linear systems approach to flow control’, Annual Review of
Fluid Mechanics 39, 383–417.
King, P., Rowland, J., Aubrey, W., Liakata, M., Markham, M., Soldatova, L. N., Whelan, K. E.,
Clare, A., Young, M., Sparkes, A., Oliver, S. G., and Pir, P. (2009), ‘The robot scientist
Adam’, Computer 42(7), 46–54.
Kingma, D. P. and Ba, J. (2014), ‘Adam: A method for stochastic optimization’, arXiv preprint
arXiv:1412.6980.
Kline, S. J., Reynolds, W. C., Schraub, F. A., and Runstadler, P. W. (1967), ‘Structure of
turbulent boundary layers’, Journal of Fluid Mechanics 30, 741–773.
Knaak, M., Rothlubbers, C., and Orglmeister, R. (1997), A Hopfield neural network for flow
field computation based on particle image velocimetry/particle tracking velocimetry image
sequences, in ‘International Conference on Neural Networks, 1997’, Vol. 1, October 8–10,
1997, Lausanne, Switzerland, IEEE, pp. 48–52.
Knight, W. (2018), ‘Google just gave control over data center cooling to an AI’,
www.technologyreview.com/s/611902/google-just-gave-control-over-data-
center-cooling-to-an-ai/.
Koch, W., Bertolotti, F., Stolte, A., and Hein, S. (2000), ‘Nonlinear equilibrium solutions
in a three-dimensional boundary layer and their secondary instability’, Journal of Fluid
Mechanics 406, 131–174.
Kochenderfer, M. and Wheeler, T. (2019), Algorithms for Optimization, MIT Press, Cambridge,
MA.
Kolmogorov, A. N. (1941), ‘The local structure of turbulence in incompressible viscous fluid
for very large Reynolds numbers’, Doklady Akademii Nauk SSSR 30, 299–303.
Kot, M. (2015), A First Course in the Calculus of Variations, American Mathematical Society,
Providence, RI.
Koza, J. R. (1992), Genetic Programming: On the Programming of Computers by Means of
Natural Selection, The MIT Press, Cambridge, MA.
Kraichnan, R. H. (1967), ‘Inertial ranges in two-dimensional turbulence’, Physics of Fluids
10, 1417–1423.
Kraichnan, R. H. (1971), ‘Inertial range transfer in two- and three-dimensional turbulence’,
Journal of Fluid Mechanics 47, 525–535.
Kraichnan, R. H. and Montgomery, D. (1980), ‘Two-dimensional turbulence’, Reports on
Progress in Physics 43, 547–619.
Kramer, B., Grover, P., Boufounos, P., Benosman, M., and Nabi, S. (2015), ‘Sparse sensing and
DMD based identification of flow regimes and bifurcations in complex flows’, arXiv preprint
arXiv:1510.02831.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012), Imagenet classification with deep
convolutional neural networks, in ‘Advances in Neural Information Processing Systems’,
December 3–6, 2012, Lake Tahoe, NV, pp. 1097–1105.
Kroll, N., Abu-Zurayk, M., Dimitrov, D., Franz, T., Führer, T., Gerhold, T., Görtz, S., Heinrich,
R., Ilic, C., Jepsen, J., Jägersküpper, J., Kruse, M., Krumbein, A., Langer, S., Liu, D.,
Liepelt, R., Reimer, L., Ritter, M., Schwöppe, A., Scherer, J., Spiering, F., Thormann,
R., Togiti, V., Vollmer, D., and Wendisch, J.-H. (2015), ‘DLR project digital-x: Towards
virtual aircraft design and flight testing based on high-fidelity methods’, CEAS Aeronautical
Journal 7(1), 3–27.
Kroll, N., Langer, S., and Schwöppe, A. (2014), The DLR flow solver TAU – Status and recent
algorithmic developments, in ‘52nd Aerospace Sciences Meeting’, January 13–17, 2014,
National Harbor, MD, American Institute of Aeronautics and Astronautics, Reston, VA.
Kuhn, T. S. (1970), The Structure of Scientific Revolutions, 2nd ed., University of Chicago Press, Chicago.
Kuhnle, A., Schaarschmidt, M., and Fricke, K. (2017), ‘Tensorforce: a TensorFlow library for
applied reinforcement learning’, https://tensorforce.readthedocs.io/en/latest/#.
Kuipers, L. and Niederreiter, H. (2005), Uniform Distribution of Sequences, Dover, Mineola,
NY, p. 129.
Kuo, K. and Acharya, R. (2012), Fundamentals of Turbulent and Multi-Phase Combustion,
Wiley, New York.
Kutz, J., Brunton, S., Brunton, B., and Proctor, J. (2016), Dynamic Mode Decomposition: Data-
Driven Modeling of Complex Systems, Society for Industrial and Applied Mathematics,
Philadelphia.
Kutz, J. N. (2017), ‘Deep learning in fluid dynamics’, Journal of Fluid Mechanics 814, 1–4.
Kutz, J. N., Fu, X., and Brunton, S. L. (2016), ‘Multiresolution dynamic mode decomposition’,
SIAM Journal on Applied Dynamical Systems 15(2), 713–735.
Labonté, G. (1999), ‘A new neural network for particle-tracking velocimetry’, Experiments in
Fluids 26(4), 340–346.
Lagaris, I. E., Likas, A., and Fotiadis, D. I. (1998), ‘Artificial neural networks for solving ordi-
nary and partial differential equations’, IEEE Transactions on Neural Networks 9(5), 987–
1000.
Lai, Z. and Nagarajaiah, S. (2019), ‘Sparse structural system identification method for
nonlinear dynamic systems with hysteresis/inelastic behavior’, Mechanical Systems and
Signal Processing 117, 813–842.
Lall, S., Marsden, J. E., and Glavaški, S. (1999), ‘Empirical model reduction of controlled
nonlinear systems’, International Federation of Automatic Control 32(2), 2598–2603.
Lall, S., Marsden, J. E., and Glavaški, S. (2002), ‘A subspace approach to balanced truncation
for model reduction of nonlinear control systems’, International Journal of Robust and
Nonlinear Control 12(6), 519–535.
Lan, Y. and Mezić, I. (2013), ‘Linearization in the large of nonlinear systems and Koopman
operator spectrum’, Physica D: Nonlinear Phenomena 242(1), 42–53.
Landau, L. D. (1944), ‘On the problem of turbulence’, C.R. Acad. Sci. USSR 44, 311–314.
Landau, L. D. and Lifshitz, E. M. (1987), Fluid Mechanics, Vol. 6 in Course of Theoretical
Physics, 2nd English ed., Pergamon Press, Oxford.
Lasota, A. and Mackey, M. (1994), Chaos, Fractals, and Noise: Stochastic Aspects of Dynamics,
Springer Verlag, New York.
Law, C. K. (2010), Combustion Physics, Cambridge University Press, Cambridge.
Le Clainche, S. and Vega, J. M. (2017), ‘Higher order dynamic mode decomposition to identify
and extrapolate flow patterns’, Physics of Fluids 29(8), 084102.
Leclercq, C., Demourant, F., Poussot-Vassal, C., and Sipp, D. (2019), ‘Linear iterative method
for closed-loop control of quasiperiodic flows’, Journal of Fluid Mechanics 868, 26–65.
LeCun, Y., Bengio, Y., and Hinton, G. (2015), ‘Deep learning’, Nature 521(7553), 436–444.
Lee, C., Kim, J., Babcock, D., and Goodman, R. (1997), ‘Application of neural networks to
turbulence control for drag reduction’, Physics of Fluids 9(6), 1740–1747.
Lee, J.-H. and Sung, H. J. (2013), ‘Comparison of very-large-scale motions of turbulent pipe
and boundary layer simulations’, Physics of Fluids 25, 045103.
Lee, Y., Yang, H., and Yin, Z. (2017), ‘PIV-DCNN: Cascaded deep convolutional neural
networks for particle image velocimetry’, Experiments in Fluids 58(12), 171.
Leonard, A. (1980), ‘Vortex methods for flow simulation’, Journal of Computational Physics
37(3), 289–335.
Li, H., Fernex, D., Semaan, R., Tan, J., Morzyński, M., and Noack, B. R. (2021), ‘Cluster-based
network model’, Journal of Fluid Mechanics 906, A21:1–41.
Li, Q., Dietrich, F., Bollt, E. M., and Kevrekidis, I. G. (2017), ‘Extended dynamic mode
decomposition with dictionary learning: A data-driven adaptive spectral decomposition
of the Koopman operator’, Chaos: An Interdisciplinary Journal of Nonlinear Science
27(10), 103111.
Li, R., Noack, B. R., Cordier, L., Borée, J., Kaiser, E., and Harambat, F. (2018), ‘Linear genetic
programming control for strongly nonlinear dynamics with frequency crosstalk’, Archives of
Mechanics 70(6), 505–534.
Li, Y., Perlman, E., Wan, M., Yang, Y., Meneveau, C., Burns, R., Chen, S., Szalay, A., and
Eyink, G. (2008), ‘A public turbulence database cluster and applications to study Lagrangian
evolution of velocity increments in turbulence’, Journal of Turbulence 9, N31.
Liberty, E., Woolfe, F., Martinsson, P.-G., Rokhlin, V., and Tygert, M. (2007), ‘Randomized
algorithms for the low-rank approximation of matrices’, Proceedings of the National
Academy of Sciences 104(51), 20167–20172.
Lin, C.-J. (2007), ‘Projected gradient methods for nonnegative matrix factorization’, Neural
Computation 19(10), 2756–2779.
Ling, J., Jones, R., and Templeton, J. (2016), ‘Machine learning strategies for systems with
invariance properties’, Journal of Computational Physics 318, 22–35.
Ling, J., Kurzawski, A., and Templeton, J. (2016), ‘Reynolds averaged turbulence modelling
using deep neural networks with embedded invariance’, Journal of Fluid Mechanics
807, 155–166.
Ling, J. and Templeton, J. (2015), ‘Evaluation of machine learning algorithms for prediction
of regions of high Reynolds averaged Navier Stokes uncertainty’, Physics of Fluids
27(8), 085103.
Liu, J. T. C. (1989), ‘Coherent structures in transitional and turbulent free shear flows’, Annual
Review of Fluid Mechanics 21(1), 285–315.
Ljung, L. (1999), System Identification: Theory for the User, Prentice Hall, Upper Saddle River,
NJ.
Ljung, L. (2008), ‘Perspectives on system identification’, IFAC Proceedings Volumes
41(2), 7172–7184.
Ljung, L. and Glad, T. (1994), Modeling of Dynamic Systems, Prentice Hall, Englewood Cliffs,
NJ.
Loan, C. V. (1992), Computational Frameworks for the Fast Fourier Transform, Society for
Industrial and Applied Mathematics (SIAM), Philadelphia.
Loiseau, J.-C. (2019), ‘Data-driven modeling of the chaotic thermal convection in an annular
thermosyphon’, arXiv preprint arXiv:1911.07920.
Loiseau, J.-C. and Brunton, S. L. (2018), ‘Constrained sparse Galerkin regression’, Journal of
Fluid Mechanics 838, 42–67.
Loiseau, J.-C., Brunton, S. L., and Noack, B. R. (2021), From the POD-Galerkin method to
sparse manifold models, in P. Benner, S. Grivet-Talocia, A. Quarteroni, G. Rozza, W. Schilders, and L. M. Silveira, eds., Handbook of Model-Order Reduction. Volume 3: Applications, De Gruyter, Berlin, pp. 279–320.
Loiseau, J.-C., Noack, B. R., and Brunton, S. L. (2018), ‘Sparse reduced-order modeling:
Sensor-based dynamics to full-state estimation’, Journal of Fluid Mechanics 844, 459–490.
Lorenz, E. (1956), Empirical orthogonal functions and statistical weather prediction, Statistical
forecasting project, Scientific Report No. 1, MIT, Department of Meteorology, Cambridge,
MA.
Lorenz, E. N. (1963), ‘Deterministic nonperiodic flow’, Journal of the Atmospheric Sciences
20(2), 130–141.
Lozano-Durán, A., Flores, O., and Jiménez, J. (2012), ‘The three-dimensional structure of
momentum transfer in turbulent channels’, Journal of Fluid Mechanics 694, 100–130.
Lozano-Durán, A. and Jiménez, J. (2014), ‘Time-resolved evolution of coherent structures in
turbulent channels: Characterization of eddies and cascades’, Journal of Fluid Mechanics
759, 432–471.
Lu, L., Meng, X., Mao, Z., and Karniadakis, G. E. (2019), ‘DeepXDE: A deep learning library
for solving differential equations’, arXiv preprint arXiv:1907.04502.
Lu, S. S. and Willmarth, W. W. (1973), ‘Measurements of the structure of the Reynolds stress
in a turbulent boundary layer’, Journal of Fluid Mechanics 60, 481–511.
Lu, T. and Law, C. K. (2009), ‘Toward accommodating realistic fuel chemistry in large-scale
computations’, Progress in Energy and Combustion Science 35(2), 192–215.
Luchtenburg, D. M., Günther, B., Noack, B. R., King, R., and Tadmor, G. (2009), ‘A generalized
mean-field model of the natural and actuated flows around a high-lift configuration’, Journal
of Fluid Mechanics 623, 283–316.
Luchtenburg, D. M. and Rowley, C. W. (2011), ‘Model reduction using snapshot-based
realizations’, Bulletin of the American Physical Society 56(18), 37pp.
Luchtenburg, D. M., Schlegel, M., Noack, B. R., Aleksić, K., King, R., Tadmor, G., and
Günther, B. (2010), Turbulence control based on reduced-order models and nonlinear
control design, in R. King, ed., Active Flow Control II, Vol. 108 of Notes on Numerical
Fluid Mechanics and Multidisciplinary Design, May 26–28, 2010, Springer-Verlag, Berlin,
pp. 341–356.
Luhar, M., Sharma, A. S., and McKeon, B. J. (2014), ‘Opposition control within the resolvent
analysis framework’, Journal of Fluid Mechanics 749, 597–626.
Lumley, J. (1970), Stochastic Tools in Turbulence, Academic Press, New York.
Lumley, J. L. (1967), The structure of inhomogeneous turbulent flows, in A. M. Yaglom and V. I.
Tatarsky, eds., ‘Proceedings of the International Colloquium on the Fine Scale Structure of
the Atmosphere and Its Influence on Radio Wave Propagation’, Doklady Akademii Nauk
SSSR, Moscow, Nauka.
Lumley, J. L. and Poje, A. (1997), ‘Low-dimensional models for flows with density fluctua-
tions’, Physics of Fluids 9(7), 2023–2031.
Lusch, B., Kutz, J. N., and Brunton, S. L. (2018), ‘Deep learning for universal linear embeddings
of nonlinear dynamics’, Nature Communications 9(1), 4950.
Lyapunov, A. (1892), The General Problem of the Stability of Motion, The Kharkov Mathemat-
ical Society, Kharkov, 251pp.
Ma, Z., Ahuja, S., and Rowley, C. W. (2011), ‘Reduced order models for control of fluids using
the eigensystem realization algorithm’, Theoretical and Computational Fluid Dynamics
25(1), 233–247.
Mackie, J. (1974), The Cement of the Universe: A Study of Causation, Oxford University Press,
New York.
MacQueen, J. (1967), Some methods for classification and analysis of multivariate observations,
in ‘Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probabil-
ity’, Vol. 1, Oakland, CA, pp. 281–297.
Magill, J., Bachmann, M., and Rixon, G. (2003), ‘Dynamic stall control using a model-based
observer’, Journal of Aircraft 40(2), 355–362.
Magnussen, B. F. (1981), On the structure of turbulence and a generalized eddy dissipation
concept for chemical reaction in turbulent flow, in ‘19th AIAA Aerospace Sciences Meeting’, January 12–15, 1981, St. Louis, MO.
Magri, L. (2019), ‘Adjoint methods as design tools in thermoacoustics’, Applied Mechanics
Reviews 71(2), 020801.
Mahoney, M. W. (2011), ‘Randomized algorithms for matrices and data’, Foundations and
Trends in Machine Learning 3, 123–224.
Majda, A. J. and Harlim, J. (2012), ‘Physics constrained nonlinear regression models for time
series’, Nonlinearity 26(1), 201.
Malik, M. R., Isaac, B. J., Coussement, A., Smith, P. J., and Parente, A. (2018), ‘Principal
component analysis coupled with nonlinear regression for chemistry reduction’, Combustion
and Flame 187, 30–41.
Malik, M. R., Obando Vega, P., Coussement, A., and Parente, A. (2020), ‘Combustion modeling
using Principal Component Analysis: A posteriori validation on Sandia flames D, E and F’,
Proceedings of the Combustion Institute 38(2), 2635–2643.
Mallat, S. (2009), A Wavelet Tour of Signal Processing, Elsevier, Oxford.
Mallat, S. G. (1989), ‘Multiresolution approximations and wavelet orthonormal bases of
L²(R)’, Transactions of the American Mathematical Society 315(1), 69–87.
Mallor, F., Raiola, M., Vila, C. S., Örlü, R., Discetti, S., and Ianiro, A. (2019), ‘Modal
decomposition of flow fields and convective heat transfer maps: An application to wall-
proximity square ribs’, Experimental Thermal and Fluid Science 102, 517–527.
Maltrud, M. E. and Vallis, G. K. (1991), ‘Energy spectra and coherent structures in forced two-
dimensional and beta-plane turbulence’, Journal of Fluid Mechanics 228, 321–342.
Mangan, N. M., Brunton, S. L., Proctor, J. L., and Kutz, J. N. (2016), ‘Inferring biological
networks by sparse identification of nonlinear dynamics’, IEEE Transactions on Molecular,
Biological, and Multi-Scale Communications 2(1), 52–63.
Mangan, N. M., Kutz, J. N., Brunton, S. L., and Proctor, J. L. (2017), ‘Model selection for
dynamical systems via sparse regression and information criteria’, Proceedings of the Royal
Society A 473(2204), 1–16.
Manohar, K., Brunton, B. W., Kutz, J. N., and Brunton, S. L. (2018), ‘Data-driven sparse sensor
placement: Demonstrating the benefits of exploiting known patterns’, IEEE Control Systems
Magazine 38(3), 63–86.
Marčenko, V. A. and Pastur, L. A. (1967), ‘Distribution of eigenvalues for some sets of random
matrices’, Mathematics of the USSR-Sbornik 1(4), 457.
Mardt, A., Pasquali, L., Wu, H., and Noé, F. (2018), ‘VAMPnets: Deep learning of molecular
kinetics’, Nature Communications 9, 5.
Marsden, J. E. and Ratiu, T. S. (1999), Introduction to Mechanics and Symmetry, 2nd ed.,
Springer-Verlag, New York.
Maulik, R., San, O., Rasheed, A., and Vedula, P. (2019), ‘Subgrid modelling for two-
dimensional turbulence using neural networks’, Journal of Fluid Mechanics 858, 122–144.
Maurel, S., Borée, J., and Lumley, J. (2001), ‘Extended proper orthogonal decomposition:
Application to jet/vortex interaction’, Flow, Turbulence and Combustion 67(2), 125–136.
McKeon, B. and Sharma, A. (2010), ‘A critical-layer framework for turbulent pipe flow’,
Journal of Fluid Mechanics 658, 336–382.
McWilliams, J. C. (1980), ‘An application of equivalent modons to atmospheric blocking’,
Dynamics of Atmospheres and Oceans 5, 43–66.
McWilliams, J. C. (1984), ‘The emergence of isolated coherent vortices in turbulent flow’,
Journal of Fluid Mechanics 146, 21–43.
McWilliams, J. C. (1990a), ‘A demonstration of the suppression of turbulent cascades by
coherent vortices in two-dimensional turbulence’, Physics of Fluids A 2, 547–552.
McWilliams, J. C. (1990b), ‘The vortices of two-dimensional turbulence’, Journal of Fluid
Mechanics 219, 361–385.
Meena, M. G., Nair, A. G., and Taira, K. (2018), ‘Network community-based model reduction
for vortical flows’, Physical Review E 97(6), 063103.
Mendez, M. A., Hess, D., Watz, B. B., and Buchlin, J.-M. (2020), ‘Multiscale proper orthogonal
decomposition (mPOD) of TR-PIV data: A case study on stationary and transient cylinder
wake flows’, Measurement Science and Technology 80, 981–1002.
Mendez, M. A. and Buchlin, J.-M. (2016), ‘Notes on 2D pulsatile Poiseuille flows: An
introduction to eigenfunctions and complex variables using MATLAB’, Technical Report
TN 215, von Karman Institute for Fluid Dynamics, Sint-Genesius-Rode, Belgium.
Mendez, M. A., Scelzo, M., and Buchlin, J.-M. (2018), ‘Multiscale modal analysis of an
oscillating impinging gas jet’, Experimental Thermal and Fluid Science 91, 256–276.
Mendez, M., Balabane, M., and Buchlin, J.-M. (2019), ‘Multi-scale proper orthogonal decom-
position of complex fluid flows’, Journal of Fluid Mechanics 870, 988–1036.
Mendez, M., Gosset, A., and Buchlin, J.-M. (2019), ‘Experimental analysis of the stability of the
jet wiping process, part II: Multiscale modal analysis of the gas jet-liquid film interaction’,
Experimental Thermal and Fluid Science 106, 48–67.
Mendez, M., Raiola, M., Masullo, A., Discetti, S., Ianiro, A., Theunissen, R., and Buchlin, J.-
M. (2017), ‘POD-based background removal for particle image velocimetry’, Experimental
Thermal and Fluid Science 80, 181–192.
Mendible, A., Brunton, S. L., Aravkin, A. Y., Lowrie, W., and Kutz, J. N. (2020), ‘Dimension-
ality reduction and reduced-order modeling for traveling wave physics’, Theoretical and
Computational Fluid Dynamics 34(4), 385–400.
Meneveau, C. (1991), ‘Analysis of turbulence in the orthonormal wavelet representation’,
Journal of Fluid Mechanics 232, 469–520.
Meneveau, C. and Katz, J. (2000), ‘Scale-invariance and turbulence models for large-eddy
simulation’, Annual Review of Fluid Mechanics 32(1), 1–32.
Meunier, P., Le Dizès, S., and Leweke, T. (2005), ‘Physics of vortex merging’, C. R. Physique
6, 431–450.
Meyers, S. D., Kelly, B. G., and O’Brien, J. J. (1993), ‘An introduction to wavelet analysis in
oceanography and meteorology: With application to the dispersion of Yanai waves’, Monthly
Weather Review 121(10), 2858–2866.
Mezić, I. (2005), ‘Spectral properties of dynamical systems, model reduction and decomposi-
tions’, Nonlinear Dynamics 41(1–3), 309–325.
Mezić, I. (2013), ‘Analysis of fluid flows via spectral properties of the Koopman operator’,
Annual Review of Fluid Mechanics 45, 357–378.
Mezić, I. and Banaszuk, A. (2004), ‘Comparison of systems with complex behavior’, Physica
D: Nonlinear Phenomena 197(1), 101–133.
Mifsud, M., Vendl, A., Hansen, L.-U., and Görtz, S. (2019), ‘Fusing wind-tunnel measurements
and CFD data using constrained gappy proper orthogonal decomposition’, Aerospace
Science and Technology 86, 312–326.
Mifsud, M., Zimmermann, R., and Görtz, S. (2014), ‘Speeding-up the computation of high-
lift aerodynamics using a residual-based reduced-order model’, CEAS Aeronautical Journal
6(1), 3–16.
Milano, M. and Koumoutsakos, P. (2002), ‘Neural network modeling for near wall turbulent
flow’, Journal of Computational Physics 182(1), 1–26.
Min, C. and Choi, H. (1999), ‘Suboptimal feedback control of vortex shedding at low Reynolds
numbers’, Journal of Fluid Mechanics 401, 123–156.
Minelli, G., Dong, T., Noack, B. R., and Krajnović, S. (2020), ‘Upstream actuation for
bluff-body wake control driven by a genetically inspired optimization’, Journal of Fluid
Mechanics 893, A1.
Mizuno, Y. and Jiménez, J. (2013), ‘Wall turbulence without walls’, Journal of Fluid Mechanics
723, 429–455.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., and
Kavukcuoglu, K. (2016), Asynchronous methods for deep reinforcement learning, in M. F.
Balcan and K. Q. Weinberger, eds., ‘Proceedings of the 33rd International Conference on
Machine Learning’, Vol. 48 of Proceedings of Machine Learning Research, PMLR, New
York, pp. 1928–1937.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves,
A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A.,
Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., and Hassabis, D. (2015),
‘Human-level control through deep reinforcement learning’, Nature 518(7540), 529–533.
Moarref, R., Jovanovic, M., Tropp, J., Sharma, A., and McKeon, B. (2014), ‘A low-order
decomposition of turbulent channel flow via resolvent analysis and convex optimization’,
Physics of Fluids 26, 051701.
Mohammed, R., Tanoff, M., Smooke, M., Schaffer, A., and Long, M. (1998), ‘Computational
and experimental study of a forced, time-varying, axisymmetric, laminar diffusion flame’,
Symposium (International) on Combustion 27(1), 693–702.
Moisy, F. and Jiménez, J. (2004), ‘Geometry and clustering of intense structures in isotropic
turbulence’, Journal of Fluid Mechanics 513, 111–133.
Mojgani, R. and Balajewicz, M. (2017), ‘Lagrangian basis method for dimensionality reduction
of convection dominated nonlinear flows’, arXiv preprint arXiv:1701.04343.
Noack, B. R. (2019), Closed-loop turbulence control – From human to machine learning (and
retour), in Y. Zhou, M. Kimura, G. Peng, A. D. Lucey, and L. Huang, eds., ‘Fluid-Structure-
Sound Interactions and Control. Proceedings of the 4th Symposium on Fluid-Structure-
Sound Interactions and Control’, Springer, Berlin, pp. 23–32.
Noack, B. R., Afanasiev, K., Morzyński, M., Tadmor, G., and Thiele, F. (2003), ‘A hierarchy of
low-dimensional models for the transient and post-transient cylinder wake’, Journal of Fluid
Mechanics 497, 335–363.
Noack, B. R. and Eckelmann, H. (1994), ‘A global stability analysis of the steady and periodic
cylinder wake’, Journal of Fluid Mechanics 270, 297–330.
Noack, B. R., Mezić, I., Tadmor, G., and Banaszuk, A. (2004), ‘Optimal mixing in recirculation
zones’, Physics of Fluids 16(4), 867–888.
Noack, B. R., Morzyński, M., and Tadmor, G. (2011), Reduced-Order Modelling for Flow
Control, Vol. 528 of International Centre for Mechanical Sciences: Courses and Lectures,
Springer Science & Business Media, Dordrecht.
Noack, B. R., Papas, P., and Monkewitz, P. A. (2005), ‘The need for a pressure-term
representation in empirical Galerkin models of incompressible shear flows’, Journal of Fluid
Mechanics 523, 339–365.
Noack, B. R., Pelivan, I., Tadmor, G., Morzyński, M., and Comte, P. (2004), Robust low-
dimensional Galerkin models of natural and actuated flows, in ‘Fourth Aeroacoustics
Workshop’, RWTH, Aachen, February 26–27, 2004, pp. 0001–0012.
Noack, B. R., Schlegel, M., Ahlborn, B., Mutschke, G., Morzyński, M., Comte, P., and
Tadmor, G. (2008), ‘A finite-time thermodynamics of unsteady fluid flows’, Journal of Non-
Equilibrium Thermodynamics 33(2), 103–148.
Noack, B. R., Stankiewicz, W., Morzyński, M., and Schmid, P. J. (2016), ‘Recursive dynamic
mode decomposition of a transient cylinder wake’, Journal of Fluid Mechanics 809, 843–
872.
Nocedal, J. and Wright, S. (2006), Numerical Optimization, Springer Verlag, New York.
Noé, F. and Nüske, F. (2013), ‘A variational approach to modeling slow processes in stochastic dynamical systems’, Multiscale Modeling & Simulation 11(2), 635–655.
Noé, F., Olsson, S., Köhler, J., and Wu, H. (2019), ‘Boltzmann generators: Sampling equilib-
rium states of many-body systems with deep learning’, Science 365(6457), eaaw1147.
Noether, E. (1918), ‘Invariante Variationsprobleme’, Nachrichten von der Gesellschaft der
Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse 1918, 235–257. English
Reprint: physics/0503066, http://dx.doi.org/10.1080/00411457108231446, p. 57.
Novati, G., Mahadevan, L., and Koumoutsakos, P. (2019), ‘Controlled gliding and perching
through deep-reinforcement-learning’, Physical Review Fluids 4(9), 093902.
Novati, G., Verma, S., Alexeev, D., Rossinelli, D., Van Rees, W. M., and Koumoutsakos, P.
(2017), ‘Synchronisation through learning for two self-propelled swimmers’, Bioinspiration
& Biomimetics 12(3), aa6311.
Nüske, F., Schneider, R., Vitalini, F., and Noé, F. (2016), ‘Variational tensor approach for
approximating the rare-event kinetics of macromolecular systems’, The Journal of Chemical
Physics 144(5), 054105.
Obukhov, A. M. (1941), ‘On the distribution of energy in the spectrum of turbulent flow’,
Doklady Akademii Nauk SSSR 32, 22–24.
Ogata, K. (2009), Modern Control Engineering, 5th ed., Pearson, Upper Saddle River, NJ.
Onsager, L. (1949), ‘Statistical hydrodynamics’, Nuovo Cimento Suppl. 6, 279–286.
Oppenheim, A. V. (2015), Signals, Systems and Inference, Pearson Education, Harlow.
Peerenboom, K., Parente, A., Kozák, T., Bogaerts, A., and Degrez, G. (2015), ‘Dimension
reduction of non-equilibrium plasma kinetic models using principal component analysis’,
Plasma Sources Science and Technology 24(2), 025004.
Penland, C. (1996), ‘A stochastic model of IndoPacific sea surface temperature anomalies’,
Physica D: Nonlinear Phenomena 98(2–4), 534–558.
Penland, C. and Magorian, T. (1993), ‘Prediction of Niño 3 sea surface temperatures using
linear inverse modeling’, Journal of Climate 6(6), 1067–1076.
Pepiot-Desjardins, P. and Pitsch, H. (2008), ‘An efficient error-propagation-based reduction
method for large chemical kinetic mechanisms’, Combustion and Flame 154(1–2), 67–81.
Perko, L. (2013), Differential Equations and Dynamical Systems, Vol. 7 of Texts in Applied
Mathematics, Springer Science & Business Media, New York.
Perlman, E., Burns, R., Li, Y., and Meneveau, C. (2007), Data exploration of turbulence
simulations using a database cluster, in ‘Proceedings of the SC07’, ACM, New York,
pp. 23.1–23.11.
Perrin, R., Braza, M., Cid, E., Cazin, S., Barthet, A., Sevrain, A., Mockett, C., and Thiele,
F. (2007), ‘Obtaining phase averaged turbulence properties in the near wake of a circular
cylinder at high Reynolds number using POD’, Experiments in Fluids 43(2–3), 341–355.
Peters, N. (1984), ‘Laminar diffusion flamelet models in non-premixed turbulent combustion’,
Progress in Energy and Combustion Science 10(3), 319–339.
Phan, M. Q., Juang, J.-N., and Hyland, D. C. (1995), ‘On neural networks in identification and
control of dynamic systems’, in A. Guran and D. J. Inman, eds., ‘Wave Motion,
Intelligent Structures and Nonlinear Mechanics’, World Scientific, Singapore, pp. 194–225.
Picard, C. and Delville, J. (2000), ‘Pressure velocity coupling in a subsonic round jet’,
International Journal of Heat and Fluid Flow 21(3), 359–364.
Poincaré, H. (1920), Science et méthode, Flammarion, Paris. English translation in Dover books,
1952.
Pope, S. B. (2000), Turbulent Flows, Cambridge University Press, Cambridge.
Pope, S. B. (2013), ‘Small scales, many species and the manifold challenges of turbulent
combustion’, Proceedings of the Combustion Institute 34(1), 1–31.
Proctor, J. L., Brunton, S. L., and Kutz, J. N. (2016), ‘Dynamic mode decomposition with
control’, SIAM Journal on Applied Dynamical Systems 15(1), 142–161.
Protas, B. (2004), ‘Linear feedback stabilization of laminar vortex shedding based on a point
vortex model’, Physics of Fluids 16(12), 4473–4488.
Quarteroni, A. and Rozza, G. (2013), Reduced Order Methods for Modeling and Computational
Reduction, Vol. 9 of MS&A – Modeling, Simulation & Applications, Springer, New York.
Rabault, J. (2019), ‘Deep reinforcement learning applied to fluid mechanics: Materials
from the 2019 Flow/Interface School on Machine Learning and Data Driven Methods’,
DOI:10.13140/RG.2.2.11533.90086.
Rabault, J., Belus, V., Viquerat, J., Che, Z., Hachem, E., Reglade, U., and Jensen, A. (2019),
Exploiting locality and physical invariants to design effective deep reinforcement learning
control of the unstable falling liquid film, in ‘The 1st Graduate Forum of CSAA and the
7th International Academic Conference for Graduates’, NUAA, November 21–22, 2019,
Nanjing, China.
Rabault, J., Kuchta, M., Jensen, A., Reglade, U., and Cerardi, N. (2019), ‘Artificial neural
networks trained through deep reinforcement learning discover control strategies for active
flow control’, Journal of Fluid Mechanics 865, 281–302.
Rabault, J. and Kuhnle, A. (2019), ‘Accelerating deep reinforcement learning strategies of flow
control through a multi-environment approach’, Physics of Fluids 31(9), 094105.
Rabault, J. and Kuhnle, A. (2020), ‘Deep reinforcement learning applied to active flow control’,
ResearchGate Preprint https://doi.org/10.13140/RG 2.
Rabault, J., Ren, F., Zhang, W., Tang, H., and Xu, H. (2020), ‘Deep reinforcement learning in
fluid mechanics: A promising method for both active flow control and shape optimization’,
Journal of Hydrodynamics 32(2), 234–246.
Raibaudo, C., Zhong, P., Noack, B. R., and Martinuzzi, R. J. (2020), ‘Machine learning
strategies applied to the control of a fluidic pinball’, Physics of Fluids 32(1), 015108.
Raiola, M., Discetti, S., and Ianiro, A. (2015), ‘On PIV random error minimization with optimal
POD-based low-order reconstruction’, Experiments in Fluids 56(4), 75.
Raiola, M., Ianiro, A., and Discetti, S. (2016), ‘Wake of tandem cylinders near a wall’,
Experimental Thermal and Fluid Science 78, 354–369.
Raissi, M. and Karniadakis, G. E. (2018), ‘Hidden physics models: Machine learning of
nonlinear partial differential equations’, Journal of Computational Physics 357, 125–141.
Raissi, M., Perdikaris, P., and Karniadakis, G. (2019), ‘Physics-informed neural networks:
A deep learning framework for solving forward and inverse problems involving nonlinear
partial differential equations’, Journal of Computational Physics 378, 686–707.
Raissi, M., Yazdani, A., and Karniadakis, G. E. (2020), ‘Hidden fluid mechanics: Learning
velocity and pressure fields from flow visualizations’, Science 367(6481), 1026–1030.
Ranzi, E., Frassoldati, A., Stagni, A., Pelucchi, M., Cuoci, A., and Faravelli, T. (2014), ‘Reduced
kinetic schemes of complex reaction systems: fossil and biomass-derived transportation
fuels’, International Journal of Chemical Kinetics 46(9), 512–542.
Rasmussen, C. E. and Williams, C. K. I. (2006), Gaussian Processes for Machine Learning, MIT Press, Cambridge, MA.
Rayleigh, L. (1887), ‘On the stability of certain fluid motions’, The Proceedings of the London
Mathematical Society 19, 67–74.
Recht, B. (2019), ‘A tour of reinforcement learning: The view from continuous control’, Annual
Review of Control, Robotics, and Autonomous Systems 2, 253–279.
Reddy, G., Celani, A., Sejnowski, T. J., and Vergassola, M. (2016), ‘Learning to soar in turbulent
environments’, Proceedings of the National Academy of Sciences of the United States of
America 113(33), E4877–E4884.
Reddy, G., Wong-Ng, J., Celani, A., Sejnowski, T. J., and Vergassola, M. (2018), ‘Glider soaring
via reinforcement learning in the field’, Nature 562(7726), 236–239.
Reddy, S. and Henningson, D. (1993), ‘Energy growth in viscous channel flows’, Journal of
Fluid Mechanics 252, 209–238.
Reimer, L., Heinrich, R., and Ritter, M. (2019), Towards higher-precision maneuver and gust
loads computations of aircraft: Status of related features in the CFD-based multidisciplinary
simulation environment FlowSimulator, in ‘Notes on Numerical Fluid Mechanics and
Multidisciplinary Design’, Springer, Cham, pp. 597–607.
Reiss, J., Schulze, P., Sesterhenn, J., and Mehrmann, V. (2018), ‘The shifted proper orthogonal
decomposition: A mode decomposition for multiple transport phenomena’, SIAM Journal
on Scientific Computing 40(3), A1322–A1344.
Rempfer, D. (1995), Empirische Eigenfunktionen und Galerkin-Projektionen zur Beschreibung des laminar-turbulenten Grenzschichtumschlags (transl.: Empirical eigenfunctions and Galerkin projections for the description of the laminar-turbulent boundary-layer transition), Habilitation thesis, Universität Stuttgart, Germany.
Schmid, P. and Henningson, D. (2001), Stability and Transition in Shear Flows, Springer Verlag,
New York.
Schmid, P. J. (2010), ‘Dynamic mode decomposition of numerical and experimental data’,
Journal of Fluid Mechanics 656, 5–28.
Schmid, P., Li, L., Juniper, M., and Pust, O. (2011), ‘Applications of the dynamic mode
decomposition’, Theoretical and Computational Fluid Dynamics 25(1–4), 249–259.
Schmid, P., Violato, D., and Scarano, F. (2012), ‘Decomposition of time-resolved tomographic
PIV’, Experiments in Fluids 52(6), 1567–1579.
Schmidt, M. and Lipson, H. (2009), ‘Distilling free-form natural laws from experimental data’,
Science 324(5923), 81–85.
Schmidt, O. T. and Towne, A. (2019), ‘An efficient streaming algorithm for spectral proper
orthogonal decomposition’, Computer Physics Communications 237, 98–109.
Schneider, K. and Vasilyev, O. V. (2010), ‘Wavelet methods in computational fluid dynamics’,
Annual Review of Fluid Mechanics 42, 473–503.
Schölkopf, B. and Smola, A. J. (2002), Learning With Kernels: Support Vector Machines,
Regularization, Optimization, and Beyond, MIT Press, Cambridge.
Schoppa, W. and Hussain, F. (2002), ‘Coherent structure generation in near-wall turbulence’,
Journal of Fluid Mechanics 453, 57–108.
Schulman, J., Levine, S., Abbeel, P., Jordan, M., and Moritz, P. (2015), Trust region policy
optimization, in F. Bach and D. Blei, eds., ‘Proceedings of the 32nd International Conference
on Machine Learning’, Vol. 37 of Proceedings of Machine Learning Research, PMLR, Lille,
France, pp. 1889–1897.
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017), ‘Proximal policy
optimization algorithms’, arXiv preprint arXiv:1707.06347.
Schulze, J., Schmid, P., and Sesterhenn, J. (2009), ‘Exponential time integration using Krylov subspaces’, International Journal for Numerical Methods in Fluids 60(6), 591–609.
Schumm, M., Berger, E., and Monkewitz, P. A. (1994), ‘Self-excited oscillations in the wake of
two-dimensional bluff bodies and their control’, Journal of Fluid Mechanics 271, 17–53.
Schwer, D., Lu, P., and Green Jr., W. (2003), ‘An adaptive chemistry approach to modeling
complex kinetics in reacting flows’, Combustion and Flame 133(4), 451–465.
Seber, G. A. (2009), Multivariate Observations, John Wiley & Sons, Hoboken, NJ.
Semaan, R., Fernex, D., Weiner, A., and Noack, B. R. (2020), xROM – A Toolkit for Reduced-
Order Modeling of Fluid Flows, Vol. 1 of Machine Learning Tools for Fluid Mechanics,
Technische Universität Braunschweig, Braunschweig.
Semaan, R., Kumar, P., Burnazzi, M., Tissot, G., Cordier, L., and Noack, B. R. (2015), ‘Reduced-
order modelling of the flow around a high-lift configuration with unsteady Coanda blowing’,
Journal of Fluid Mechanics 800, 72–110.
Semeraro, O., Bagheri, S., Brandt, L., and Henningson, D. S. (2011), ‘Feedback control of three-
dimensional optimal disturbances using reduced-order models’, Journal of Fluid Mechanics
677, 63–102.
Semeraro, O., Bagheri, S., Brandt, L., and Henningson, D. S. (2013), ‘Transition delay in a
boundary layer flow using active control’, Journal of Fluid Mechanics 731, 288–311.
Semeraro, O., Lusseyran, F., Pastur, L., and Jordan, P. (2017), ‘Qualitative dynamics of wave
packets in turbulent jets’, Physical Review Fluids 2(9), 094605.
Semeraro, O. and Mathelin, L. (2016), ‘An open-source toolbox for data-driven linear system
identification’, Technical report, LIMSI-CNRS, Orsay, France.
Sharma, A. S., Mezić, I., and McKeon, B. J. (2016), ‘Correspondence between Koopman mode
decomposition, resolvent mode decomposition, and invariant solutions of the Navier-Stokes
equations’, Physical Review Fluids 1(3), 032402.
Sieber, M., Paschereit, C. O., and Oberleithner, K. (2016), ‘Spectral proper orthogonal
decomposition’, Journal of Fluid Mechanics 792, 798–828.
Siegel, S. G., Seidel, J., Fagley, C., Luchtenburg, D. M., Cohen, K., and McLaughlin, T. (2008),
‘Low dimensional modelling of a transient cylinder wake using double proper orthogonal
decomposition’, Journal of Fluid Mechanics 610, 1–42.
Sillero, J. A. (2014), High Reynolds numbers turbulent boundary layers, PhD thesis, Aeronautics, Universidad Politécnica de Madrid. https://tesis.biblioteca.upm.es/tesis/7538.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., and Hassabis, D. (2016), ‘Mastering the game of Go with deep neural networks and tree search’, Nature 529(7587), 484–489.
Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L.,
Kumaran, D., Graepel, T., Lillicrap, T., Simonyan, K., and Hassabis, D. (2018), ‘A general
reinforcement learning algorithm that masters Chess, Shogi, and Go through self-play’,
Science 362(6419), 1140–1144.
Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., and Riedmiller, M. (2014), Deter-
ministic policy gradient algorithms, in E. P. Xing and T. Jebara, eds., ‘Proceedings of the
31st International Conference on Machine Learning’, Vol. 32 of Proceedings of Machine
Learning Research, PMLR, Beijing, China, pp. 387–395.
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T.,
Baker, L., Lai, M., and Bolton, A. (2017), ‘Mastering the game of Go without human
knowledge’, Nature 550(7676), 354–359.
Singh, A. P., Medida, S., and Duraisamy, K. (2017), ‘Machine-learning-augmented predictive
modeling of turbulent separated flows over airfoils’, AIAA Journal 55, 2215–2227.
Sirovich, L. (1987), ‘Turbulence and the dynamics of coherent structures. I–Coherent structures.
II–Symmetries and transformations. III–Dynamics and scaling’, Quarterly of Applied
Mathematics 45, 561–571.
Sirovich, L. and Kirby, M. (1987), ‘A low-dimensional procedure for the characterization of
human faces’, Journal of the Optical Society of America A 4(3), 519–524.
Skene, C., Eggl, M., and Schmid, P. (2020), ‘A parallel-in-time approach for accelerating direct-
adjoint studies’, Journal of Computational Physics 429(2), 110033.
Skogestad, S. and Postlethwaite, I. (2005), Multivariable Feedback Control: Analysis and Design, 2nd ed., John Wiley & Sons, Hoboken, NJ.
Skogestad, S. and Postlethwaite, I. (2007), Multivariable Feedback Control: Analysis and Design, 2nd ed., Wiley, New York.
Smith, J. O. (2007a), Introduction to Digital Filters: With Audio Applications, W3K Publishing,
http://books.w3k.org/.
Smith, J. O. (2007b), Mathematics of the Discrete Fourier Transform (DFT): With Audio
Applications, W3K Publishing, http://books.w3k.org/.
Smith, S. W. (1997), The Scientist & Engineer’s Guide to Digital Signal Processing, California
Technical Publishing, San Diego, www.dspguide.com/.
Taira, K., Hemati, M. S., Brunton, S. L., Sun, Y., Duraisamy, K., Bagheri, S., Dawson, S., and
Yeh, C.-A. (2019), ‘Modal analysis of fluid flows: Applications and outlook’, arXiv preprint
arXiv:1903.05750.
Takeishi, N., Kawahara, Y., and Yairi, T. (2017), Learning Koopman invariant subspaces for
dynamic mode decomposition, in ‘Advances in Neural Information Processing Systems’,
December 4–9, 2017, Long Beach, CA, pp. 1130–1140.
Takens, F. (1981), Detecting strange attractors in turbulence, in ‘Dynamical Systems and Turbulence, Warwick 1980’, Vol. 898 of Lecture Notes in Mathematics, Springer, Heidelberg and New York, pp. 366–381.
Talamelli, A., Persiani, F., Fransson, J. H., Alfredsson, P. H., Johansson, A. V., Nagib, H. M.,
Rüedi, J.-D., Sreenivasan, K. R., and Monkewitz, P. A. (2009), ‘CICLoPE – a response to the need for high Reynolds number experiments’, Fluid Dynamics Research 41(2), 021407.
Tanahashi, M., Kang, S., Miyamoto, T., and Shiokawa, S. (2004), ‘Scaling law of fine scale
eddies in turbulent channel flows up to Reτ = 800’, International Journal of Heat and Fluid
Flow 25, 331–341.
Tang, H., Rabault, J., Kuhnle, A., Wang, Y., and Wang, T. (2020), ‘Robust active flow control
over a range of Reynolds numbers using an artificial neural network trained through deep
reinforcement learning’, Physics of Fluids 32(5), 053605.
Tedrake, R., Jackowski, Z., Cory, R., Roberts, J. W., and Hoburg, W. (2009), Learning to fly
like a bird, in ‘14th International Symposium on Robotics Research’, Lucerne, Switzerland.
Tenenbaum, J. B., de Silva, V., and Langford, J. C. (2000), ‘A global geometric framework for nonlinear dimensionality reduction’, Science 290(5500), 2319–2323.
Tennekes, H. and Lumley, J. L. (1972), A First Course in Turbulence, MIT Press, Cambridge,
MA.
Thaler, S., Paehler, L., and Adams, N. A. (2019), ‘Sparse identification of truncation errors’,
Journal of Computational Physics 397, 108851.
Theofilis, V. (2011), ‘Global linear instability’, Annual Review of Fluid Mechanics 43, 319–352.
Thiria, B., Goujon-Durand, S., and Wesfreid, J. E. (2006), ‘The wake of a cylinder performing
rotary oscillations’, Journal of Fluid Mechanics 560, 123–147.
Thormann, R. and Widhalm, M. (2013), ‘Linear-frequency-domain predictions of dynamic-
response data for viscous transonic flows’, AIAA Journal 51(11), 2540–2557.
Tibshirani, R. (1996), ‘Regression shrinkage and selection via the lasso’, Journal of the Royal
Statistical Society. Series B (Methodological) 58(1), 267–288.
Tinney, C. E., Ukeiley, L., and Glauser, M. N. (2008), ‘Low-dimensional characteristics of a
transonic jet. Part 2. Estimate and far-field prediction’, Journal of Fluid Mechanics 615, 53–
92.
Toedtli, S. S., Luhar, M., and McKeon, B. J. (2019), ‘Predicting the response of turbulent
channel flow to varying-phase opposition control: Resolvent analysis as a tool for flow
control design’, Physical Review Fluids 4(7), 073905.
Torrence, C. and Compo, G. P. (1998), ‘A practical guide to wavelet analysis’, Bulletin of the
American Meteorological Society 79(1), 61–78.
Towne, A., Schmidt, O. T., and Colonius, T. (2018), ‘Spectral proper orthogonal decomposition
and its relationship to dynamic mode decomposition and resolvent analysis’, Journal of
Fluid Mechanics 847, 821–867.
Townsend, A. A. (1976), The Structure of Turbulent Shear Flow, Cambridge University Press,
Cambridge.
Tran, G. and Ward, R. (2016), ‘Exact recovery of chaotic systems from highly corrupted data’,
arXiv preprint arXiv:1607.01067.
Trefethen, L. and Bau, D. (1997), Numerical Linear Algebra, Society for Industrial and Applied
Mathematics, Philadelphia.
Trefethen, L., Trefethen, A., Reddy, S., and Driscoll, T. (1993), ‘Hydrodynamic stability without
eigenvalues’, Science 261, 578–584.
Tropp, J. A. and Gilbert, A. C. (2007), ‘Signal recovery from random measurements via
orthogonal matching pursuit’, IEEE Transactions on Information Theory 53(12), 4655–
4666.
Tu, J. H. and Rowley, C. W. (2012), ‘An improved algorithm for balanced POD through an ana-
lytic treatment of impulse response tails’, Journal of Computational Physics 231(16), 5317–
5333.
Tu, J. H., Rowley, C. W., Kutz, J. N., and Shang, J. K. (2014), ‘Spectral analysis of fluid flows
using sub-Nyquist-rate PIV data’, Experiments in Fluids 55(9), 1–13.
Tu, J., Rowley, C., Luchtenburg, D., Brunton, S., and Kutz, J. (2014), ‘On dynamic mode
decomposition: theory and applications’, Journal of Computational Dynamics 1(2), 391–
421.
Turns, S. R. (1996), An Introduction to Combustion: Concepts and Applications, McGraw-Hill, New York.
Uruba, V. (2012), ‘Decomposition methods in turbulence research’, EPJ Web of Conferences
25, 01095.
Van den Berg, J. (2004), Wavelets in Physics, Cambridge University Press, Cambridge.
Van Loan, C. F. and Golub, G. H. (1983), Matrix Computations, Johns Hopkins University
Press, Baltimore.
Vassberg, J., Dehaan, M., Rivers, M., and Wahls, R. (2008), Development of a common
research model for applied CFD validation studies, in ‘26th AIAA Applied Aerodynamics
Conference’, August 18–21, 2008, Honolulu, HI, American Institute of Aeronautics and
Astronautics, Reston, VA.
Venturi, D. (2006), ‘On proper orthogonal decomposition of randomly perturbed fields with
applications to flow past a cylinder and natural convection over a horizontal plate’, Journal
of Fluid Mechanics 559, 215–254.
Verma, S., Novati, G., and Koumoutsakos, P. (2018), ‘Efficient collective swimming by
harnessing vortices through deep reinforcement learning’, Proceedings of the National
Academy of Sciences 115(23), 5849–5854.
Vincent, A. and Meneguzzi, M. (1991), ‘The spatial structure and statistical properties of
homogeneous turbulence’, Journal of Fluid Mechanics 225, 1–20.
Viquerat, J., Rabault, J., Kuhnle, A., Ghraieb, H., and Hachem, E. (2019), ‘Direct shape
optimization through deep reinforcement learning’, arXiv preprint arXiv:1908.09885.
Vlachas, P. R., Byeon, W., Wan, Z. Y., Sapsis, T. P., and Koumoutsakos, P. (2018), ‘Data-driven
forecasting of high-dimensional chaotic systems with long short-term memory networks’,
Proceedings of the Royal Society A 474(2213), 20170844.
Cherkassky, V. and Mulier, F. M. (2008), Learning from Data: Concepts, Theory, and Methods, John Wiley & Sons, New York.
Voit, E. O. (2019), ‘Perspective: Dimensions of the scientific method’, PLoS Computational Biology 15(9), e1007279.
Voltaire, F. (1764), Dictionnaire Philosophique: Atomes, Oxford University Press, 1994.
von Helmholtz, H. (1858), ‘Über Integrale der hydrodynamischen Gleichungen, welche den Wirbelbewegungen entsprechen’ (transl.: On integrals of the hydrodynamic equations that correspond to vortex motions), Journal für die reine und angewandte Mathematik 55, 25–55.
von Kármán, T. (1911), ‘Über den Mechanismus des Widerstandes, den ein bewegter Körper in einer Flüssigkeit erfährt’ (transl.: On the mechanism of the resistance that a moving body experiences in a fluid), Nachrichten der Kaiserlichen Gesellschaft der Wissenschaften zu Göttingen, pp. 509–517.
von Storch, H. and Xu, J. (1990), ‘Principal oscillation pattern analysis of the 30- to 60-day
oscillation in the tropical troposphere’, Climate Dynamics 4(3), 175–190.
Voropayev, S. I., Afanasyev, Y. D., and Filippov, I. A. (1991), ‘Horizontal jets and vortex dipoles
in a stratified fluid’, Journal of Fluid Mechanics 227, 543–566.
Voss, R., Tichy, L., and Thormann, R. (2011), A ROM-based flutter prediction process and its validation with a new reference model, in ‘International Forum on Aeroelasticity and Structural Dynamics (IFASD-2011)’, June 26–30, 2011, Paris, France.
Vukasinovic, B., Rusak, Z., and Glezer, A. (2010), ‘Dissipative small-scale actuation of a turbulent shear layer’, Journal of Fluid Mechanics 656, 51–81.
Wahde, M. (2008), Biologically Inspired Optimization Methods: An Introduction, WIT Press,
Southampton.
Wan, Z. Y., Vlachas, P., Koumoutsakos, P., and Sapsis, T. (2018), ‘Data-assisted reduced-order
modeling of extreme events in complex dynamical systems’, PLoS ONE 13(5), e0197704.
Wang, J.-X., Wu, J.-L., and Xiao, H. (2017), ‘Physics-informed machine learning approach for
reconstructing Reynolds stress modeling discrepancies based on DNS data’, Physical Review
Fluids 2(3), 034603.
Wang, M. and Hemati, M. S. (2017), ‘Detecting exotic wakes with hydrodynamic sensors’,
arXiv preprint arXiv:1711.10576.
Wang, Q., Moin, P., and Iaccarino, G. (2009), ‘Minimal repetition dynamic checkpointing algo-
rithm for unsteady adjoint calculation’, SIAM Journal on Scientific Computing 31(4), 2549–
2567.
Wang, R. (2009), Introduction to Orthogonal Transforms, Cambridge University Press, New
York.
Wang, W. X., Yang, R., Lai, Y. C., Kovanis, V., and Grebogi, C. (2011), ‘Predicting catas-
trophes in nonlinear dynamical systems by compressive sensing’, Physical Review Letters
106, 154101-1–154101-4.
Wang, Y., Yao, H., and Zhao, S. (2016), ‘Auto-encoder based dimensionality reduction’, Neurocomputing 184, 232–242.
Wang, Z., Akhtar, I., Borggaard, J., and Iliescu, T. (2012), ‘Proper orthogonal decomposition
closure models for turbulent flows: a numerical comparison’, Computer Methods in Applied
Mechanics and Engineering 237, 10–26.
Wang, Z., Bapst, V., Heess, N., Mnih, V., Munos, R., Kavukcuoglu, K., and de Freitas, N.
(2017), Sample efficient actor-critic with experience replay, in ‘5th International Conference
on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference
Track Proceedings’, OpenReview.net.
Wang, Z., Schaul, T., Hessel, M., Van Hasselt, H., Lanctot, M., and De Freitas, N. (2016),
Dueling network architectures for deep reinforcement learning, in ‘Proceedings of the 33rd
International Conference on International Conference on Machine Learning – Volume 48’,
ICML 16, JMLR.org, pp. 1995–2003.
Watkins, C. J. C. H. (1989), Learning from Delayed Rewards, PhD thesis, King’s College,
Cambridge, UK.
Wehmeyer, C. and Noé, F. (2018), ‘Time-lagged autoencoders: Deep learning of slow collective
variables for molecular kinetics’, The Journal of Chemical Physics 148(241703), 1–9.
Welch, P. (1967), ‘The use of fast Fourier transform for the estimation of power spectra:
A method based on time averaging over short, modified periodograms’, IEEE Transactions
on Audio and Electroacoustics 15(2), 70–73.
Weller, J., Camarri, S., and Iollo, A. (2009), ‘Feedback control by low-order modelling of the
laminar flow past a bluff body’, Journal of Fluid Mechanics 634, 405.
White, A. P., Zhu, G., and Choi, J. (2013), Linear Parameter-Varying Control for Engineering
Applications, Springer, London.
Whittle, P. (1951), Hypothesis Testing in Time Series Analysis, Almqvist & Wiksells, Uppsala.
Wiener, N. (1948), Cybernetics: Control and Communication in the Animal and the Machine,
Wiley, New York.
Wigner, E. P. (1960), ‘The unreasonable effectiveness of mathematics in the natural sciences’,
Communications on Pure and Applied Mathematics 13, 1–14.
Willcox, K. and Peraire, J. (2002), ‘Balanced model reduction via the proper orthogonal
decomposition’, AIAA Journal 40(11), 2323–2330.
Willert, C. E. and Gharib, M. (1991), ‘Digital particle image velocimetry’, Experiments in
Fluids 10(4), 181–193.
Williams, M., Kevrekidis, I., and Rowley, C. (2015), ‘A data-driven approximation of the
Koopman operator: extending dynamic mode decomposition’, Journal of Nonlinear Science 25(6), 1307–1346.
Williams, M., Rowley, C., and Kevrekidis, I. (2015), ‘A kernel approach to data-driven
Koopman spectral analysis’, Journal of Computational Dynamics 2(2), 247–265.
Williams, R. J. (1992), ‘Simple statistical gradient-following algorithms for connectionist
reinforcement learning’, Machine Learning 8(3–4), 229–256.
Williamson, D. (1999), Discrete-Time Signal Processing, Springer, London.
Wright, J., Yang, A., Ganesh, A., Sastry, S., and Ma, Y. (2009), ‘Robust face recognition via
sparse representation’, IEEE Transactions on Pattern Analysis and Machine Intelligence
(PAMI) 31(2), 210–227.
Wünning, J. and Wünning, J. (1997), ‘Flameless oxidation to reduce thermal NO-formation’, Progress in Energy and Combustion Science 23(1), 81–94.
Xiao, H., Wu, J.-L., Wang, J.-X., Sun, R., and Roy, C. (2016), ‘Quantifying and reducing model-
form uncertainties in Reynolds-averaged Navier–Stokes simulations: A data-driven, physics-
informed Bayesian approach’, Journal of Computational Physics 324, 115–136.
Xu, H., Zhang, W., Deng, J., and Rabault, J. (2020), ‘Active flow control with rotating
cylinders by an artificial neural network trained by deep reinforcement learning’, Journal
of Hydrodynamics 32(2), 254–258.
Yan, X., Zhu, J., Kuang, M., and Wang, X. (2019), ‘Aerodynamic shape optimization using a
novel optimizer based on machine learning techniques’, Aerospace Science and Technology
86, 826–835.
Yang, J., Wright, J., Huang, T. S., and Ma, Y. (2010), ‘Image super-resolution via sparse
representation’, IEEE Transactions on Image Processing 19(11), 2861–2873.
Yang, Y., Pope, S. B., and Chen, J. H. (2013), ‘Empirical low-dimensional manifolds in
composition space’, Combustion and Flame 160(10), 1967–1980.
Yeh, C.-A. and Taira, K. (2019), ‘Resolvent-analysis-based design of airfoil separation control’,
Journal of Fluid Mechanics 867, 572–610.
Yetter, R. A., Dryer, F., and Rabitz, H. (1991), ‘A comprehensive reaction mechanism for carbon
monoxide/hydrogen/oxygen kinetics’, Combustion Science and Technology 79(1–3), 97–
128.
Yeung, E., Kundu, S., and Hodas, N. (2017), ‘Learning deep neural network representations for
Koopman operators of nonlinear dynamical systems’, arXiv preprint arXiv:1708.06850.
Zdravkovich, M. (1987), ‘The effects of interference between circular cylinders in cross flow’,
Journal of Fluids and Structures 1(2), 239–261.
Zdybał, K., Armstrong, E., Parente, A., and Sutherland, J. C. (2020), ‘PCAfold: Python software
to generate, analyze and improve PCA-derived low-dimensional manifolds’, SoftwareX
12, 100630.
Zdybał, K., Sutherland, J. C., Armstrong, E., and Parente, A. (2021), ‘State-space informed data
sampling on combustion manifolds’, Combustion and Flame (manuscript in preparation).
Zdybał, K., Sutherland, J. C., and Parente, A. (2022), ‘Manifold-informed state vector subset for reduced-order modeling’, Proceedings of the Combustion Institute.
Zdybał, K., Armstrong, E., Sutherland, J. C., and Parente, A. (2022), ‘Cost function for low-dimensional manifold topology assessment’, Scientific Reports 12(1), 1–19.
Zebib, A. (1987), ‘Stability of viscous flow past a circular cylinder’, Journal of Engineering
Mathematics 21, 155–165.
Zhang, H.-Q., Fey, U., Noack, B. R., König, M., and Eckelmann, H. (1995), ‘On the transition
of the cylinder wake’, Physics of Fluids 7(4), 779–795.
Zhang, W., Wang, B., Ye, Z., and Quan, J. (2012), ‘Efficient method for limit cycle flutter
analysis based on nonlinear aerodynamic reduced-order models’, AIAA Journal 50(5), 1019–
1028.
Zheng, P., Askham, T., Brunton, S. L., Kutz, J. N., and Aravkin, A. Y. (2019), ‘Sparse relaxed
regularized regression: SR3’, IEEE Access 7(1), 1404–1423.
Zhong, Y. D. and Leonard, N. (2020), ‘Unsupervised learning of Lagrangian dynamics from
images for prediction and control’, Advances in Neural Information Processing Systems 33.
Zhou, K. and Doyle, J. C. (1998), Essentials of Robust Control, Prentice Hall, Upper Saddle
River, NJ.
Zhou, Y., Fan, D., Zhang, B., Li, R., and Noack, B. R. (2020), ‘Artificial intelligence control of
a turbulent jet’, Journal of Fluid Mechanics 897, 1–46.
Zimmermann, R. and Görtz, S. (2010), ‘Non-linear reduced order models for steady aerody-
namics’, Procedia Computer Science 1(1), 165–174.
Zimmermann, R. and Görtz, S. (2012), ‘Improved extrapolation of steady turbulent aerody-
namics using a non-linear POD-based reduced order model’, The Aeronautical Journal
116(1184), 1079–1100.
Zimmermann, R., Vendl, A., and Görtz, S. (2014), ‘Reduced-order modeling of steady flows
subject to aerodynamic constraints’, AIAA Journal 52(2), 255–266.
Zou, H. and Hastie, T. (2005), ‘Regularization and variable selection via the elastic net’, Journal
of the Royal Statistical Society: Series B (Statistical Methodology) 67(2), 301–320.