DeLiCio Project

DeLiCio is an ongoing research project funded by the French National Research Agency (ANR) from 2019 to 2023. It conducts fundamental research in the areas of Machine Learning/AI and process control, with applications to drone (UAV) fleet control.

Recent years have witnessed the rise of Machine Learning (ML), which has provided disruptive performance gains in several fields. Apart from undeniable advances in methodology, these gains are often attributed to massive amounts of training data and computing power, which led to breakthroughs in speech recognition, computer vision and natural language processing. In this project, we propose to extend these advances to sequential decision making of multiple agents for planning and control. We particularly target learning realistic behavior over multiple horizons, requiring long-term planning as well as short-term fine-grained control.

In the context of decentralized control of agents such as UAVs and mobile robots, DeLiCio proposes fundamental contributions at the crossroads of AI/ML and Control Theory (CT), the second key methodology of this project. The two fields, while distinct, have a long history of interaction, and as both mature, their overlap becomes increasingly evident. CT provides differential, model-based approaches to stabilization and estimation problems. These model-driven approaches are powerful because they are based on a thorough understanding of the system and can leverage established physical relationships. However, nonlinear models usually need to be simplified, and they have difficulty accounting for noisy data and unmodeled uncertainties.

Machine Learning, on the other hand, aims at learning complex models from (often large amounts of) data and can provide data-driven models for a wide range of tasks. Markov Decision Processes (MDP) and Reinforcement Learning (RL) have traditionally provided a mathematically founded framework for control applications, where agents are required to learn policies from past interactions with an environment. In recent years, this methodology has been combined with deep neural networks, which play the role of high-capacity function approximators, and model the discrete or continuous policy function or a function of the cumulated reward of the agent, or both.

While learning has become the prevailing methodology in many applications, process control remains a field where control engineering cannot be replaced for many low-level control problems, mainly due to the lack of stability guarantees of learned controllers and their computational complexity in embedded settings.

DeLiCio proposes fundamental research at the crossroads of ML/AI and CT, with planned algorithmic contributions on the integration of models, prior knowledge and learning in control and the perception-action cycle:

  • data-driven learning and identification of physical models for control;
  • state representation learning for control;
  • stability and robustness priors for reinforcement learning;
  • stable decentralized (multi-agent) control using ML and CT.

The planned methodological advances of this project will be evaluated on a challenging application requiring planning as well as fine-grained control, namely the synchronization of a UAV swarm through learning. The objective is to learn strategies which allow a swarm to solve a high-level goal (navigation, visual search) while at the same time maintaining a formation.

Partners

INSA-Lyon/LIRIS

INSA-Lyon/CITI

  • Jilles Dibangoye [web]
  • Olivier Simonin [web]
  • Ievgen Redko (LHC Laboratory, Saint Etienne) [web]

Université Lyon 1/LAGEPP

  • Madiha Nadri [web]
  • Vincent Andrieu [web]
  • Daniele Astolfi [web]
  • Laurent Bako (Ampere Laboratory) [web]
  • Giacomo Casadei (Ampere Laboratory) [web]

ONERA

  • Sylvain Bertrand [web]
  • Julien Marzat [web]
  • Hélène Piet-Lahanier

Students

Funded by the project:

  • Quentin Possamaï (PhD student co-supervised by LIRIS and LAGEPP). Subject: stable drone control by machine learning and control theory

Associated students:

  • Edward Beeching (PhD student funded by an Inria CORDI-S scholarship). Subject: Large-scale automatic learning of autonomous agent behavior with structured deep reinforcement learning

Publications (and arXiv pre-prints)

2022

FilteredCoPhy — Unsupervised and Counterfactual Learning of Physical Dynamics

S. Janny, F. Baradel, N. Neverova, M. Nadri, G. Mori, and C. Wolf

ICLR, 2022 (oral, 1.6%)

Learning to estimate UAV created turbulence from scene structure observed by onboard cameras

Q. Possamaï, S. Janny, M. Nadri, L. Bako, and C. Wolf

Pre-print arXiv:2203.14726, 2022 [LINK]

Learning Reduced Nonlinear State-Space Models: an Output-Error Based Canonical Approach

S. Janny, Q. Possamaï, L. Bako, M. Nadri and C. Wolf

Pre-print HAL-03672151, 2022 [LINK]

2021

Deep KKL: Data-driven Output Prediction for Non-Linear Systems

S. Janny, V. Andrieu, M. Nadri, and C. Wolf

CDC, 2021

Reinforcement Learning Policies with local LQR guarantees for Nonlinear Discrete-Time Systems

S. Zoboli, V. Andrieu, D. Astolfi, G. Casadei, J. Dibangoye, and M. Nadri

CDC, 2021

Deep Learning-based Luenberger observer design for discrete-time nonlinear systems

J. Peralez and M. Nadri

CDC, 2021

2020

Counterfactual Learning of Physical Dynamics

F. Baradel, N. Neverova, J. Mille, G. Mori, and C. Wolf

ICLR, 2020

Optimally Solving Two-Agent Decentralized POMDPs Under One-Sided Information Sharing

Y. Xie, J. Dibangoye, and O. Buffet

ICML, 2020

Supervised Output Regulation via Iterative Learning Control for Rejecting Unknown Periodic Disturbances

O. K. Kocan, D. Astolfi, C. Poussot-Vassal, and A. Manecy

IFAC, 2020

Data-driven multi-model control for a waste heat recovery system

J. Peralez, F. Galuppo, P. Dufour, C. Wolf, and M. Nadri

CDC, 2020

See all our publications

Methodology

Recent years have witnessed the rise of Machine Learning (ML), which has provided disruptive performance gains in several fields. Apart from undeniable advances in methodology, these gains are often attributed to massive amounts of training data and computing power, which led to breakthroughs in speech recognition, computer vision and natural language processing. In this project, we propose to extend these advances to sequential decision making of multiple agents for planning and control. We particularly target learning realistic behavior over multiple horizons, requiring long-term planning as well as short-term fine-grained control. Obtaining gains from massive amounts of data is not as easy as in supervised learning, as learning to control requires learning from interactions rather than from static data collections.

Control Theory (CT) is the second key methodology of this project, together with ML. The two fields, while distinct, have a long history of interaction, and as both mature, their overlap becomes increasingly evident. CT provides differential, model-based approaches to stabilization and estimation problems. These model-driven approaches are powerful because they are based on a thorough understanding of the system and can leverage established physical relationships. However, nonlinear models usually need to be simplified, and they have difficulty accounting for noisy data and unmodeled uncertainties. At some level, they are limited by the complexity (strong nonlinearities, high dimensionality) of real systems. For linear differential equations, the theory is both complete and constructive, and the resulting designs can be guaranteed to meet the requirements insofar as the limits of the models and actuators allow it.
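As a minimal illustration of the constructive nature of linear control theory, consider a discrete-time linear-quadratic regulator (LQR): the stabilizing feedback gain can be computed directly from the model by iterating the Riccati recursion. The double-integrator model, cost weights and iteration count below are our own illustrative choices, not part of the project:

```python
import numpy as np

# Double integrator: x = [position, velocity], discretized with step dt.
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])
B = np.array([[0.0],
              [dt]])

# Quadratic cost weights (illustrative choice).
Q = np.eye(2)
R = np.array([[1.0]])

def dlqr_gain(A, B, Q, R, iters=500):
    """Solve the discrete-time Riccati equation by fixed-point iteration
    and return the state-feedback gain K for the control law u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

K = dlqr_gain(A, B, Q, R)

# The closed loop A - B K is stable: all eigenvalues lie inside the unit circle,
# a guarantee that follows from the model, not from data.
eigvals = np.linalg.eigvals(A - B @ K)
assert np.all(np.abs(eigvals) < 1.0)
```

This is exactly the kind of certificate (closed-loop stability by construction) that learned controllers typically lack.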

Machine Learning, on the other hand, aims at learning complex models from (often large amounts of) data and can provide data-driven models for a wide range of tasks. Markov Decision Processes (MDP) and Reinforcement Learning (RL) have traditionally provided a mathematically founded framework for control applications, where agents are required to learn policies from past interactions with an environment. In recent years, this methodology has been combined with deep neural networks, which play the role of high-capacity function approximators, and model the discrete or continuous policy function or a function of the cumulated reward of the agent, or both.
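The role of neural networks as function approximators for both the policy and the reward-related value function can be sketched with a small actor-critic style network. The architecture, sizes and random weights below are illustrative assumptions, not the project's models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 4-dimensional state, 2 discrete actions.
STATE_DIM, HIDDEN, N_ACTIONS = 4, 32, 2

# Shared hidden layer, then separate policy ("actor") and value ("critic") heads.
W1 = rng.normal(0.0, 0.1, (HIDDEN, STATE_DIM))
W_pi = rng.normal(0.0, 0.1, (N_ACTIONS, HIDDEN))
W_v = rng.normal(0.0, 0.1, (1, HIDDEN))

def actor_critic(state):
    """Return (action probabilities, value estimate) for one state."""
    h = np.tanh(W1 @ state)            # shared features
    logits = W_pi @ h                  # policy head
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()               # softmax over discrete actions
    value = (W_v @ h).item()           # scalar estimate of cumulated reward
    return probs, value

probs, value = actor_critic(rng.normal(size=STATE_DIM))
assert probs.shape == (N_ACTIONS,) and np.isclose(probs.sum(), 1.0)
```

In deep RL, the weights of both heads are trained from the agent's interactions with the environment rather than from static labeled data.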

This dichotomy between design and learning is typical of AI in its overlap with other domains and is widely discussed in, e.g., computer vision (handcrafted vs. learned feature representations), language (linguistics vs. learning), statistical prediction (handcrafted causal models vs. data-driven predictors), etc. While learning has become the prevailing methodology in many applications, process control remains a field where control engineering cannot be replaced for many low-level control problems, mainly due to (i) the lack of stability guarantees of learned controllers, and (ii) computational complexity.

DeLiCio proposes fundamental research at the crossroads of ML/AI and CT, with planned algorithmic contributions on the integration of models, prior knowledge and learning in control and the perception-action cycle: data-driven learning and identification of physical models for control; state representation learning for control; stability and robustness priors for reinforcement learning; and stable decentralized (multi-agent) control using ML and CT. The planned methodological advances of this project will be evaluated on a challenging application requiring planning as well as fine-grained control, namely the synchronization of a UAV swarm through learning. The objective is to learn strategies which allow a swarm to solve a high-level goal (navigation, visual search) while at the same time maintaining a formation.
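A classical baseline for the formation-keeping part of the swarm task is a decentralized consensus protocol, where each agent moves using only relative information about its neighbors. The setup below (four agents, a fully connected communication graph, square-formation offsets, step size) is a toy illustration of the control-theoretic side, not the project's learned controller:

```python
import numpy as np

rng = np.random.default_rng(1)

# Desired formation: offsets placing 4 agents on the corners of a unit square.
offsets = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
pos = rng.normal(0.0, 5.0, size=(4, 2))  # random initial 2D positions

eps = 0.2  # consensus step size
for _ in range(200):
    # Each agent steers its formation error toward the group average;
    # the update only needs relative quantities, so it is decentralized.
    err = pos - offsets
    pos = pos + eps * (err.mean(axis=0) - err)

# The relative positions converge to the prescribed formation.
rel = pos - pos[0]
assert np.allclose(rel, offsets - offsets[0], atol=1e-3)
```

Each formation-error deviation from the group mean contracts by a factor (1 - eps) per step, which is the kind of convergence guarantee the project aims to preserve when learned policies replace handcrafted update rules.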

Organization

DeLiCio will last 4 years. It is structured into 4 interacting work-packages, plus a project management package WP0. WP1 and WP2 address single agent problems and propose different ways of integrating ML and CT. Building on their results, WP3 addresses multi-agent behavior and WP4 provides a challenging application of the scientific advances to the control of UAV fleets.

Selected Results

Hybrid system identification with deep learning + control theory (CDC 2021; HAL+under review)

Observer design and deep learning (CDC 2021)

Estimating UAV-generated turbulence from onboard vision (arXiv, under review)

Machine learning of complex physical phenomena (ICLR 2022)

Hybrid control (CDC 2020)

Impact

Scientific impact

The scientific results obtained by project DeLiCio are of a fundamental nature and concern a large number of potential problems in machine learning, process control and robotics. We anticipate a large impact of the addressed contributions (hybrid control, statistical model reduction, stability and robustness priors, etc.) on RL agents, making them (i) more sample efficient and thus more widely applicable, and (ii) less computationally complex and thus easier to implement in embedded systems.

Impact on society

Although our contributions will be evaluated on the control of UAV fleets, extensions may include industrial and service robotics, ADAS systems, distributed control of multi-agent systems such as fleets of autonomous vehicles (e.g. speed consensus of autonomous cars on motorways), communication reduction in networked control systems, and even AI for games, household devices and smart devices. The results of project DeLiCio could bring society a big step closer to smart ambient intelligence.

Open access and open data

We strongly believe that science advances best and quickest through open protocols. Reference code implementations will be made available with each publication. When applicable, we will publish learned models online, including trained parameters of deep networks.

Funding

This project is funded by the ANR (Agence Nationale de la Recherche) under call AAPG 2019: grant ANR-19-CE23-0006.