Schools CEA - EDF - INRIA


École d’été d’analyse numérique 2026

Distributed optimization. Applications to the energy systems of tomorrow.

Scientific context

Content

Speakers

• Francis Bach (Inria)
• Claire Monteleoni (Inria)
• Gilles Stoltz (CNRS)
• Michael Jordan (University of California, Berkeley)
• Olivier Wintenberger (LPSM/Sorbonne Université)
• Emmanuel Rachelson (ISAE-SUPAERO)
• Vianney Perchet (ENSAE/CREST)
• Paul Strang (EDF)
• Laurent Pfeiffer (Inria)
• Aymeric Dieuleveut (CMAP/École Polytechnique)

Preliminary program

  • Monday, June 22, 2026
    • 9:00-9:45, Opening
    • 10:00-11:00, Bandits (Stoltz, Brégère)
    • 11:00-11:15, Coffee break
    • 11:15-12:15, Bandits (Stoltz, Brégère)
    • 12:15-14:00, Lunch break
    • 14:00-16:00, Bandits (Stoltz, Brégère)
    • 16:00-16:15, Break
    • 16:15-17:00, Bandits (Stoltz, Brégère)
    • 17:00-18:00, Bandits research talk (Perchet)
  • Tuesday, June 23, 2026
    • 9:00-12:00 (with a 15-min break), Optimization (Bach)
    • 12:15-14:00, Lunch break
    • 14:00-16:00, Optimization lab session (Dubois-Taine)
    • 16:00-16:30, Break
    • 16:30-17:30, Online optimization (Wintenberger)
  • Wednesday, June 24, 2026
    • 9:00-11:45 (with a 15-min break), Climate (Monteleoni)
    • 12:00-13:00, AI for economics (Jordan)
    • 13:00-17:30, Social activity
  • Thursday, June 25, 2026
    • 9:00-12:15 (with a 15-min break), RL (Rachelson)
    • 12:15-14:00, Lunch break
    • 14:00-16:00, RL lab session (Rachelson)
    • 16:00-16:30, Break
    • 16:30-17:30, RL research talk (Strang)
  • Friday, June 26, 2026
    • 9:00-12:15 (with a 15-min break), Distributed optimization (Pfeiffer)
    • 12:15-14:00, Lunch break

Abstracts

Olivier Wintenberger – Stochastic Online Convex Optimization with Applications to Adaptive Forecasting
We propose a general framework for stochastic online convex optimization, embedding aggregation of experts and probabilistic forecasting.
Certain algorithms, including Bernstein Online Aggregation and Online Newton Steps, achieve the optimal rates in stochastic environments.
We apply our framework to calibrating parametric probabilistic forecasters of non-stationary conditionally sub-Gaussian time series.
We illustrate the benefit of our approach to real-world data such as electricity load and temperatures.
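As a toy illustration of the expert-aggregation setting (our own sketch, not the course material), the following Python snippet runs an exponentially weighted average forecaster over three crude predictors of a synthetic load-like series; the experts, the fixed learning rate, and the data are all assumptions, and algorithms such as BOA or Online Newton Steps would replace the naive update.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 500, 3                      # rounds, number of experts
eta = 1.0                          # fixed learning rate (BOA would adapt it)

# Synthetic load-like series: noisy sinusoid (stand-in for real data).
y = np.sin(np.linspace(0, 8 * np.pi, T)) + 0.1 * rng.standard_normal(T)

def experts(t):
    """Three crude predictors: last value, 10-step rolling mean, zero."""
    lag = y[t - 1] if t > 0 else 0.0
    roll = y[max(0, t - 10):t].mean() if t > 0 else 0.0
    return np.array([lag, roll, 0.0])

w = np.ones(K) / K                 # aggregation weights
loss_alg, loss_experts = 0.0, np.zeros(K)
for t in range(T):
    preds = experts(t)
    forecast = w @ preds           # convex combination of the experts
    loss_alg += (forecast - y[t]) ** 2
    losses = (preds - y[t]) ** 2
    loss_experts += losses
    w = w * np.exp(-eta * losses)  # exponential weighting of the experts
    w /= w.sum()

print(f"aggregated loss: {loss_alg:.1f}, best expert: {loss_experts.min():.1f}")
```

For exp-concave losses such as the bounded squared loss used here, updates of this kind enjoy regret against the best expert that is only logarithmic in the number of experts; the second-order refinements discussed in the course are what yield the optimal rates in the stochastic setting.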

Paul Strang – Model-based reinforcement learning for exact combinatorial optimization
Modern mixed-integer linear programming (MILP) solvers are built upon the branch-and-bound (B&B) paradigm. Since the 1980s, considerable research and engineering effort has gone into refining these solvers, resulting in highly optimized systems driven by expert-designed heuristics tuned over large benchmarks. However, in operational settings where structurally similar problems are solved repeatedly, adapting solver heuristics to the distribution of encountered MILPs can lead to substantial gains in efficiency, beyond what static, hand-crafted heuristics can offer. Recent research has thus turned to machine learning (ML) to design efficient, data-driven B&B heuristics tailored to specific instance distributions. Here, we propose adapting recent reinforcement learning (RL) approaches, known for having achieved breakthroughs in complex combinatorial games such as chess or Go, to the setting of exact combinatorial optimization. Drawing on the MuZero architecture (Schrittwieser et al.), we introduce Plan-and-Branch-and-Bound (PlanB&B), a model-based reinforcement learning (MBRL) agent that leverages an internal model of the B&B dynamics to learn improved variable selection strategies.
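For readers unfamiliar with the B&B paradigm the abstract builds upon, here is a minimal, self-contained branch-and-bound sketch in Python for a 0/1 knapsack, whose LP relaxation happens to be solvable greedily; the instance is arbitrary, and the fixed variable-selection rule marked in the comments is precisely the decision that an agent like PlanB&B would learn to make.

```python
# Toy 0/1 knapsack instance (values, weights, capacity are arbitrary).
values = [60, 100, 120, 80]
weights = [10, 20, 30, 25]
capacity = 50

def lp_bound(fixed, cap):
    """Upper bound from the greedy fractional (LP) relaxation."""
    bound = sum(values[i] for i, x in fixed.items() if x == 1)
    free = [i for i in range(len(values)) if i not in fixed]
    free.sort(key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if weights[i] <= cap:
            cap -= weights[i]
            bound += values[i]
        else:
            bound += values[i] * cap / weights[i]   # fractional last item
            break
    return bound

best = 0
stack = [({}, capacity)]      # nodes: (partial assignment, remaining capacity)
while stack:
    fixed, cap = stack.pop()
    if cap < 0:
        continue              # branch exceeded the capacity: infeasible
    if lp_bound(fixed, cap) <= best:
        continue              # pruned: the relaxation cannot beat the incumbent
    free = [i for i in range(len(values)) if i not in fixed]
    if not free:              # leaf: integer solution, update the incumbent
        best = max(best, sum(values[i] for i, x in fixed.items() if x == 1))
        continue
    i = free[0]               # variable selection: the decision PlanB&B learns
    stack.append(({**fixed, i: 0}, cap))
    stack.append(({**fixed, i: 1}, cap - weights[i]))

print("optimal value:", best)   # 220 for this instance
```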

Gilles Stoltz – A Tutorial on Stochastic Bandits, with Applications to the Management of Electricity Consumption
Abstract: Stochastic bandits form a simple case of online learning where, at each round, a learner selects an action and observes a stochastic reward depending on this action, but does not observe the rewards that the other available actions would have yielded. The rewards are generated according to some unknown stochastic model. The goal of the learner is to maximize the cumulative reward or, equivalently, to minimize a quantity called the regret, defined as the difference between the cumulative reward achieved by the best action and the cumulative reward actually achieved.
The first part of this series of lectures is devoted to vanilla K-armed bandits, which correspond to facing K actions, each associated with a fixed but unknown distribution. We review simple strategies like ETC (explore then commit) and UCB (upper confidence bound): we formally define them, state regret bounds, and provide proofs thereof. The UCB strategy (Auer, Cesa-Bianchi, Fischer, 2002) is an optimistic strategy that builds upper confidence bounds on the expectations of the distributions associated with the arms and picks at each round the action associated with the largest such upper confidence bound. Thompson sampling (Thompson, 1933; Chapelle and Li, 2011; Scott, 2010), a Bayesian-sampling-based strategy, will also be studied if time permits.
The second part considers an extension where actions lie in a continuous set, typically a subset of \( \mathbb{R}^d \), and where rewards are, in expectation, linear functions of the actions picked. We review the classic LinUCB strategy (Abbasi-Yadkori, Pal, Szepesvari, 2011), which sequentially builds confidence regions over the unknown linear parameter of the reward function, from which it derives confidence regions over reward functions. The actions are again picked in an optimistic fashion, by focusing on the largest values compatible with the confidence regions. We will finally explain how finite-armed contextual bandits, where rewards depend on a (possibly continuous) context and on the arm picked in a finite set, may be handled in a similar way.
The third and final part presents an adaptation of the linear setting designed to manage electricity consumption by sending tariff incentives to customers (Bregere, Gaillard, Goude, Stoltz, 2019). This setting corresponds to contextual bandits with finitely many actions given by tariff levels, but where, instead of picking a single tariff, the learner picks the shares of the population receiving each tariff level. Imposing higher tariffs reduces consumption and offering lower tariffs increases it.
Each part will be a mix of theory and practice, through notebooks in Python. In particular, we will first simulate the vanilla K-armed bandit problem to compare ETC, UCB, and possibly other strategies. Then, we will consider electricity demand data (gathered by UK Power Networks) in which price incentives were offered, and design a stochastic bandit algorithm for demand-side management.
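In the spirit of the notebooks mentioned above (a minimal sketch of our own, not the course material), one can simulate a vanilla K-armed Bernoulli bandit and compare the realized regret of ETC and UCB; the arm means, horizon, and exploration length are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 5000, 4
means = np.array([0.3, 0.5, 0.6, 0.45])     # unknown Bernoulli arm means

def ucb(T):
    counts, sums, reward = np.zeros(K), np.zeros(K), 0.0
    for t in range(T):
        if t < K:
            a = t                            # pull each arm once
        else:                                # optimism: largest upper bound
            idx = sums / counts + np.sqrt(2 * np.log(t + 1) / counts)
            a = int(np.argmax(idx))
        r = rng.random() < means[a]
        counts[a] += 1; sums[a] += r; reward += r
    return reward

def etc(T, m=100):
    counts, sums, reward = np.zeros(K), np.zeros(K), 0.0
    for t in range(T):                       # explore m*K rounds, then commit
        a = t % K if t < m * K else int(np.argmax(sums / counts))
        r = rng.random() < means[a]
        counts[a] += 1; sums[a] += r; reward += r
    return reward

for name, algo in [("UCB", ucb), ("ETC", etc)]:
    print(f"{name}: empirical regret {T * means.max() - algo(T):.0f}")
```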

Vianney Perchet – Learning to Compete: Auction Design and Bidding Strategies in Energy Markets
Abstract: The transition to renewable energy has made electricity markets more dynamic and competitive than ever. Central to these markets are repeated auctions, where producers must constantly adjust their bids to account for fluctuating demand and the strategic moves of their rivals. This talk explores how we can use "online learning" to understand and optimize behavior in these high-stakes environments.
I will discuss two key pillars of our recent research. First, I will present our work from NeurIPS 2024 on uniform-price auctions, where we developed a new mathematical framework that allows bidders to learn optimal strategies much faster by better modeling the available "bid space." Second, I will touch upon our comparative work (with Marius Potfer) examining the two giants of auction design: uniform-price vs. discriminatory (pay-as-bid) auctions. We compare these formats through the lens of "regret minimization", measuring how much a bidder loses by not knowing the future. By bridging theoretical machine learning with the practical realities of energy markets, we provide new insights into which auction structures lead to more efficient outcomes and how participants can best navigate them.
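To make "regret minimization" concrete (the talk's multi-unit, uniform-price setting is considerably richer), here is a sketch of the Hedge algorithm run over a discretized bid grid in a toy repeated single-unit auction where the clearing price is revealed after each round; the bidder's value, the price distribution, and the payoff model are all our assumptions, not the framework of the papers above.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000
v = 0.8                                   # bidder's (fixed) value for one unit
bids = np.linspace(0, 1, 21)              # discretized bid grid
eta = np.sqrt(8 * np.log(len(bids)) / T)  # standard Hedge learning rate

def utility(b, price):
    """Win one unit at the clearing price if our bid exceeds it."""
    return (v - price) if b > price else 0.0

w = np.ones(len(bids)) / len(bids)
cum_util, cum_grid = 0.0, np.zeros(len(bids))
for t in range(T):
    price = rng.beta(2, 2)                # toy competing clearing price
    b = rng.choice(bids, p=w)             # sample a bid from the weights
    cum_util += utility(b, price)
    grid_utils = np.array([utility(x, price) for x in bids])
    cum_grid += grid_utils                # full information: all bids evaluable
    w *= np.exp(eta * grid_utils)         # Hedge update on the gains
    w /= w.sum()

print(f"regret vs. best fixed bid: {cum_grid.max() - cum_util:.1f}")
```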

Francis Bach – Learning theory from first principles
Summary: Data have become ubiquitous in science, engineering, industry, and personal life, leading to the need for automated processing. Machine learning is concerned with making predictions from training examples and is used in all of these areas, in small and large problems, with a variety of learning models, ranging from simple linear models to deep neural networks. It has now become an important part of the algorithmic toolbox.
How can we make sense of these practical successes? Can we extract a few principles to understand current learning methods and guide the design of new techniques for new applications or to adapt to new computational environments? This is precisely the goal of learning theory and this series of lectures, with a particular eye toward adaptivity to specific structures that make learning faster (such as smoothness of the prediction functions or dependence on low-dimensional subspaces).

Claire Monteleoni – AI for Climate
Abstract: The stunning recent advances in AI content generation rely on cutting-edge, generative deep learning algorithms and architectures trained on massive amounts of text, image, and video data. With different training data, these algorithms and architectures can benefit a variety of applications for addressing climate change. As opposed to text and video, the relevant training data includes weather and climate data from observations, reanalyses, and even physical simulations.
Using AI methods, especially generative AI, in climate and weather provides additional sources of uncertainty beyond those already identified. However, the potential for such methods is great. Generative AI methods can address fundamental tasks including data fusion, interpolation, downscaling, and domain alignment. This course will provide a survey of recent work applying such methods to problems such as weather forecasting (including extreme events), climate model emulation and scenario generation, and renewable energy planning.

Michael Jordan – Contracts, Uncertainty, and Incentives in Machine Learning Ecosystems
Abstract: Contract theory is the study of economic incentives when parties transact in the presence of private information, and where prior distributions may not be common knowledge. We augment classical contract theory to incorporate a role for learning from data, where the overall goal of the adaptive mechanism is to obtain desired statistical behavior. We consider applications of this framework to problems in principal-agent regulatory mechanisms. We also consider systems in which data is a valued good, and principals and agents are arranged in markets consisting of multiple layers.

Emmanuel Rachelson – A Brief Introduction to Reinforcement Learning
Abstract: Making a sequence of optimal decisions to control a system (whether winning a video game, balancing a bicycle, or managing the electricity output of a network of power plants) falls under the domain of optimal control. Reinforcement learning (RL) asks the question: "Can we learn an optimal control strategy directly through interaction with the system, without prior knowledge of its dynamics or properties?" This question is rooted in human experience: we do not inspect a video game's source code or write down the physics equations of a bicycle, yet we can achieve precise control, based on past experience and trial-and-error interaction with the system.
RL seeks to solve the optimal control problem using interaction data. This brings together challenges in optimization (of the control function), exploration (to gather system dynamics data), and estimation and approximation (of the objective and the control function), connecting RL to related areas such as optimal control, online optimization, stochastic approximation, and supervised learning.
This course introduces the formal foundations of reinforcement learning and its key problems, using modern terminology that provides a direct pathway to deep reinforcement learning methods, which currently dominate the field.
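As a minimal illustration of the interaction loop the course formalizes (our own sketch; the toy chain environment and the hyperparameters are assumptions), the following Python snippet runs tabular Q-learning, learning a control strategy purely from trial-and-error interaction:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 6, 2               # toy chain: move left (0) or right (1)
gamma, alpha, eps = 0.95, 0.1, 0.1       # discount, step size, exploration
Q = np.zeros((n_states, n_actions))

def step(s, a):
    """Reward 1 only on reaching the rightmost state, which resets to 0."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    if s2 == n_states - 1:
        return 0, 1.0                     # goal reached, episode resets
    return s2, 0.0

s = 0
for t in range(20000):
    # epsilon-greedy exploration over the current value estimates
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s2, r = step(s, a)
    # Q-learning update: bootstrap on the greedy value of the next state
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2

print("greedy policy (1 = right):", np.argmax(Q, axis=1))
```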

Laurent Pfeiffer – Decomposition methods
The course will focus on large-scale optimization problems of the form:
\[ \inf_{x_1, \dots, x_N} \ \sum_{i=1}^N f_i(x_i) + g\Big(\sum_{i=1}^N A_i x_i\Big) \] where the functions \( f_i, g \) are assumed to be convex and valued in \( \mathbb{R} \cup \{+\infty\} \). As an example of application, one may think of an energy management problem involving \(N\) production units and associated decisions \(x_i\), coupled through the term \( g(\sum_{i=1}^N A_i x_i) \), modelling for example a constraint on the total production.
These problems enjoy a separability structure, which is visible in particular at the level of the dual problem, whose associated cost is a sum of \(N + 1\) functions. This structure allows for the design of decomposition methods: iterative numerical methods that scale well for large values of \(N\), insofar as each of their iterations requires \(N\) parallelizable operations.
The course will review and analyze several decomposition methods, relying on various regularity assumptions and oracles on the data functions. These include in particular the Frank-Wolfe algorithm, the dual subgradient algorithm, and the primal-dual hybrid gradient algorithm. The course will conclude with a presentation of stochastic variants of these algorithms, which sample the units at each iteration.
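As a minimal sketch of the price-decomposition idea, assume quadratic unit costs \( f_i(x_i) = c_i x_i^2 / 2 \) and take \( g \) to be the indicator of a total-production target (this toy instance and the constant step size are our choices; the course covers far more general settings). Each dual iteration solves the \( N \) unit subproblems independently, in closed form here, and then updates the price by a dual subgradient step on the coupling constraint.

```python
import numpy as np

# Economic-dispatch toy instance: N units with quadratic costs
# f_i(x_i) = c_i * x_i**2 / 2, coupled by a total-production target
# sum_i x_i = demand (a hard version of the coupling term g above).
c = np.array([1.0, 2.0, 4.0])    # marginal-cost slopes of the units
demand = 10.0
lam, step = 0.0, 0.2             # dual variable (price) and step size

for it in range(200):
    # Decomposed step: each unit minimizes f_i(x_i) - lam * x_i
    # independently (these N solves are the parallelizable operations).
    x = lam / c
    # Dual subgradient ascent on the coupling constraint.
    lam += step * (demand - x.sum())

print("price:", round(lam, 3), "dispatch:", np.round(x, 3),
      "total:", round(x.sum(), 3))
```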

Practical information

Dates

June 22 – 26, 2026

Location

CEA Liten
Le Bourget-du-Lac

Contacts

Organizers
Yannig Goude
Côme Bissuel
Pierre Gaillard
Mathieu Vallée
Grégoire Pichenot

Registration

To participate, please fill in the registration form and send it before May 15, 2026 to Régis Vizet and Tifenn Baril-Graffin.

Contacts

Schools secretariat
Régis Vizet – CEA
Tifenn Baril-Graffin – INRIA
Tel: 01 69 26 47 45
Fax: 01 69 26 70 05

Prerequisites
