Overview
This workshop brings together researchers and PhD students to exchange ideas and stimulate collaborations on optimization methods for machine learning and embedded control. The 3-day event features invited talks from leading experts and poster presentations highlighting innovative work by PhD students.
Speakers
Stephen Boyd
Stanford University, USA
Learning Parametrized Convex Functions
TBD

Volkan Cevher
EPFL, Switzerland
Training Deep Learning Models with Norm-Constrained LMOs
In this work, we study optimization methods that leverage the linear minimization oracle (LMO) over a norm-ball. We propose a new stochastic family of algorithms that uses the LMO to adapt to the geometry of the problem and, perhaps surprisingly, show that they can be applied to unconstrained problems. The resulting update rule unifies several existing optimization methods under a single framework. Furthermore, we propose an explicit choice of norm for deep architectures, which, as a side benefit, leads to the transferability of hyperparameters across model sizes. Experimentally, we demonstrate significant speedups on nanoGPT training without any reliance on Adam. The proposed method is memory-efficient, requiring only one set of model weights and one set of gradients, which can be stored in half-precision.
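As a rough illustration of the update (our sketch under generic assumptions, not the exact algorithm from the talk), the LMO over a norm-ball of radius $\rho$ and the resulting step read

\[
\mathrm{lmo}(g) \in \arg\min_{\|s\| \le \rho} \langle g, s \rangle,
\qquad
x_{k+1} = x_k + \eta_k\, \mathrm{lmo}\big(\nabla f(x_k)\big),
\]

so each step moves along the steepest-descent direction as measured by the chosen norm; for instance, the $\ell_\infty$-ball yields $\mathrm{lmo}(g) = -\rho\,\mathrm{sign}(g)$, recovering sign-based updates.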

Moritz Diehl
University of Freiburg, Germany
Numerical Optimal Control for Nonsmooth Dynamic Systems
This talk presents recent progress in the field of optimal control of nonsmooth dynamic systems, motivated by applications in robotics. The considered systems are characterized by a special class of ordinary differential equations (differential inclusions) that exhibit not only state-dependent switches but even state-dependent state jumps due to contacts and friction. Crucially, the nonsmoothness can be encoded in the form of complementarity systems that result from the solution of convex optimization problems depending on the system state. We discuss several techniques, with either fixed or variable time steps, to transcribe the continuous-time problem into a finite-dimensional mathematical program with complementarity constraints (MPCC).
Finally, we present tailored numerical methods that efficiently solve the resulting nonlinear MPCC problems, whose problem functions involve lower-level convex solvers, and we report application results from simulations and from real-world experiments with an assembly robot at Siemens research in Munich.
This talk presents joint work with Armin Nurkanovic, Anton Pozharskiy, Christian Dietz and Sebastian Albrecht.
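For readers new to the terminology, here is a minimal sketch of the objects involved (a generic formulation, not necessarily the one used in the talk). A dynamic complementarity system couples an ODE with complementarity conditions,

\[
\dot{x}(t) = f\big(x(t), \lambda(t)\big), \qquad 0 \le \lambda(t) \perp c\big(x(t)\big) \ge 0,
\]

and transcription yields a finite-dimensional MPCC of the form

\[
\min_{w}\; \varphi(w) \quad \text{s.t.} \quad 0 \le G(w) \perp H(w) \ge 0,
\]

where $\perp$ requires $G_i(w)\,H_i(w) = 0$ componentwise; such constraints violate standard constraint qualifications, which is why tailored solvers are needed.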

Mikael Johansson
KTH Royal Institute of Technology, Sweden
From Core to Clusters: Tailoring Optimization Algorithms to Modern Compute Architectures
Optimization algorithms power applications from embedded control systems to large-scale machine learning. However, the rapid diversification of hardware requires algorithm designs that move beyond abstract complexity measures toward explicit hardware awareness. This talk presents our efforts toward hardware-aware optimization methods tailored to the architectural constraints and opportunities of embedded systems, GPU accelerators, and decentralized compute clusters.
For compute accelerators, we develop a GPU-optimized Douglas-Rachford splitting algorithm for optimal transport problems. By reorganizing the computational workflow to align with GPU warp execution patterns and improve memory coalescing, our method attains order-of-magnitude speedups over traditional implementations while maintaining rigorous convergence guarantees.
For distributed environments, we address the fundamental challenge of making decentralized training outperform AllReduce-based methods in practical high-performance computing settings. Our approach introduces a novel communication-computation interleaving strategy combined with a specialized Adam optimizer that mitigates the effects of small local batch sizes. By carefully analyzing the interplay between topology design, communication patterns, and optimizer behavior, we demonstrate both reduced wall-clock training times and improved generalization performance on DNN training.
Across these computing paradigms, we show how optimization algorithms can preserve theoretical rigor while exploiting hardware-specific characteristics to achieve significant performance gains.
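For context on the splitting scheme mentioned above (the textbook iteration, not the GPU-specific variant from the talk), Douglas-Rachford splitting for $\min_x f(x) + g(x)$ iterates

\[
x_k = \mathrm{prox}_{\gamma f}(z_k), \qquad
y_k = \mathrm{prox}_{\gamma g}(2x_k - z_k), \qquad
z_{k+1} = z_k + y_k - x_k.
\]

In an optimal-transport formulation, $f$ and $g$ can encode the linear cost together with the row and column marginal constraints of the transport plan, so the proximal steps reduce to projections and elementwise operations that map well onto GPU warps.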

Ion Necoara
University Politehnica Bucharest, Romania
Linearized augmented Lagrangian methods for optimization problems with nonlinear equality constraints
We consider (nonsmooth) nonconvex optimization problems with nonlinear equality constraints. We assume that (some part of) the objective function and the functional constraints are locally smooth. For solving this problem, we propose linearized augmented Lagrangian methods, i.e., we linearize the objective function and the functional constraints in a Gauss-Newton fashion at the current iterate within the augmented Lagrangian function and add a quadratic regularization, yielding a (convex) subproblem that is easy to solve, and whose solution is the next primal iterate. The update of the dual multipliers is also based on the linearization of functional constraints. Under some specific dynamic regularization parameter choices, we prove boundedness and global asymptotic convergence of the iterates to a first-order solution of the problem. We also derive convergence guarantees for the iterates to an ϵ-first-order solution.
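Schematically (our hedged reading of the abstract, for the smooth equality-constrained model $\min_x f(x)$ s.t. $h(x) = 0$), one such linearized augmented Lagrangian step solves

\[
x_{k+1} \in \arg\min_{x}\; \nabla f(x_k)^\top (x - x_k)
+ \lambda_k^\top \big( h(x_k) + \nabla h(x_k)(x - x_k) \big)
+ \frac{\beta}{2} \big\| h(x_k) + \nabla h(x_k)(x - x_k) \big\|^2
+ \frac{\rho_k}{2} \| x - x_k \|^2,
\]

followed by a multiplier update based on the same linearization, e.g. $\lambda_{k+1} = \lambda_k + \beta \big( h(x_k) + \nabla h(x_k)(x_{k+1} - x_k) \big)$. The subproblem is a strongly convex quadratic program, hence easy to solve, and $\rho_k$ plays the role of the dynamic regularization parameter mentioned above.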

Yurii Nesterov
Center for Operations Research at Corvinus University of Budapest, Hungary, and
School of Data Science at the Chinese University of Hong Kong, Hong Kong SAR
Asymmetric Long-Step Primal-Dual Interior-Point Methods with Dual Centering
We discuss a new approach to developing asymmetric Interior-Point Methods for solving primal-dual problems of Conic Optimization. It is very efficient for problems where the dual formulation is simpler than the primal one. Problems of this type often arise in Semidefinite Optimization (SDO), for which we propose a new primal-dual method with very attractive computational cost. In this approach, we do not need sophisticated Linear Algebra, restricting ourselves to standard Cholesky factorization. Nevertheless, our complexity bounds match the best-known polynomial-time results. Moreover, for symmetric cones the bounds automatically depend on the smaller of the barrier parameters of the primal and the dual feasible sets. We show by SDO examples that the corresponding gain can be very significant. We discuss some classes of SDO problems where the complexity bounds are proportional to the square root of the number of linear equality constraints and the computational cost of one iteration is the same as in Linear Optimization. Our theoretical developments are supported by encouraging numerical tests.
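To make the headline concrete (a standard reading of the claimed regime, with constants omitted), for the SDO classes mentioned above the method needs on the order of

\[
O\big( \sqrt{m}\, \ln(1/\epsilon) \big)
\]

iterations to reach an $\epsilon$-accurate solution, where $m$ is the number of linear equality constraints, with each iteration built on Cholesky factorizations whose cost is comparable to an interior-point step in Linear Optimization.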

Giuseppe Notarstefano
University of Bologna, Italy
System Theory Tools for Optimization in Learning and Control
Optimization is a fundamental tool for solving many control and learning tasks, but optimization algorithms are often treated as black-box building blocks that come with their own convergence guarantees. In this talk I will present a different perspective in which optimization algorithms are viewed as discrete-time dynamical systems that can be combined or integrated as subsystems into a closed-loop scheme. This opens up the possibility of using system-theoretic tools to analyze their (convergence) properties and of gaining key insights for the design of novel, possibly advanced schemes. This is particularly interesting for complex scenarios involving large-scale and distributed systems, or for online schemes that simultaneously perform optimization, learning, and control. In particular, timescale separation will be proposed as a tool to frame accelerated and model-free optimization algorithms, as well as to design schemes for distributed computing. Finally, scenarios combining optimization with learning and control schemes will be shown.
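As one concrete instance of the timescale-separation idea (a generic two-timescale sketch, not a specific result from the talk), consider a slow decision update interconnected with a fast auxiliary dynamics:

\[
x_{k+1} = x_k + \varepsilon\, \alpha(x_k, z_k), \qquad
z_{k+1} = \varphi(x_k, z_k), \qquad 0 < \varepsilon \ll 1.
\]

If the fast subsystem $z$ (e.g., a gradient or consensus tracker in distributed settings) is exponentially stable for frozen $x$, singular-perturbation arguments allow one to certify convergence of the interconnection for sufficiently small $\varepsilon$.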

Panos Patrinos
KU Leuven, Belgium
Nonlinearly Preconditioned Gradient Methods under Generalized Smoothness
We analyze nonlinearly preconditioned gradient methods for solving smooth minimization problems. We introduce a generalized smoothness property, based on the notion of abstract convexity, that is broader than Lipschitz smoothness, and we provide sufficient first- and second-order conditions for it. Notably, our framework encapsulates algorithms associated with the clipping gradient method and brings out novel insights for the class of (L0,L1)-smooth functions that has recently received widespread interest, thus allowing us to go beyond already established methods. We investigate the convergence of the proposed method in both the convex and nonconvex settings.
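Schematically (our sketch of the general template, not the authors' precise scheme), a nonlinearly preconditioned gradient step applies a fixed nonlinear map $N$ to the gradient,

\[
x_{k+1} = x_k - \gamma_k\, N\big( \nabla f(x_k) \big),
\qquad \text{e.g.}\quad N(v) = \frac{v}{1 + \|v\|},
\]

where this particular $N$ yields a normalized, clipping-like step; the generalized smoothness condition is what guarantees sufficient decrease of $f$ along such steps for suitable step sizes $\gamma_k$.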

Program
Day 1
9:00 – 9:30 | Registration |
9:30 – 10:30 | Invited Talk 1 – Volkan Cevher |
10:30 – 11:30 | Coffee Break |
11:30 – 12:30 | Invited Talk 2 – Mikael Johansson |
12:30 – 15:00 | Lunch Break |
15:00 – 16:00 | Invited Talk 3 – Ion Necoara |
16:00 – 18:00 | Coffee Break and Poster Presentations |
Day 2
9:30 – 10:30 | Invited Talk 4 – Stephen Boyd |
10:30 – 11:30 | Coffee Break |
11:30 – 12:30 | Invited Talk 5 – Yurii Nesterov |
12:30 – 15:00 | Lunch Break |
15:00 – 16:00 | Invited Talk 6 – Panos Patrinos |
16:00 – 18:00 | Coffee Break and Poster Presentations |
Day 3
9:30 – 10:30 | Invited Talk 7 – Giuseppe Notarstefano |
10:30 – 11:30 | Coffee Break |
11:30 – 12:30 | Invited Talk 8 – Moritz Diehl |
12:30 – 15:00 | Lunch Break |
Poster Presentations
Samuel Erickson Andersson | Personalized Federated Learning under Model Dissimilarity Constraints |
Adeyemi D. Adeoye | SCORE: Approximating Curvature Information under Self-Concordant Regularization |
Riccardo Brumali | Data-Driven Distributed Optimization via Aggregative Tracking and Deep-Learning |
Amir Daghestani | Byzantine-Robust Federated Learning with Learnable Aggregation Weights |
Brecht Evens | Progressive Decoupling of Linkage Problems Beyond Elicitable Monotonicity |
Matteo Facchino | Tracking MPC Tuning in Continuous Time: A First-Order Approximation of Economic MPC |
Nick Korbit | Scalable Gauss–Newton Methods for Training Deep Neural Networks |
Pablo Krupa | Restart of Accelerated First-Order Methods with Linear Convergence Under a Quadratic Functional Growth Condition |
Sampath Kumar Mulagaleti | System Identification with Controller-Synthesis Guarantees |
Pieter Pas | Parallelization and Vectorization of a Structure-Exploiting Solver for Optimal Control |
Alice Rosetti | Multi-Robot Safe Cooperation via Combined Predictive Filters and Constraint Compression |
Alberto Zaupa | Accelerating ADMM for Embedded MPC on Modern Hardware |
Kui Xie | Online Design of Experiments by Active Learning for System Identification of Autoregressive Models |
Zesen Wang | From Promise to Practice: Realizing High-performance Decentralized Training |
Venue
Aula Guinigi, IMT School for Advanced Studies Lucca
Piazza San Francesco, 19, 55100 Lucca LU, Italy
Travel
From Pisa Airport (PSA) to Lucca
The easiest way to reach Lucca from Pisa Airport is by train.
- Take the PisaMover: From the airport terminal, take the PisaMover shuttle train to Pisa Centrale Station. The journey takes about 5 minutes, and departures are frequent.
- Train to Lucca: From Pisa Centrale, take a regional train to Lucca. The journey takes approximately 25-30 minutes, and trains run regularly.
You can purchase train tickets at Pisa Centrale or online via the Trenitalia website.
From Florence Airport (FLR) to Lucca
To travel from Florence Airport to Lucca:
- Tram to Florence Station: Take the T2 tram line from Florence Airport to Firenze Santa Maria Novella (SMN) train station. The tram ride is about 20 minutes.
- Train to Lucca: From Firenze SMN, take a regional train to Lucca. The journey typically takes 1 hour 15 minutes to 1 hour 45 minutes, depending on the specific train.
Tickets can be bought at the airport, at Florence SMN, or online.
Getting to IMT Lucca from Lucca Train Station
IMT Lucca is conveniently located within a short, walkable distance from Lucca Train Station. The walk is approximately 15-20 minutes and takes you through the charming streets of Lucca.
Upon exiting the train station, follow the signs towards the city center (centro storico). Continue straight along Via del Pallone, then turn left onto Via Santa Croce. Piazza San Francesco will be on your right.
Organizing Committee
- Alberto Bemporad (IMT School for Advanced Studies Lucca, Italy)
- Mario Zanon (IMT School for Advanced Studies Lucca, Italy)
- Moritz Diehl (University of Freiburg, Germany)
- Panagiotis Patrinos (KU Leuven, Belgium)
- Puya Latafat (IMT School for Advanced Studies Lucca, Italy)
Contact
For inquiries, please email: eventi@imtlucca.it