## PolSys Seminar, LIP6, Paris, Sep 27th 2024

Conservation Laws for Gradient Flows

https://www-polsys.lip6.fr/Seminar/seminar.html

## Conservation laws for neural network training: two papers at @NeurIPS23 & @ICML24

We study conservation laws for the gradient flow and momentum flow of neural networks, in both Euclidean and non-Euclidean settings.

Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows, accepted at ICML24

1/ We define the **concept of conservation laws for momentum flows** and show how to extend the framework from our previous paper (Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows, oral @NeurIPS23) to non-Euclidean gradient flow (GF) and momentum flow (MF) settings. **In stark contrast to the case of GF, conservation laws for MF exhibit temporal dependence**.

2/ We discover **new conservation laws** for linear networks in the Euclidean momentum case, and these new laws are complete. In contrast, **there is no conservation law for ReLU networks in the Euclidean momentum case**.

3/ **In a non-Euclidean context**, such as non-negative matrix factorization (NMF) or input-convex neural networks (ICNN) implemented with two-layer ReLU networks, **we discover new conservation laws for gradient flows** and find none in the momentum case. We obtain **new conservation laws in the Natural Gradient Flow case**.

4/ We shed light on a quasi-systematic loss of conservation when transitioning from the GF to the MF setting.
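To make point 4/ concrete, here is a small worked derivation on a toy model that is not taken from the papers: a scalar two-layer linear network $f = uv$ trained on a loss $\ell(uv)$. The heavy-ball form of the momentum flow with constant friction $\tau$ is an assumption made for illustration only. The sketch shows that the quantity $u^2 - v^2$, conserved under GF, satisfies a forced equation under this MF.

```latex
% Assumed toy model (not from the papers): f = u v with scalar u, v and loss \ell(uv).
\begin{align*}
\text{GF:}\quad
  & \dot u = -v\,\ell'(uv), \qquad \dot v = -u\,\ell'(uv), \\
  & \frac{d}{dt}\bigl(u^2 - v^2\bigr)
    = 2u\dot u - 2v\dot v
    = -2uv\,\ell'(uv) + 2uv\,\ell'(uv) = 0. \\[4pt]
% Assumed heavy-ball form with constant friction \tau > 0.
\text{MF:}\quad
  & \ddot u + \tau\dot u = -v\,\ell'(uv), \qquad
    \ddot v + \tau\dot v = -u\,\ell'(uv), \\
  & h := u^2 - v^2, \qquad \dot h = 2\,(u\dot u - v\dot v), \\
  & \ddot h + \tau\dot h = 2\,\bigl(\dot u^2 - \dot v^2\bigr).
\end{align*}
```

The right-hand side of the last line is nonzero in general, so $h$ need not stay constant along this assumed momentum flow. This only illustrates why a GF law can fail to carry over; it does not describe the time-dependent laws obtained in the paper.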

## Invited talk @IEM @EPFL, May 24th 2024

Frugality in machine learning: Sparsity, a value for the future?

Sparse vectors and sparse matrices play a transverse role in signal and image processing: they have led to successful approaches efficiently addressing tasks as diverse as data compression, fast transforms, signal denoising and source separation, or more generally inverse problems. To what extent can the potential of sparsity also be leveraged to achieve more frugal (deep) learning techniques? Through an overview of recent explorations around this theme, I will compare and contrast classical sparse regularization for inverse problems with its natural extensions that aim at learning neural networks with sparse connections. During our journey, I will notably highlight the role of the rescaling invariances of modern deep parameterizations, which come with their curses and blessings.

## Invited talk, MAP5, Paris, May 17th 2024

Frugality in machine learning: Sparsity, a value for the future?

Sparse vectors and sparse matrices play a transverse role in signal and image processing: they have led to successful approaches efficiently addressing tasks as diverse as data compression, fast transforms, signal denoising and source separation, or more generally inverse problems. To what extent can the potential of sparsity also be leveraged to achieve more frugal (deep) learning techniques? Through an overview of recent explorations around this theme, I will compare and contrast classical sparse regularization for inverse problems with its natural extensions that aim at learning neural networks with sparse connections. During our journey, I will notably highlight the role of the rescaling invariances of modern deep parameterizations, which come with their curses and blessings.

## Invited talk, Math Machine Learning seminar MPI MIS + UCLA, online, May 16, 2024

‘Conservation Laws for Gradient Flows’

Understanding the geometric properties of gradient descent dynamics is a key ingredient in deciphering the recent success of very large machine learning models. A striking observation is that trained over-parameterized models retain some properties of the optimization initialization. This “implicit bias” is believed to be responsible for some favorable properties of the trained models and could explain their good generalization properties. In this work, we present the definitions and properties of “conservation laws”, which define quantities conserved during gradient flows of a given machine learning model, such as a ReLU network, with any training data and any loss. After explaining how to find the maximal number of independent conservation laws via Lie algebra computations, we provide algorithms to compute a family of polynomial laws, as well as to compute the number of (not necessarily polynomial) conservation laws. We find that, for a number of architectures, there are no laws beyond the known ones, and we identify new laws for certain flows with momentum and/or non-Euclidean geometries.

Joint work with Sibylle Marcotte and Gabriel Peyré.
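As an illustration of what "conserved during gradient flow" means, the following minimal Python sketch checks one of the classically known laws for a bias-free two-layer ReLU network $f(x) = \sum_j v_j\,\mathrm{relu}(w_j \cdot x)$: the per-neuron quantity $\|w_j\|^2 - v_j^2$. It is not the algorithm from the talk. Gradient flow is approximated by gradient descent with a small step size, so the quantity is only approximately constant, and the data, the squared loss, and all function names (`loss_and_grads`, `neuron_balance`, `run`) are hypothetical choices made for this example.

```python
import random


def relu(z):
    return z if z > 0.0 else 0.0


def loss_and_grads(W, v, X, y):
    """Mean squared loss of f(x) = sum_j v[j] * relu(W[j] . x), with gradients."""
    n, h, d = len(X), len(W), len(X[0])
    gW = [[0.0] * d for _ in range(h)]
    gv = [0.0] * h
    loss = 0.0
    for x, target in zip(X, y):
        z = [sum(wj[k] * x[k] for k in range(d)) for wj in W]
        a = [relu(zj) for zj in z]
        r = sum(v[j] * a[j] for j in range(h)) - target  # residual
        loss += 0.5 * r * r / n
        for j in range(h):
            gv[j] += r * a[j] / n
            if z[j] > 0.0:  # ReLU passes gradient only where active
                for k in range(d):
                    gW[j][k] += r * v[j] * x[k] / n
    return loss, gW, gv


def neuron_balance(W, v):
    """Per-neuron quantity ||W[j]||^2 - v[j]^2 (conserved under exact gradient flow)."""
    return [sum(w * w for w in W[j]) - v[j] ** 2 for j in range(len(W))]


def run(steps=3000, lr=1e-3, seed=0):
    """Small-step gradient descent on synthetic data (a stand-in for gradient flow)."""
    rng = random.Random(seed)
    n, d, h = 16, 3, 4  # samples, input dimension, hidden neurons (arbitrary)
    X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(h)]
    v = [rng.gauss(0, 1) for _ in range(h)]

    before = neuron_balance(W, v)
    loss0, _, _ = loss_and_grads(W, v, X, y)
    for _ in range(steps):
        _, gW, gv = loss_and_grads(W, v, X, y)
        for j in range(h):
            v[j] -= lr * gv[j]
            for k in range(d):
                W[j][k] -= lr * gW[j][k]
    loss1, _, _ = loss_and_grads(W, v, X, y)
    after = neuron_balance(W, v)

    # Largest change of the per-neuron quantity over the whole run.
    drift = max(abs(a - b) for a, b in zip(after, before))
    return loss0, loss1, drift


if __name__ == "__main__":
    loss0, loss1, drift = run()
    print(f"loss: {loss0:.4f} -> {loss1:.4f}, max drift of balance: {drift:.2e}")
```

The loss should decrease while the per-neuron balance barely moves; the residual drift comes from the discretization and shrinks as the step size `lr` is reduced.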

## Journées SMAI-MODE 2024, Lyon, March 25-29, 2024

Details and registration: https://indico.math.cnrs.fr/event/9418/

## Journée SIGMA – MODE 2024, January 30, 2024, Inria Paris

More information and (free but mandatory) registration at:

http://angkor.univ-mlv.fr/~vialard/conferences/sigmamode/