Jose Gallego-Posada

About

I completed my PhD at Mila and the University of Montréal supervised by Simon Lacoste-Julien. My doctoral research was partially supported by an IVADO PhD Excellence Scholarship.

Before joining Mila, I completed my MSc in Artificial Intelligence at the University of Amsterdam in 2018, supervised by Patrick Forré. I hold a BSc in Mathematical Engineering from Universidad EAFIT in Medellin.

I study how to solve constrained optimization problems involving neural networks. I have also worked in adaptive optimization, equivariant deep learning, geometric information theory, federated learning, and applications of algebraic geometry to machine learning.

I am one of the lead developers of Cooper, an open-source library for non-convex constrained optimization in PyTorch.

My CV is available here.

My name is pronounced Xose Gaʝego Posada. My Dijkstra and Erdős numbers are 4.

News

2025

May 2: I will be attending AISTATS 2025 in Phuket, Thailand.
Apr 1: We have released version 1.0.0 of Cooper, our library for constrained optimization in deep learning! 🎉
Mar 31: I am giving an invited lecture on Donsker's Theorem at Universidad EAFIT. You can find my hand-written notes here.
Jan 25: Our work Feasible Learning, introducing a new sample-centric learning paradigm, has been accepted at AISTATS 2025.

2024

Nov 1: I gave a keynote address on perspectives on AI development within the Colombian economy during the Ser Empresarial symposium at Universidad Pontificia Bolivariana.
Aug 17: I have successfully defended my PhD thesis Constrained Optimization for Machine Learning: Algorithms and Applications at Mila and the University of Montréal. 🎉
July 21: I gave a keynote talk on Constrained Optimization for Machine Learning at the LatinX in AI Workshop at ICML 2024.
July 20: I will be attending ICML 2024 in Vienna, Austria.
May 6: I will be attending ICLR 2024 in Vienna, Austria.
May 1: Our work On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization has been accepted at ICML 2024.
Mar 20: I gave an invited talk on updates for Lagrange multipliers inspired by PID controllers at the Montréal Machine Learning and Optimization group hosted by Microsoft Montréal. You can check out the recording here.
Jan 16: My first last-author paper (i.e. supervisory role), Balancing Act: Constraining Disparate Impact in Sparse Models, has been accepted at ICLR 2024! 🎉

2023

Sep 27: I will be giving an invited talk on Distributed Shampoo in PyTorch at the 59th Allerton Conference in Monticello, IL, USA.
Apr 5: I gave a talk (watch here!) on training sparse neural networks with constrained optimization at the One World Seminar Series on the Mathematics of Machine Learning.
Mar 15: Thrilled to be serving as co-General Chair for the LatinX in AI workshop at ICML 2023! For more details on the calls for papers, reviewers or volunteers, please visit the LatinX@ICML2023 website.
Mar 5: I will be attending Khipu 2023 in Montevideo, Uruguay.
Jan 9: I have started a new role as a visiting researcher at Meta working on scalable adaptive optimization methods with Mike Rabbat!

2022

Dec 2: I will be presenting Cooper, our open-source library for constrained optimization in PyTorch, as a poster at the first-ever PyTorch Conference, co-located with NeurIPS 2022.
Oct 22: Our report on the opportunities and challenges of using AI towards supporting the development of sustainable cities is now available online. This report is the result of a collaboration between Mila and the United Nations 🇺🇳 Human Settlements Programme (UN-Habitat).
Sep 14: I will be presenting Controlled Sparsity via Constrained Optimization at NeurIPS 2022. See you in New Orleans!
Aug 27: Equivariant Mesh Attention Networks has been accepted for publication at TMLR!
Aug 8: Very excited to release the preprint for Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints.
July 8: I will be presenting our work L₀onie: Compressing COINs with L₀-constraints at the Sparsity in Neural Networks workshop. Check out our poster here!
May 21: The preprint for Equivariant Mesh Attention Networks is now available on arXiv. Our code is available here.
Apr 25: I'll be spending this Summer at Qualcomm Amsterdam working with Matthias Reisser and Christos Louizos on sparsity and federated learning.
Mar 15: We have released Cooper: a library for Lagrangian-based constrained optimization in PyTorch.
Jan 3: I will be TAing Ioannis Mitliagkas' graduate course on Deep Learning Theory at Mila for the third time!

2021

Dec 17: Flexible Learning of Sparse Neural Networks via Constrained L0 Regularization received the Best Poster Award at the LatinX in AI Workshop at NeurIPS 2021!
Oct 22: Flexible Learning of Sparse Neural Networks via Constrained L0 Regularization has been accepted at the LatinX in AI Workshop at NeurIPS 2021.
Sep 23: I have been awarded a Prix d’excellence en enseignement (Excellence in Teaching Award) by the University of Montréal. Announcement by UdeM (in French).
Sep 7: I will be a co-chair for the Program Committee of the LatinX in AI Workshop at NeurIPS 2021. The workshop will take place on December 6, 2021.
Jun 15: I will attend (virtually) the London Geometry and Machine Learning Summer School 2021 from 12 to 16 July.
May 21: I have been awarded an IVADO PhD Excellence Scholarship.
May 3: I'm working with Ioannis Mitliagkas as a content creator for the lectures on optimization of Neuromatch Academy's Deep Learning course. You can find our interactive notebook tutorial here.
Apr 19: Check out the slides for my introductory talk on Determinantal Point Processes at Mila's Deep Learning Theory Reading Group and as a guest lecture at Ioannis Mitliagkas' course on DL Theory.
Apr 2: Simplicial Regularization has been accepted at the ICLR 2021 Workshop on Geometrical and Topological Representation Learning.
Jan 7: I am TAing Ioannis Mitliagkas' graduate course on Deep Learning Theory at Mila.

2020

Dec 20: I have received a Doctoral Research Microsoft/Mila Diversity Award.
Oct 31: How to make your optimizer generalize better has been accepted as a contributed talk at the NeurIPS 2020 OPT Workshop on Optimization for Machine Learning.
Sep 15: Along with Manuela Girotti and Ioannis Mitliagkas, I am co-organizer of the Job Market Talks seminar at Mila. We aim to provide the members of the Mila community with valuable information about life after their graduate degree, in both the academic and industrial job markets.
Sep 1: I will TA Simon Lacoste-Julien's graduate course on Probabilistic Graphical Models at Mila for the second time. [Slides for guest lecture on Bayesian Non-Parametrics].
Aug 13: I completed my pre-doctoral presentation and officially became a PhD candidate! 🎉 Follow the links to my research proposal and slides.
Aug 3: I have been elected a student representative at Mila. Along with my fellow lab reps I will work hard to enhance Mila's student environment, and help it remain one of the best academic labs to do deep learning research in the world.
Jun 11: New preprint on arXiv studying how the generalization of overparameterized models is impacted by the geometry of certain data-dependent subspaces.
Jun 1: I am interning with Markus Nagel at Qualcomm Research in Amsterdam.
Jan 6: GAIT: A Geometric Approach to Information Theory has been accepted at AISTATS 2020!
Jan 3: I will be TAing Ioannis Mitliagkas' graduate course on Deep Learning Theory at Mila.

2019

Dec 8: [Video] I will present GAIT, our latest work on geometry aware information theory, as a contributed talk at the NeurIPS 2019 Workshop on Information Theory and Machine Learning on Dec 13.
Oct 15: This week I will attend the Workshop on Theory of Deep Learning: Where next? at the Institute for Advanced Study at Princeton, NJ.
Sep 6: Our work on geometry aware information theory will be presented as a poster at the Montréal AI Symposium.
Sep 3: I will TA Simon Lacoste-Julien's graduate course on Probabilistic Graphical Models at Mila.
July 23: I will be spending the next two weeks in Edmonton taking part in the CIFAR 2019 Deep Learning and Reinforcement Learning Summer School.

2018

Dec 3: I will be attending my first ever NeurIPS in Montréal this week!
Sep 1: I joined Mila, one of the world's largest academic labs working in DL, as a PhD student under Simon Lacoste-Julien's supervision.
Aug 24: I successfully defended my MSc thesis at the University of Amsterdam on Simplicial Autoencoders, with the invaluable guidance of Patrick Forré! 🎉

Publications

Cooper: A Library for Constrained Optimization in Deep Learning. J. Gallego-Posada, J. Ramirez, M. Hashemizadeh, S. Lacoste-Julien. arXiv preprint, 2025. [💿 code]
Feasible Learning. J. Ramirez, I. Hounie, J. Elenter, J. Gallego-Posada, M. Hashemizadeh, A. Ribeiro, S. Lacoste-Julien. AISTATS, 2025.
Constrained Optimization for Machine Learning: Algorithms and Applications. J. Gallego-Posada. PhD Thesis, 2024. [📽️ slides]
On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization. M. Sohrabi, J. Ramirez, TH. Zhang, S. Lacoste-Julien and J. Gallego-Posada. ICML, 2024. [🗣️ talk - 💿 code]
Balancing Act: Constraining Disparate Impact in Sparse Models. M. Hashemizadeh, J. Ramirez, R. Sukumaran, G. Farnadi, S. Lacoste-Julien and J. Gallego-Posada. ICLR, 2024. [💿 code]
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale. H-J. M. Shi, T.-H. Lee, S. Iwasaki, J. Gallego-Posada, Z. Li, K. Rangadurai, D. Mudigere and M. Rabbat. arXiv preprint, 2023. [💿 code - 📽️ slides]
Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints. J. Gallego-Posada, J. Ramirez, A. Erraqabi, Y. Bengio and S. Lacoste-Julien. NeurIPS, 2022. [🗣️ talk - 💿 code]
L₀onie: Compressing COINs with L₀-constraints. J. Ramirez and J. Gallego-Posada. Sparsity in Neural Networks Workshop, 2022.
Equivariant Mesh Attention Networks. S. Basu, J. Gallego-Posada, F. Viganò, J. Rowbottom and T. Cohen. TMLR, 2022.
Flexible Learning of Sparse Neural Networks via Constrained L0 Regularization. J. Gallego-Posada, J. Ramirez and A. Erraqabi. NeurIPS 2021 LatinX in AI Workshop, 2021.
Simplicial Regularization. J. Gallego-Posada and P. Forré. ICLR 2021 Workshop on Geometrical and Topological Representation Learning, 2021.
How to make your optimizer generalize better. S. Vaswani, R. Babanezhad, J. Gallego-Posada, A. Mishkin, S. Lacoste-Julien and N. Le Roux. Contributed talk at NeurIPS 2020 OPT Workshop on Optimization for Machine Learning, 2020. -- Previous version: To Each Optimizer a Norm, To Each Norm its Generalization.
GAIT: A Geometric Approach to Information Theory. J. Gallego-Posada, A. Vani, M. Schwarzer and S. Lacoste-Julien. AISTATS 2020 (Previous version presented as an oral at NeurIPS 2019 Workshop on Information Theory and Machine Learning) [🗣️ talk]
Simplicial AutoEncoders: A connection between Algebraic Topology and Probabilistic Modelling. J. Gallego-Posada. MSc Thesis, 2018.
Beyond Local Nash Equilibria for Adversarial Networks. F. Oliehoek, R. Savani, J. Gallego-Posada, E. van der Pol and R. Groß. Benelearn, 2018.
Detection and Diagnosis of Breast Tumors using Deep Convolutional Neural Networks. J. Gallego-Posada, D. Montoya, and O. Quintero. Proceedings of the XVII Latin American Conference on Automatic Control, Universidad EAFIT, 2016, pp. 11–17.
Interval Analysis and Optimization Applied to Parameter Estimation under Uncertainty. J. Gallego-Posada and M. Puerta. Boletim da Sociedade Paranaense de Matemática, vol. 36, no. 2, pp. 107-124, 2018.
Statistical Software Reliability Models. J. Gallego-Posada and F. Zuluaga. Data Analytics Applications in Latin America, 2017.

Supervision

Isabel Urrego - Undergraduate research project 2022
Daniel Otero - Undergraduate research project 2022
Juan Ramirez - Undergraduate research project 2020-2021; internship at Mila; now PhD student at Mila, University of Montréal

Academic Service

General chair for the LatinX in AI Workshop at ICML 2023
Student representative (a.k.a. LabRep) at Mila between 2020 and 2022
Program Committee chair for the LatinX in AI Workshop at NeurIPS 2021

Conference and Journal Reviewing

Teaching Assistantships

Winter 20, 21, 22 and 23: Theoretical Principles for Deep Learning by Ioannis Mitliagkas
- This is an advanced graduate class for students who want to engage in theory-driven deep learning research.
- Topics: Convex optimization, smooth games, information theory, statistical learning theory. Visit the course website for the full syllabus.
- Check out the recording of my 2020 online lecture on Reproducing Kernel Hilbert Spaces!
Fall 20 and Fall 19: Probabilistic Graphical Models by Simon Lacoste-Julien
- This course is centered around the formalism of probabilistic graphical models as a tool to encode probability distributions over numerous interacting random variables.
- Topics: Graphical models: training and inference algorithms, variational inference, exponential families, information theory. Visit the course website for the full syllabus.
- These are the slides for my 2020 and 2019 guest lectures on Bayesian Non-Parametrics: Gaussian and Dirichlet Processes.
2017: Computational Intelligence - Machine Learning by Evert Haasdijk at the Vrije Universiteit Amsterdam.