David Holzmüller

TabTalk #2: TabICLv2: A better, faster, scalable, and open tabular foundation model

Tabular foundation models, such as TabPFNv2 and TabICL, have recently dethroned gradient-boosted trees at the top of predictive benchmarks, demonstrating the value of in-context learning for tabular data. We introduce TabICLv2, a new state-of-the-art foundation model for regression and classification built on three pillars:

1) a novel synthetic data generation engine designed for high pretraining diversity;

2) various architectural innovations, including a new scalable softmax in attention improving generalization to larger datasets without prohibitive long-sequence pretraining; and

3) optimized pretraining protocols, notably replacing AdamW with the Muon optimizer.

On the TabArena and TALENT benchmarks, TabICLv2 without any tuning surpasses the performance of the current state of the art, RealTabPFN-2.5 (hyperparameter-tuned, ensembled, and fine-tuned on real data). With only moderate pretraining compute, TabICLv2 generalizes effectively to million-scale datasets with under 50 GB of GPU memory while being markedly faster than RealTabPFN-2.5.

About the Speaker:

David Holzmüller is a researcher at INRIA Saclay (SODA team), previously a postdoc at INRIA Paris co-advised by Francis Bach and Gaël Varoquaux. He holds a PhD from the University of Stuttgart (supervised by Ingo Steinwart) and is one of the leading academic voices on ML for tabular data. He co-authored TabICL and TabICLv2, RealMLP, TabArena, and xRFM (interpretable feature learning for tabular data).

His website: https://dholzmueller.github.io/

About the format

TabTalks are a recurring series designed to support the tabular AI and time series research community. We bring together researchers, students, and faculty to hear a guest speaker share their work during a 45-minute presentation, followed by a 10–15 minute Q&A. Contact us if you wish to present!

April 16, 2026 11:00 AM

These sessions are aimed at nurturing the research community, so please expect a high level of technical detail.