HDSC Seminar, Spring 2025
Organizer: Michael Lindsey
Meeting details: Thursday 11-12, Evans 736
Description
Welcome to an informal seminar on high-dimensional scientific computing (HDSC). We will investigate paradigms for HDSC
including tensor networks, Monte Carlo methods, semidefinite programming relaxations, graphical models, neural networks, and more, as well as tools from numerical
linear algebra and optimization.
Past semesters: [Fall 2023] [Spring 2024]
Schedule
January 30
Speaker: Yuhang Cai [home page]
Topic:
A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI (Part I)
Abstract. This talk is mainly based on the paper of the same title. I begin by reviewing the fundamentals of multimodal learning, including CLIP pre-training, zero-shot classification, conditional diffusion, and next-word prediction. I then focus on the paper's Section 3, which introduces the concept of approximate sufficient statistics, a generalization of classical sufficient statistics, and shows that near-minimizers of the contrastive pre-training loss are approximately sufficient, making them adaptable to diverse downstream tasks.
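For reference, here is a sketch of the symmetric contrastive (InfoNCE-style) objective used in CLIP pre-training; the notation (encoders $f$, $g$, temperature $\tau$) is illustrative and may differ from the paper's. Given $N$ paired samples $(x_i, y_i)$, let $s_{ij} = f(x_i)^\top g(y_j)/\tau$. The pre-training loss is
\[
  \mathcal{L}(f,g) \;=\; -\frac{1}{2N}\sum_{i=1}^{N}
  \left[
    \log \frac{e^{s_{ii}}}{\sum_{j=1}^{N} e^{s_{ij}}}
    \;+\;
    \log \frac{e^{s_{ii}}}{\sum_{j=1}^{N} e^{s_{ji}}}
  \right],
\]
which scores each matched pair against the in-batch mismatches in both directions. The talk's claim concerns representations that nearly minimize this loss.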
February 6
Speaker: Yuhang Cai [home page]
Topic:
A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI (Part II)
Abstract. A continuation of the previous week's talk.