Abstract (Part I). I will mainly discuss the paper "A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI." I begin by reviewing the fundamentals of multimodal learning, including CLIP pre-training, zero-shot classification, conditional diffusion, and next-word prediction. I then focus on the paper's Section 3, which introduces the concept of approximate sufficient statistics, a generalization of classical sufficient statistics, and shows that near-minimizers of the contrastive pre-training loss are approximately sufficient, making them adaptable to diverse downstream tasks.
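Since the talk centers on near-minimizers of the contrastive pre-training loss, a minimal numerical sketch of a CLIP-style symmetric contrastive (InfoNCE) loss may help fix notation. This is an illustrative sketch, not code from the paper; the function names and the temperature value are assumptions.

```python
import numpy as np

def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss on a batch of paired embeddings.

    Row i of img_emb and row i of txt_emb are assumed to be a matching
    pair; all other rows in the batch serve as negatives.
    """
    # Normalize so that inner products are cosine similarities.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # scaled similarity matrix
    diag = np.arange(logits.shape[0])
    # Cross-entropy in both directions: image-to-text and text-to-image,
    # with the matching pair on the diagonal as the correct class.
    i2t = -log_softmax(logits, axis=1)[diag, diag].mean()
    t2i = -log_softmax(logits, axis=0)[diag, diag].mean()
    return 0.5 * (i2t + t2i)
```

Matched embedding pairs drive this loss toward zero, while mismatched pairs keep it near log(batch size); the paper's Section 3 studies what representations achieving a near-minimal value of such a loss must retain.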
		 
Abstract (Part II). Continuation of last week's talk.
		
HDSC Seminar, Spring 2025
Organizer: Michael Lindsey
Meeting details: Thursday 11-12, Evans 736
	Description
	Welcome to an informal seminar on high-dimensional scientific computing (HDSC). We will investigate paradigms for HDSC 
	including tensor networks, Monte Carlo methods, semidefinite programming relaxations, graphical models, neural networks, and more, as well as tools from numerical 
	linear algebra and optimization.
	Past semesters: [Fall 2023] [Spring 2024]
	Schedule
January 30
Speaker: Yuhang Cai [home page]
Topic: A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI (Part I)
January 30
Speaker: Yuhang Cai [home page]
Topic: A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI (Part II)