Abstract. When a transformer with fixed weights is viewed as an interacting particle system, the particles, which represent tokens, tend to cluster toward particular limiting objects as time tends to infinity. Using techniques from dynamical systems and PDEs, one can show that the type of limiting object depends on the spectrum of the value matrix.
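As a concrete illustration (not taken from the talk), here is a minimal numpy simulation of one standard model of self-attention dynamics with fixed weights, in which each token is attracted to a softmax-weighted average of the others. The choices Q = K = V = I, the step size, and the spherical normalization are illustrative assumptions, not the talk's setup.

```python
import numpy as np

# Tokens x_i in R^d evolve by dx_i/dt = sum_j softmax_j(<Q x_i, K x_j>) V x_j.
rng = np.random.default_rng(0)
n, d = 16, 2                        # number of tokens (particles), dimension
Q = K = np.eye(d)                   # identity query/key maps for simplicity
V = np.eye(d)                       # identity value matrix (positive spectrum)
X = rng.standard_normal((n, d))     # initial token positions

dt, steps = 0.05, 4000
for _ in range(steps):
    logits = (X @ Q.T) @ (X @ K.T).T            # <Q x_i, K x_j>
    logits -= logits.max(axis=1, keepdims=True)
    A = np.exp(logits)
    A /= A.sum(axis=1, keepdims=True)           # row-wise softmax
    X += dt * A @ X @ V.T                       # forward Euler step
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # keep tokens on the sphere

# With V = I, the tokens typically collapse toward a single cluster.
print("pairwise spread:", np.ptp(X, axis=0))
```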
		 
Abstract. Last semester there was a fair amount of coverage of quantized tensor trains (QTTs), both in this seminar and in the Applied Math Seminar. I will share what I have since learned about QTTs, along with some thoughts and questions regarding their future in numerical analysis.
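For readers new to QTTs, the sketch below illustrates the basic idea under simple assumptions: a function sampled at 2^n grid points is reshaped into an n-way tensor with binary modes (one "quantized" bit per axis) and compressed by the standard TT-SVD sweep. The test function and truncation tolerance are arbitrary choices.

```python
import numpy as np

# View a length-2**n sample vector as an n-way tensor with mode sizes 2,
# then compress it core by core with truncated SVDs (TT-SVD). We use numpy's
# C-order reshape, so the first bit is the most significant.
n, tol = 12, 1e-10
x = np.linspace(0.0, 1.0, 2**n)
v = np.exp(-3.0 * x) * np.sin(40.0 * x)

cores, r, mat = [], 1, v.copy()
for k in range(n - 1):
    mat = mat.reshape(r * 2, -1)                  # unfold: (rank * mode) x rest
    U, s, Vt = np.linalg.svd(mat, full_matrices=False)
    keep = max(1, int(np.sum(s > tol * s[0])))    # truncate small singular values
    cores.append(U[:, :keep].reshape(r, 2, keep))
    mat = s[:keep, None] * Vt[:keep]              # carry the remainder rightward
    r = keep
cores.append(mat.reshape(r, 2, 1))

print("QTT ranks:", [c.shape[2] for c in cores[:-1]])

# Reconstruct and check the compression error.
w = np.ones((1, 1))
for c in cores:
    w = np.tensordot(w, c, axes=[[-1], [0]])      # contract ranks left to right
w = w.reshape(-1)
print("relative error:", np.linalg.norm(w - v) / np.linalg.norm(v))
```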
		 
		Abstract. Review of this paper and this follow-up.
		 
		Abstract. Review of this paper.
		 
		Abstract. Low-rank approximation of symmetric positive semidefinite matrices based on column subset 
		selection can enable efficient algorithms for matrix sketching, experimental design, and reduced-order modeling. 
		Determinantal point processes (DPPs) yield theoretically rigorous bounds for the worst-case optimal performance of 
		such approximations. Proofs of relevant bounds have a rich (and long) history, relating to such topics as elementary 
		symmetric polynomials, real algebraic geometry, Schur convexity, and random matrix theory. Besides the theory of 
		its worst-case performance, DPP sampling has also been the subject of numerous algorithmic implementations in recent 
		years. In this seminar, I will give a brief introduction to some applications, underlying theory, and algorithms 
		related to DPPs in low-rank approximation.
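As a hedged illustration of the setup described above, the sketch below samples a k-DPP over the columns of a PSD matrix using the standard two-phase eigendecomposition sampler (in the style of Kulesza and Taskar) and forms the corresponding column Nyström approximation. The test matrix and the value of k are illustrative, and practical implementations use considerably faster algorithms.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_kdpp(L, k):
    """Draw a size-k subset from the k-DPP with PSD kernel L."""
    lam, V = np.linalg.eigh(L)
    lam = np.clip(lam, 0.0, None)                 # guard tiny negative roundoff
    n = len(lam)
    # Elementary symmetric polynomials e[j, m] over the first m eigenvalues.
    e = np.zeros((k + 1, n + 1))
    e[0] = 1.0
    for j in range(1, k + 1):
        for m in range(1, n + 1):
            e[j, m] = e[j, m - 1] + lam[m - 1] * e[j - 1, m - 1]
    # Phase 1: choose k eigenvectors with the correct marginals.
    idx, j = [], k
    for m in range(n, 0, -1):
        if j == 0:
            break
        if rng.random() < lam[m - 1] * e[j - 1, m - 1] / e[j, m]:
            idx.append(m - 1)
            j -= 1
    W = V[:, idx]
    # Phase 2: draw items one at a time, shrinking the subspace.
    sample = []
    while W.shape[1] > 0:
        p = (W ** 2).sum(axis=1)
        i = rng.choice(n, p=p / p.sum())
        sample.append(i)
        c = np.argmax(np.abs(W[i]))               # column with mass at item i
        W = W - np.outer(W[:, c], W[i] / W[i, c]) # zero out row i
        W = np.delete(W, c, axis=1)
        if W.shape[1] > 0:
            W, _ = np.linalg.qr(W)                # re-orthonormalize
    return sorted(sample)

# PSD test matrix with decaying spectrum, and a column Nystrom approximation.
n, k = 60, 6
G = rng.standard_normal((n, n))
A = G @ np.diag(0.5 ** np.arange(n)) @ G.T

S = sample_kdpp(A, k)
C, Wss = A[:, S], A[np.ix_(S, S)]
A_hat = C @ np.linalg.pinv(Wss) @ C.T
print("columns:", S)
print("relative Frobenius error:", np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```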
		 
		Abstract. Review of the theory of mirror descent following Yuhang's notes.
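As one standard worked example of mirror descent (not necessarily the route taken in the notes), the sketch below minimizes a quadratic over the probability simplex with the entropy mirror map, for which the mirror descent update reduces to exponentiated gradient. The objective, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
B = rng.standard_normal((n, n))
A = B.T @ B + np.eye(n)             # positive definite quadratic term
b = rng.standard_normal(n)
grad = lambda x: A @ x - b          # gradient of f(x) = x'Ax/2 - b'x

x = np.full(n, 1.0 / n)             # start at the simplex barycenter
eta = 0.1
for _ in range(2000):
    x = x * np.exp(-eta * grad(x))  # gradient step in the dual (log) space
    x /= x.sum()                    # normalization = Bregman projection

print("minimizer over the simplex:", np.round(x, 4))
print("objective value:", 0.5 * x @ A @ x - b @ x)
```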
		 
Abstract. Classical shadows provide an efficient method for constructing an approximate classical description of an unknown quantum state from very few measurements. The method learns a minimal classical sketch of the state, the classical shadow, which can be used to predict arbitrary state properties via a simple median-of-means protocol. Importantly, it achieves optimal sample complexity, ensuring accurate prediction of many functions of a quantum state with high success probability.
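A minimal single-qubit simulation of the protocol, under illustrative choices of state, observable, and sample counts: random Pauli-basis measurements produce snapshots via the single-qubit inverse channel rho_hat = 3 U^dag |b><b| U - I, and the observable is estimated by median of means.

```python
import numpy as np

rng = np.random.default_rng(3)

I2 = np.eye(2, dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
Sdg = np.diag([1, -1j])             # S^dagger
U_BASIS = [H, H @ Sdg, I2]          # rotate X-, Y-, Z-basis to computational

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(plus, plus.conj())   # unknown state |+><+|
X = np.array([[0, 1], [1, 0]], dtype=complex)   # observable; exact <X> = 1

def snapshot():
    """One random Pauli measurement, mapped through the inverse channel."""
    U = U_BASIS[rng.integers(3)]
    probs = np.real(np.diag(U @ rho @ U.conj().T))    # Born rule
    b = rng.choice(2, p=probs / probs.sum())
    e = np.zeros(2, dtype=complex); e[b] = 1.0
    return 3 * (U.conj().T @ np.outer(e, e.conj()) @ U) - I2

# Median of means over K batches of m snapshots each.
K, m = 10, 200
means = [np.mean([np.real(np.trace(X @ snapshot())) for _ in range(m)])
         for _ in range(K)]
print("estimate of <X>:", np.median(means), "(exact: 1.0)")
```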
		 
Abstract. Neural Networks (NNs) have been used successfully in various applications despite the lack of a complete understanding of the learning process. Polynomial Neural Networks (PNNs) are a minimal model architecture that allows us, with the help of algebraic geometry, to shine some light into this black box. In this talk, we will look at NNs from a mathematical perspective and see how PNNs are connected to different symmetric tensor decompositions. We will show how the inherent structure and weights of PNNs remember the geometry and symmetries of learned models. At the end, we will touch on possible applications to other NN architectures (CNNs, PINNs, binary NNs, LLMs, etc.).
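To make the tensor connection concrete in the simplest (degree-2) case, the sketch below checks that a width-k network with squaring activation computes exactly the quadratic form of the rank-k symmetric matrix assembled from its weights; the dimensions are arbitrary illustrative choices.

```python
import numpy as np

# A one-hidden-layer PNN f(x) = sum_i c_i (w_i^T x)^2 computes x^T M x with
# M = sum_i c_i w_i w_i^T, so its weights encode a rank-k symmetric
# decomposition of M. Higher degrees replace M by a symmetric order-r tensor.
rng = np.random.default_rng(4)
d, k = 4, 3
W = rng.standard_normal((k, d))     # inner weights w_i (rows)
c = rng.standard_normal(k)          # outer weights c_i

def pnn(x):                         # activation t -> t**2
    return c @ (W @ x) ** 2

M = (W.T * c) @ W                   # sum_i c_i w_i w_i^T
x = rng.standard_normal(d)
print(pnn(x), x @ M @ x)            # identical up to rounding
```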
		 
Abstract. For least-squares regression in the interpolation regime, the loss converges exponentially, and there exists a critical batch size n such that an SGD iteration with mini-batch size m < n is nearly equivalent to m iterations with mini-batch size 1, while for m > n an SGD iteration is nearly equivalent to a full gradient descent iteration.
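A small numerical illustration of the claim, with illustrative problem sizes and step sizes: noiseless overparametrized least squares (so interpolation holds), comparing runs that process the same number of samples at batch size 1 and at batch size 8. Below the critical batch size, one would expect comparable loss per sample processed.

```python
import numpy as np

rng = np.random.default_rng(5)
N, d = 50, 200                      # overparametrized: interpolation holds
X = rng.standard_normal((N, d)) / np.sqrt(d)
w_star = rng.standard_normal(d)
y = X @ w_star                      # zero noise, so the loss can reach 0

loss = lambda w: 0.5 * np.mean((X @ w - y) ** 2)

def sgd(m, eta, steps, seed):
    """Run SGD with mini-batch size m; return the final loss."""
    rng_ = np.random.default_rng(seed)
    w = np.zeros(d)
    for _ in range(steps):
        idx = rng_.integers(N, size=m)
        g = X[idx].T @ (X[idx] @ w - y[idx]) / m   # mini-batch gradient
        w -= eta * g
    return loss(w)

# Equal sample budgets: 4000 steps at batch 1 vs 500 steps at batch 8.
print("batch 1, 4000 steps:", sgd(1, 0.5, 4000, 7))
print("batch 8,  500 steps:", sgd(8, 0.5, 500, 7))
```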
		
HDSC Seminar, Spring 2024
	Meeting details: Thursday 1-2, Evans 1015
	Organizer: Michael Lindsey
	Description
	Welcome to an informal seminar on high-dimensional scientific computing (HDSC). We will investigate paradigms for HDSC 
	including tensor networks, Monte Carlo methods, semidefinite programming relaxations, graphical models, neural networks, and more, as well as tools from numerical 
	linear algebra and optimization.
	Past semesters: [Fall 2023]
	Schedule
     February 1 
		
 Speaker: Yuhang Cai [home page] 
 Topic: Clustering in self-attention dynamics  
	 February 8 
		
 Speaker: Michael Lindsey [home page] 
 Topic: What I learned about QTTs last semester
	  
	 February 15 
		
 Speaker: Michael Kielstra [home page] 
 Topic: Approximation by exponential sums  
	 February 22 
		
 Speaker: Kevin Stubbs [home page] 
 Topic: Gauging tensor networks with belief propagation  
	 February 29 
		
 Speaker: Mark Fornace [home page] 
 Topic: Determinantal point processes in low-rank approximation  
		Selected references:
		
		Derezinski, Michał, and Michael W. Mahoney. "Determinantal point processes in randomized numerical linear algebra." Notices of the American Mathematical Society 68.1 (2021): 34-45. [link]
		
		Guruswami, Venkatesan, and Ali Kemal Sinop. "Optimal column-based low-rank matrix reconstruction." Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 2012. [link]
		
	 March 7 
		
 Speaker: Mark Fornace [home page] 
 Topic: Determinantal point processes in low-rank approximation (continued)  
		Selected references:
		
		Derezinski, Michał, and Michael W. Mahoney. "Determinantal point processes in randomized numerical linear algebra." Notices of the American Mathematical Society 68.1 (2021): 34-45. [link]
		
		Guruswami, Venkatesan, and Ali Kemal Sinop. "Optimal column-based low-rank matrix reconstruction." Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 2012. [link]
		
	 March 14 
		
 Speaker: Yuhang Cai [home page] 
 Topic: Review of mirror descent  
	 April 4 
		
 Speaker: Yizhi Shen 
 Topic: Classical shadow for predicting quantum state properties  
	 April 11 
		
 Speaker: Yizhi Shen 
 Topic: Classical shadow for predicting quantum state properties (continued)  
	 April 25 
		
 Speaker: Max Zubkov [home page] 
 Topic: Geometry of Polynomial Neural Networks, Tensors, and Beyond  
	 May 2 
		
 Speaker: Yuhang Cai [home page] 
 Topic: Optimal batchsize and stepsize of SGD in the interpolation regime  
	Sample topics to present
	In no particular order.

	Machine learning
	Matrix sketching
	Group synchronization
	Tensor networks
	Sampling with sign problems
	Potpourri