HDSC Seminar, Fall 2023

**Abstract.** I will introduce the concept of a tensor network and give an overview of some areas of application.
Then I will introduce the most widely used tensor network format, the matrix product state (MPS), also known as the tensor train (TT).
I will explain some of the key operations within the MPS format.
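As a warm-up for the talk, here is a minimal sketch of the MPS/TT format: a d-way tensor is stored as a chain of 3-way "cores" G_k of shape (r_{k-1}, n_k, r_k) with boundary ranks r_0 = r_d = 1, and any single entry is a product of small matrices. All names and sizes below are illustrative.

```python
import numpy as np

# Four random MPS cores for a 3x3x3x3 tensor with internal rank 2.
rng = np.random.default_rng(0)
n, r = 3, 2
cores = [
    rng.standard_normal((1, n, r)),
    rng.standard_normal((r, n, r)),
    rng.standard_normal((r, n, r)),
    rng.standard_normal((r, n, 1)),
]

def tt_entry(cores, idx):
    """T[i1,...,id] = G_1[:, i1, :] @ G_2[:, i2, :] @ ... @ G_d[:, id, :]."""
    v = np.ones((1, 1))
    for G, i in zip(cores, idx):
        v = v @ G[:, i, :]
    return v[0, 0]

# Sanity check against the fully materialized tensor.
full = np.einsum('aib,bjc,ckd,dle->ijkl', *cores)
print(np.isclose(tt_entry(cores, (1, 0, 2, 1)), full[1, 0, 2, 1]))  # True
```

The point of the format is that the cores hold O(d n r^2) numbers, while the full tensor holds n^d.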

**Abstract.**
Given black-box access to a tensor, we seek an efficient algorithm to compute a matrix product state
(MPS, also known as a tensor train) that approximates it with high accuracy, preferably without evaluating
the whole tensor. The most popular schemes for this problem are heuristics that hold all but one MPS core
constant while optimizing the remaining core according to some objective. The first part of this talk covers
the TT-cross algorithm, a generalization of the greedy matrix cross approximation strategy to the tensor
case. While this algorithm has few guarantees, it performs exceedingly well on tensors that admit a
low-error MPS approximation. The second part covers strategies based on alternating least squares (ALS), which
drive down the L2 error iteratively by solving a sequence of linear least-squares problems involving the
MPS cores. These algorithms have slightly stronger guarantees, but require more sophisticated mathematical
tools (such as statistical leverage score sampling) to avoid exponentially high computation costs. If time
permits, animations of these two algorithms will be shown on a simple toy problem.
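A feel for the 2-D building block mentioned in the abstract — greedy matrix cross approximation — can be had in a few lines. This sketch uses full pivoting (each step adds the row and column through the largest residual entry as a rank-1 update); it is illustrative only and is not the TT-cross algorithm itself.

```python
import numpy as np

def greedy_cross(A, rank):
    """Greedy cross approximation of a matrix with full pivoting."""
    R = A.astype(float).copy()
    approx = np.zeros_like(R)
    for _ in range(rank):
        i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)
        if R[i, j] == 0.0:
            break
        update = np.outer(R[:, j], R[i, :]) / R[i, j]  # cross through pivot (i, j)
        approx += update
        R -= update
    return approx

rng = np.random.default_rng(1)
B = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))  # exact rank 3
err = np.linalg.norm(B - greedy_cross(B, 3)) / np.linalg.norm(B)
print(err < 1e-10)  # a rank-3 matrix is recovered exactly, up to round-off
```

Each step touches only one row and one column of the residual's cross, which is the property TT-cross exploits to avoid evaluating the whole tensor.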

**Abstract.** Review of this paper.

**Abstract.**
I will introduce the dynamic mode decomposition (DMD) in the context of fluid analysis. More generally, we will see how DMD
enables data-driven characterization of the dominant modes of complex dynamical systems. Lastly, we explore the use of DMD to
predict the evolution of nonlinear PDEs.
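The core computation is compact enough to sketch. Assuming snapshot matrices X and Y with Y[:, k] = F(X[:, k]), the standard exact-DMD recipe projects the best-fit linear operator onto the leading left singular vectors of X; all names below are illustrative.

```python
import numpy as np

def dmd_eigs(X, Y, r):
    """DMD eigenvalues from snapshot pairs, truncated to rank r."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    Atilde = U.conj().T @ Y @ Vh.conj().T / s      # projected operator
    return np.linalg.eigvals(Atilde)

# Toy check: data from a linear map x_{k+1} = A x_k recovers A's spectrum.
rng = np.random.default_rng(2)
A = np.diag([0.9, 0.5])
X = rng.standard_normal((2, 20))
Y = A @ X
print(np.sort(dmd_eigs(X, Y, r=2).real))  # ≈ [0.5, 0.9]
```

On genuinely nonlinear data the eigenvalues instead characterize the dominant modes of the best linear surrogate, which is the sense in which DMD "predicts" nonlinear evolution.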

**Abstract.** Suppose we want to draw equilibrium samples from some probability density \rho(x)
--- potentially multimodal and difficult to sample from. Stochastic normalizing flows define a method starting
with samples from a different distribution \rho_0 --- typically a Gaussian --- and learning a drift field that
"flows" \rho_0 to the target \rho over a finite number of Langevin steps. This method produces unbiased samples
and can speed up sampling by a few orders of magnitude.
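The Langevin building block alone can be sketched in a few lines. This pushes Gaussian samples toward a 1-D target \rho(x) \propto \exp(-U(x)); a stochastic normalizing flow additionally *learns* a drift field between such steps, and that learned component is omitted here. All names and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def grad_U(x):                     # target: N(4, 1), so U(x) = (x - 4)^2 / 2
    return x - 4.0

x = rng.standard_normal(10_000)    # samples from \rho_0 = N(0, 1)
eps = 0.05
for _ in range(500):               # x <- x - eps * grad_U(x) + sqrt(2 eps) * xi
    x = x - eps * grad_U(x) + np.sqrt(2 * eps) * rng.standard_normal(x.shape)

print(abs(x.mean() - 4.0) < 0.1)  # the samples have flowed to the target mean
```

For an easy unimodal target like this, plain Langevin suffices; the learned drift is what buys the claimed speed-up on multimodal targets.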

**Abstract.** Convex potential flows (CP-flows) form an efficient parameterization of invertible maps
for generative modeling, inspired by optimal transport (OT) theory. A CP-flow is the gradient map of a strongly
convex potential function. Maximum likelihood estimation is enabled by a specialized estimator of the gradient
of the log-determinant of the Jacobian. Theoretically, CP-flows are universal density approximators and are optimal in the OT sense. Empirical results also show that CP-flows perform
competitively on standard benchmarks for density estimation and variational inference.
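A toy example illustrates the central idea: the gradient map of a strongly convex potential is invertible. Here phi(x) = 0.5 x^T A x with A symmetric positive definite, so grad phi(x) = A x; a real CP-flow replaces phi by an input-convex neural network, so everything below is purely illustrative.

```python
import numpy as np

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # SPD, hence phi is strongly convex
x = np.array([1.0, -2.0])
y = A @ x                           # forward map y = grad phi(x)
x_back = np.linalg.solve(A, y)      # inverse map recovers x uniquely
logdet = np.linalg.slogdet(A)[1]    # log|det Jacobian|, used in the likelihood
print(np.allclose(x_back, x))       # True
```

For a neural potential the Jacobian of the gradient map is the (dense) Hessian of phi, which is why the paper needs a specialized stochastic estimator for its log-determinant rather than an exact solve.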

*Happy Halloween!*

**Abstract.** Review of the paper Hyper-optimized tensor network contraction.

**Abstract.** Review of this paper.

**Abstract.** In a nutshell, Discontinuous Galerkin (DG) methods combine the ideas of Finite Elements and Finite Volumes to
address the respective shortcomings of the two methods. Since their introduction in the 1970s, however, DG methods have remained
largely an academic endeavour and have seen very little use in industry. In this talk I will briefly describe
the DG method and discuss some of the challenges limiting its use. I will also introduce the Pseudospectral DG
method which attempts to address some of these issues, and discuss the possibility of applying such methods to
problems arising in quantum chemistry.

**Abstract.** In recent years, particle-based variational inference (ParVI) methods such as Stein
variational gradient descent (SVGD) have grown in popularity as scalable methods for sampling from unnormalised
probability distributions. Unfortunately, the properties of such methods invariably depend on hyperparameters
such as the learning rate, which must be carefully tuned by the practitioner in order to ensure convergence to
the target measure at a suitable rate. In this work, we introduce coin sampling, a new particle-based method for
sampling based on coin betting, which is entirely learning-rate free. We illustrate the performance of our
approach on a range of numerical examples, including several high-dimensional models and datasets, demonstrating
comparable performance to other ParVI algorithms with no need to tune a learning rate.
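For context, here is a sketch of SVGD, the ParVI baseline named in the abstract (not coin sampling itself). Particles follow a kernelized gradient flow toward a 1-D target N(2, 1); the step size `lr` is exactly the hyperparameter that coin sampling eliminates. Names and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def grad_log_p(x):                     # score of the target N(2, 1)
    return -(x - 2.0)

x = rng.standard_normal(200)           # initial particles from N(0, 1)
h, lr = 0.5, 0.1
for _ in range(1000):
    diff = x[:, None] - x[None, :]
    K = np.exp(-diff**2 / (2 * h))                 # RBF kernel matrix
    # SVGD update: kernel-weighted score (attraction) + kernel gradient (repulsion)
    phi = (K @ grad_log_p(x) + (diff * K).sum(axis=1) / h) / len(x)
    x = x + lr * phi
print(abs(x.mean() - 2.0) < 0.2)
```

Too large an `lr` makes the update diverge and too small an `lr` stalls convergence, which is the tuning burden the coin-betting formulation removes.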

- Stochastic Optimal Control for CV Free Sampling of Molecular Transition Paths
- Equispaced Fourier representations for efficient GPR
- Structured matrix recovery from matvecs: Lin et al, Halikias and Townsend
- Kernel Interpolation with Sparse Grids
- A DEIM Induced CUR Factorization
- See also the references and discussion in these slides
- Randomly pivoted Cholesky and XTrace
- Belief propagation background
- Belief propagation for tensor networks
- Duality of Graphical Models and Tensor Networks
- Block Belief Propagation Algorithm for 2D Tensor Networks
- Gauging tensor networks with belief propagation
- General tensor network contraction
- Hyper-optimized tensor network contraction
- Hyper-optimized compressed contraction of tensor networks with arbitrary geometry
- Contracting Arbitrary Tensor Networks
- Arithmetic circuit tensor networks
- QTT for integral equations
- Neural networks and Gaussian processes
- Deep Neural Networks as Gaussian Processes
- Neural tangent kernel
- Wide Bayesian neural networks have a simple weight posterior
- Generative modeling
- Building Normalizing Flows with Stochastic Interpolants
- Stochastic Interpolants: A Unifying Framework for Flows and Diffusions
- Invertible Residual Networks
- Convex Potential Flows
- Crash course in pseudospectral methods, i.e., Ch. 4-5 of Boyd's book
- Adaptive multigrid
- Sampling for lattice gauge theories, cf. Gattringer and Lang and talk to me
- Complex Langevin, cf. this review and Sec. 8 of this review