Mathematics and Computer Science Division
Argonne National Laboratory
"Large-Scale Numerical Linear Algebra Techniques for Big Data Analysis"
As the term "big data" appears ever more frequently in daily life and research, it changes our sense of how large data sets can be and challenges the use of numerical analysis in statistical calculations. In this talk, I will focus on two basic statistics problems, sampling from a multivariate normal distribution and maximum likelihood estimation, and illustrate the scalability issues that dense numerical linear algebra techniques face. This large-scale challenge motivates the development of scalable methods for dense matrices, which often arise in data analysis. I will present several recent developments in the computation of matrix functions and in the solution of linear systems of equations in which the matrices are large, fully dense, but structured. The driving ideas behind these developments are to exploit the structure and to use fast matrix-vector multiplications, reducing the quadratic storage cost and cubic computational cost of general dense methods. "Big data" provides a fresh opportunity for numerical analysts to develop algorithms with scalability as a central goal. Scalable algorithms are key to convincing statisticians and practitioners to apply powerful statistical theories to large-scale data that they currently feel uncomfortable handling.
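
To make the role of fast matrix-vector multiplication concrete, here is a minimal sketch under stated assumptions. The abstract does not say which matrix structures the talk covers. This example assumes one common case, a symmetric Toeplitz covariance matrix, such as arises from a stationary kernel on a regular one-dimensional grid. The function names `toeplitz_matvec` and `conjugate_gradient`, the exponential kernel, and the problem size are all hypothetical choices made for illustration, not the speaker's method. The sketch solves a linear system with this fully dense matrix using only matrix-vector products, so the matrix is never formed or factored.

```python
import numpy as np


def toeplitz_matvec(c, x):
    """Multiply the symmetric Toeplitz matrix with first column c by x.

    The matrix is embedded in a circulant matrix of size 2n - 1, whose
    product with a vector is a circular convolution computed by FFT.
    Cost: O(n log n) time and O(n) storage, versus O(n^2) for both
    when the dense matrix is formed explicitly.
    """
    n = len(c)
    m = 2 * n - 1
    # First column of the embedding circulant: c_0..c_{n-1}, c_{n-1}..c_1.
    circ = np.concatenate([c, c[-1:0:-1]])
    x_pad = np.concatenate([x, np.zeros(n - 1)])
    y = np.fft.irfft(np.fft.rfft(circ) * np.fft.rfft(x_pad), n=m)
    return y[:n]


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only matvec."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A x, with x = 0 initially
    p = r.copy()          # search direction
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


# Assumed test problem: exponential covariance on a regular grid.
n = 200
c = np.exp(-np.arange(n) / 10.0)   # first column of the covariance matrix
rng = np.random.default_rng(0)
b = rng.standard_normal(n)

x = conjugate_gradient(lambda v: toeplitz_matvec(c, v), b)
```

A dense Cholesky factorization of the same matrix would need O(n^2) storage and O(n^3) operations. Here each iteration costs one FFT-based product, which is the kind of saving the abstract refers to, although the structures and algorithms presented in the talk may differ.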