DSC 291: Numerical Linear Algebra for Data Science

TuTh 9:30am-11:00am, WLH 2114

Announcements

  • Welcome to the course!

Instructor: Alex Cloninger

  • Email: acloninger (at) ucsd (dot) edu

  • Phone: 534-4889

  • Office: AP&M 5747 (5th floor annex), SDSC 206E

  • Office hours: by appointment

Overview:

This course will cover algorithms and theory in linear algebra, with a focus on data science applications. The course assumes only familiarity with an undergraduate course in linear algebra and matrices, along with basic familiarity with Python or similar scientific programming. Topics will include: linear algebraic systems, least squares problems and regularization, orthogonalization methods, ill-conditioned problems, eigenvalue and singular value decomposition, principal component analysis, structured matrix factorization and fast algorithms, randomized linear algebra, the Johnson-Lindenstrauss (JL) lemma, and sparse approximations.
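As a rough illustration of the Python background assumed (this is only a sketch, not course material), the snippet below solves a small least squares problem using the SVD in NumPy; the matrix sizes, noise level, and variable names are arbitrary choices for the example.

    # Minimal sketch: least squares via the thin SVD, using only NumPy.
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((100, 3))                  # tall data matrix
    x_true = np.array([1.0, -2.0, 0.5])
    b = A @ x_true + 0.01 * rng.standard_normal(100)   # noisy observations

    # Thin SVD A = U S V^T gives the least squares solution x = V S^{-1} U^T b
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    x_hat = Vt.T @ ((U.T @ b) / s)

    print(x_hat)                                       # close to x_true
    print(np.linalg.lstsq(A, b, rcond=None)[0])        # built-in solver, for comparison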

A schedule of the course can be found below. There is no textbook for this course; however, several books may be useful for reference:

  • Numerical Linear Algebra, by Lloyd N. Trefethen and David Bau, III

  • Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares, by Stephen Boyd and Lieven Vandenberghe

Grades:

There will be three small projects over the course of the quarter. These are meant to build practical experience with the topics discussed in the course and to provide exposure to some of the data science applications of linear algebra. The projects will be graded on the completeness, correctness, and clarity of the notebook/writeup.

At the end of the quarter, students will be asked to form small groups, delve deeper into one project of interest, draw connections to existing research, and give a short presentation to the class.

Project topics:

Schedule:

  • September 23: Introduction, Matrix Vector Multiplication, Inner Products

  • September 28: Orthogonal Vectors, Norms, Inequalities

  • September 30: Singular Value Decomposition

  • October 5: Principal Component Analysis (also see Project 1)

  • October 7: PSD Matrices, Cholesky Decomposition, Forward/Back Substitution

  • October 12: Banded Matrices, QR Factorization

  • October 14: Regression, Computing SVD

  • October 19: Conditioning and Stability

  • October 21: Regularized Regression

  • October 26: Kernel Regression and Bias-Variance

  • October 28: Kernel QR and Eigenvalue Methods

  • November 2: Graphs and Graph Embeddings

  • November 4: Graph Embeddings and Cuts

  • November 9: Iterative Algorithms

  • November 11: No Class for Veterans Day

  • November 16: Gradient Descent and NTK

  • November 18: