Principal Component Analysis (PCA) is frequently applied in machine learning as a sort of black-box dimensionality reduction technique. PCA can also be arrived at as an expression of a best-fit probability distribution for our data. Treating PCA as a probability distribution opens up all sorts of fruitful avenues: we can draw new examples from the learned distribution and/or evaluate the likelihood of samples as we observe them to detect outliers.
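As a rough illustration of the idea, here is a minimal sketch that assumes the distribution in question is the Gaussian of probabilistic PCA, fitted in closed form. The function names (`fit_ppca`, `log_likelihood`, `sample_ppca`) and the synthetic data are made up for this example and are not taken from the post.

```python
import numpy as np

def fit_ppca(X, n_components):
    """Closed-form maximum-likelihood fit of a probabilistic PCA model.

    Assumes n_components is smaller than the number of features, so that
    there are discarded eigenvalues to estimate the noise variance from.
    """
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False, bias=True)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # sort eigenpairs, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[n_components:].mean()     # isotropic noise variance
    scale = np.sqrt(np.maximum(eigvals[:n_components] - sigma2, 0.0))
    W = eigvecs[:, :n_components] * scale      # factor loadings
    return mu, W, sigma2

def log_likelihood(X, mu, W, sigma2):
    """Per-row Gaussian log-likelihood under covariance W W^T + sigma2 I."""
    d = mu.shape[0]
    C = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(C)
    diff = X - mu
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(C, diff.T).T)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def sample_ppca(n, mu, W, sigma2, rng):
    """Draw n new examples from the fitted distribution."""
    z = rng.standard_normal((n, W.shape[1]))
    noise = rng.standard_normal((n, mu.shape[0]))
    return mu + z @ W.T + np.sqrt(sigma2) * noise

# Synthetic stand-in data: 2 latent factors observed in 5 dimensions.
rng = np.random.default_rng(0)
latent = rng.standard_normal((500, 2))
mixing = rng.standard_normal((2, 5))
X = latent @ mixing + 0.1 * rng.standard_normal((500, 5))

mu, W, sigma2 = fit_ppca(X, n_components=2)
samples = sample_ppca(10, mu, W, sigma2, rng)
inlier_ll = log_likelihood(X, mu, W, sigma2)
outlier_ll = log_likelihood(mu[None, :] + 50.0, mu, W, sigma2)
```

A point far from the training data receives a much lower log-likelihood than any training point, which is the property an outlier detector would threshold on.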

Intermezzo: Sparsely Observed Data

In the post on using PCA for data imputation, we assigned a weight to each of our data points. By giving missing data a weight of 0 and the rest of our data a weight of 1, we obtained a reasonably good approximation to what PCA would find on the dataset with no data missing.
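The sketch below shows one way to realize 0/1 weighting, under the assumption that an iterative fill-and-refit scheme is an acceptable stand-in: missing entries are repeatedly replaced by the current low-rank reconstruction, so only observed entries ever pull on the fit. This is not necessarily the procedure used in the imputation post, and `impute_low_rank` is a hypothetical helper name.

```python
import numpy as np

def impute_low_rank(X, observed, rank, n_iters=100):
    """Fill missing entries of X using a rank-limited PCA reconstruction.

    observed is a boolean mask: True means weight 1 (known value),
    False means weight 0 (missing; the value stored in X is ignored).
    """
    known = np.where(observed, X, 0.0)
    counts = np.maximum(observed.sum(axis=0), 1)
    col_means = known.sum(axis=0) / counts
    filled = np.where(observed, X, col_means)   # start from a mean fill
    for _ in range(n_iters):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        recon = mu + (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Observed entries are always kept; only missing ones are updated.
        filled = np.where(observed, X, recon)
    return filled

# Synthetic stand-in: an exactly rank-2 matrix with about 20% of entries hidden.
rng = np.random.default_rng(0)
truth = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 10))
observed = rng.random((200, 10)) > 0.2
X = np.where(observed, truth, np.nan)

filled = impute_low_rank(X, observed, rank=2)
missing = ~observed
rmse_low_rank = np.sqrt(np.mean((filled - truth)[missing] ** 2))
rmse_mean_fill = np.sqrt(np.mean((np.nanmean(X, axis=0) - truth)[missing] ** 2))
```

Comparing `rmse_low_rank` with `rmse_mean_fill` gives a quick check of how much the low-rank structure helps over a plain column-mean fill.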

This is fine when evaluating a dense model for our data matrix does not incur too much computational overhead. However, when our input data are sparsely observed, that is to say when most of our data consist of missing values, evaluating the model densely is a tremendous waste of computational resources.
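To make the cost difference concrete, the following sketch assumes a low-rank model with hypothetical factor matrices `U` and `V` and observations stored as (row, column, value) triplets. It evaluates predictions and gradients only at the observed entries, so the work scales with the number of observations rather than with the full matrix size.

```python
import numpy as np

def predict_observed(U, V, rows, cols):
    """Model values at the observed (row, col) pairs only."""
    return np.sum(U[rows] * V[cols], axis=1)

def sparse_gradients(U, V, rows, cols, vals):
    """Gradients of 0.5 * sum(residual**2) over observed entries only."""
    resid = predict_observed(U, V, rows, cols) - vals
    grad_U = np.zeros_like(U)
    grad_V = np.zeros_like(V)
    # np.add.at accumulates correctly when a row or column index repeats.
    np.add.at(grad_U, rows, resid[:, None] * V[cols])
    np.add.at(grad_V, cols, resid[:, None] * U[rows])
    return grad_U, grad_V, resid

# Synthetic stand-in: a 1000 x 800 matrix with only 5000 observed entries.
rng = np.random.default_rng(0)
n_rows, n_cols, rank, n_obs = 1000, 800, 3, 5000
flat = rng.choice(n_rows * n_cols, size=n_obs, replace=False)
rows, cols = np.divmod(flat, n_cols)
vals = rng.standard_normal(n_obs)

U = 0.1 * rng.standard_normal((n_rows, rank))
V = 0.1 * rng.standard_normal((n_cols, rank))
preds = predict_observed(U, V, rows, cols)
grad_U, grad_V, resid = sparse_gradients(U, V, rows, cols, vals)
```

The dense product `U @ V.T` would compute 800,000 entries here, of which the loss uses only 5,000.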

Uses For PCA Other Than Dimensionality Reduction Part 2

Imputation, and Noise Reduction

Principal Component Analysis (PCA) is frequently applied in machine learning as a sort of black-box dimensionality reduction technique. However, with a deeper understanding of what PCA is and what it does, we can use it for all manner of other tasks, e.g.

Uses For PCA Other Than Dimensionality Reduction Part I

Decorrelation, Factor Discovery, and Noise Modeling

Principal Component Analysis (PCA) is frequently applied in machine learning as a sort of black-box dimensionality reduction technique. However, with a deeper understanding of what PCA is and what it does, we can use it for all manner of other tasks, e.g.:

Decorrelating Variables
Semantic Factor Discovery
Empirical Noise Modeling
Missing Data Imputation
Example Generation
Anomaly Detection
Patchwise Modeling
Noise Reduction

We will demonstrate how to use PCA for these purposes on an example face dataset. This first post covers the topics up to and including empirical noise modeling; the rest are handled in subsequent parts.
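As a small illustration of the first item in the list, decorrelating variables, here is a sketch on synthetic data rather than the face dataset. The helper name `decorrelate` is made up for this example. Projecting centered data onto the eigenvectors of its covariance matrix yields new variables whose covariance matrix is diagonal.

```python
import numpy as np

def decorrelate(X):
    """Rotate centered data onto its principal axes."""
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]           # largest variance first
    return centered @ eigvecs[:, order], eigvals[order]

# Synthetic stand-in: three strongly correlated variables.
rng = np.random.default_rng(0)
base = rng.standard_normal((1000, 3))
mix = np.array([[1.0, 0.8, 0.3],
                [0.0, 1.0, 0.5],
                [0.0, 0.0, 1.0]])
X = base @ mix

scores, variances = decorrelate(X)
score_cov = np.cov(scores, rowvar=False)
off_diagonal = score_cov - np.diag(np.diag(score_cov))
```

Up to floating-point error, `off_diagonal` is zero and the diagonal of `score_cov` matches `variances`.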

Asymptotic Labs (Posts about PCA)

Glimpses of the Sudoku-tope

PCA and probabilities

Eigen-Techno

Low Rank Approximation On Sparsely Observed Data

Imputing Missing Values With PCA

Uses for PCA other than dimensionality reduction part 1
