Any matrix A can be expressed as
A = U S Vᵀ
U, V - orthogonal matrices
U - its columns are the left singular vectors
V - its columns are the right singular vectors
A is an m × n matrix
U is an m × n matrix with orthonormal columns (thin SVD)
S is an n × n diagonal matrix of singular values
V is an n × n orthogonal matrix
An m × n matrix with m > n has at most n singular values, so the thin SVD needs only n columns of U instead of the full m × m orthogonal matrix.
- Dimensionality reduction is done by neglecting the small singular values in the diagonal matrix S
- This feature is only available in the decomposed form: truncate U, S and Vᵀ to the top k singular values and multiply them back to get a rank-k approximation of A (see the sketch after the reference below)
Reference - Link
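To make the rank-k truncation concrete, here is a minimal sketch (not from the original post; the matrix and k = 2 are arbitrary assumptions):

import numpy as np

a = np.array([[1., 1., 1., 0., 2.],
              [2., 1., 3., 5., 0.],
              [1., 3., 5., 6., 2.]])
U, S, Vt = np.linalg.svd(a, full_matrices=False)

k = 2                                         # keep only the k largest singular values
a_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]   # rank-k approximation of a
print(np.round(a_k, 3))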
Eigenvectors
- Satisfy A v = λ v, where v is the eigenvector and λ is the corresponding eigenvalue
- Certain vectors are only stretched by A; their direction does not change
- Useful for high-dimensional data (images, text, vectors of stock data)
- Describe the data with only a few values (a NumPy check follows below)
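A minimal NumPy check of A v = λ v (the 2 × 2 matrix here is an illustrative assumption, not from the post):

import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])
eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are the eigenvectors
v, lam = eigvecs[:, 0], eigvals[0]
print(np.allclose(A @ v, lam * v))    # True: v is only scaled, not rotated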
# https://gist.github.com/addisonhuddy/8a9e682259c9dca1f61672b4027863dc
import numpy as np
from sklearn.decomposition import PCA
from sklearn.decomposition import TruncatedSVD

a = np.array([[1, 1, 1, 0, 2],
              [2, 1, 3, 5, 0],
              [1, 3, 5, 6, 2],
              [1, 3, 5, 6, 9],
              [2, 3, 4, 5, 6]])

# Set printing options
np.set_printoptions(suppress=True)
np.set_printoptions(precision=3)

print('FULL')
U, S, Vt = np.linalg.svd(a, full_matrices=True)
print('U')
print(U)
print('S')
print(S)
print('Vt')
print(Vt)

print('Reduced - Ignore small values')
# Note: a is square here, so the full and reduced SVDs coincide;
# full_matrices=False only drops columns when a is rectangular.
U, S, Vt = np.linalg.svd(a, full_matrices=False)
print('U')
print(U)
print('S')
print(S)
print('Vt')
print(Vt)

# PCA keeps the top 2 principal components of a
pca = PCA(n_components=2)
pca.fit(a)
a_transformed = pca.transform(a)
print('pca')
print(a_transformed)
print(pca.explained_variance_)
How Many Singular Values Should We Retain? - A useful rule of thumb is to retain enough singular values to make up 90% of the energy in Σ, i.e. the sum of the squares of the retained singular values should be at least 90% of the sum of the squares of all of them (see the sketch below). Link
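A quick way to pick k under that rule (a sketch reusing the 5 × 5 matrix from the gist above; the 90% threshold follows the rule of thumb):

import numpy as np

a = np.array([[1, 1, 1, 0, 2],
              [2, 1, 3, 5, 0],
              [1, 3, 5, 6, 2],
              [1, 3, 5, 6, 9],
              [2, 3, 4, 5, 6]])
S = np.linalg.svd(a, compute_uv=False)          # singular values only

energy = S ** 2                                 # "energy" = squared singular values
cum_ratio = np.cumsum(energy) / energy.sum()
k = int(np.searchsorted(cum_ratio, 0.90)) + 1   # smallest k reaching 90% energy
print('retain k =', k)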
SVD (Application in NLP) - Latent Semantic Analysis Notes
- LSA applies singular value decomposition (SVD) to the term-document matrix
- In SVD, a rectangular matrix is decomposed into the product of three other matrices
- One component matrix describes the original row entities as vectors of derived orthogonal factor values
- Another describes the original column entities in the same way
- The third is a diagonal matrix of scaling values such that, when the three components are matrix-multiplied, the original matrix is reconstructed (a TruncatedSVD sketch follows these notes)
- The Dirichlet distribution (used in LDA rather than LSA) takes a parameter, usually called alpha, for each topic (or category)
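A minimal LSA-style sketch with scikit-learn's TruncatedSVD (the toy corpus and n_components=2 are illustrative assumptions, not from the post):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    'singular value decomposition factorizes a matrix',
    'latent semantic analysis applies svd to a term-document matrix',
    'eigenvectors keep their direction under a linear map',
]
X = TfidfVectorizer().fit_transform(docs)   # documents x terms matrix
lsa = TruncatedSVD(n_components=2)          # keep 2 latent factors
doc_vectors = lsa.fit_transform(X)          # each document as a 2-d factor vector
print(doc_vectors)
print(lsa.explained_variance_ratio_)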
Happy Mastering DL!!!