"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

May 28, 2022

Topic Modelling - LDA, LSA

  • LDA stands for Latent Dirichlet Allocation, a topic modeling algorithm
  • LDA was developed in 2003 by researchers David Blei, Andrew Ng and Michael I. Jordan
  • LDA is based on a Bayesian framework. This allows the model to infer topics based on observed data (words) through the use of conditional probabilities
  • The main difference between LSA and LDA is that LDA is probabilistic: it assumes that each document's distribution over topics and each topic's distribution over words are drawn from Dirichlet priors. LSA makes no distributional assumption, so its vector representations of topics and documents are harder to interpret
  • LSA stands for Latent Semantic Analysis, also called Latent Semantic Indexing (LSI). It applies Singular Value Decomposition (SVD) to the Document-Term Matrix
  • In practice, LSA is usually much faster to train than LDA, but its topics tend to be less accurate
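
To make the Dirichlet assumption above concrete, here is a minimal sketch of the generative story LDA assumes, using only the Python standard library. The vocabulary, the prior values (0.5) and the helper names `sample_dirichlet` and `generate_document` are made up for illustration; they are not from any library.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word, vocab, alpha, length, rng):
    """Generate one document following LDA's assumed generative process."""
    # 1. Draw this document's topic mixture from a Dirichlet prior.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # 2. Pick a topic for this word position from the document's mixture.
        z = rng.choices(range(len(theta)), weights=theta)[0]
        # 3. Pick a word from that topic's distribution over the vocabulary.
        words.append(rng.choices(vocab, weights=topic_word[z])[0])
    return theta, words

rng = random.Random(0)  # fixed seed so the sketch is repeatable
vocab = ["goal", "match", "team", "vote", "party", "election"]  # toy vocabulary (assumption)
num_topics = 2

# Each topic's word distribution is itself drawn from a Dirichlet prior.
topic_word = [sample_dirichlet([0.5] * len(vocab), rng) for _ in range(num_topics)]

theta, doc = generate_document(topic_word, vocab, [0.5] * num_topics, 8, rng)
print("topic mixture:", theta)
print("document:", " ".join(doc))
```

Training LDA runs this story in reverse: given only the words, it infers the topic mixtures and the topic-word distributions that most plausibly produced them.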

Example
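
A small, hedged sketch of fitting LDA on a toy corpus in plain Python. It uses collapsed Gibbs sampling, which is a common way to fit LDA but not the variational method from the original 2003 paper. The corpus, the hyperparameters (`alpha`, `beta`, `iters`) and the function name `lda_gibbs` are assumptions chosen for illustration; for real work a library implementation would be the usual choice.

```python
import random

def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling; returns (vocab, theta, phi)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables: topics per document, words per topic, total words per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * V for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every word position.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed estimates of document-topic (theta) and topic-word (phi) distributions.
    theta = [[(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
             for row, doc in zip(doc_topic, docs)]
    phi = [[(c + beta) / (topic_total[k] + V * beta) for c in topic_word[k]]
           for k in range(num_topics)]
    return vocab, theta, phi

# Toy corpus (made up): two sports-like and two politics-like documents.
docs = [
    "team wins match with late goal".split(),
    "goal by team decides match".split(),
    "party wins election after vote".split(),
    "vote count decides election for party".split(),
]

vocab, theta, phi = lda_gibbs(docs, num_topics=2)
for k, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)[:3]
    print("topic", k, "top words:", [vocab[v] for v in top])
for d, row in enumerate(theta):
    print("doc", d, "topic mixture:", [round(p, 2) for p in row])
```

Each row of `theta` is one document's mixture over topics, and each row of `phi` is one topic's distribution over the vocabulary. On a corpus this small the topics found can vary with the seed, so treat the output as illustrative only.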


Keep Thinking!!!
