"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 13, 2018

Day #162 - NLP Class Notes

Great Collection of Classes - http://ml4a.github.io/classes/itp-F18/

Applications of NLP
  • Spell Check
  • Translation
  • Sentiment Analysis and other prediction tasks
  • Summarization
  • Spam filtering
  • Chatbots
  • Parsing semantic information
Key Summary
  • Words can be treated like data
Embedding
  • Embeddings capture relationships between data points
  • Magnitude and direction have meaning, which enables many basic retrieval applications (see the cosine-similarity sketch after this list)
  • Feature vectors and latent spaces are examples of embeddings
  • Two items are similar if their embeddings lie near each other in the space
  • Projecting words into a vector space lets relationships between them be determined
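A minimal sketch of the retrieval idea above, using cosine similarity over hand-made toy vectors (the words, numbers, and dimensionality are illustrative assumptions, not real embeddings):

    import numpy as np

    def cosine(a, b):
        # Direction matters more than raw magnitude, so normalise both vectors
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Hypothetical 3-d vectors; real word embeddings typically have 100-300 dimensions
    vectors = {
        "king":  np.array([0.80, 0.65, 0.10]),
        "queen": np.array([0.75, 0.70, 0.15]),
        "apple": np.array([0.10, 0.20, 0.90]),
    }

    # Rank every word by similarity to the query; nearby vectors mean related words
    query = vectors["king"]
    ranked = sorted(vectors, key=lambda w: cosine(query, vectors[w]), reverse=True)
    print(ranked)  # ['king', 'queen', 'apple']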
Word feature vectors
  • Deep neural network based
  • Skip-gram model - represent every word as a one-hot vector; each index corresponds to a single word in the vocabulary
  • The one-hot vector is the input; the hidden layer has n elements, and its weights come to be associated with the words through training on next-word prediction
  • CBOW - Continuous Bag of Words (see the Word2Vec sketch after this list)
  • Two words end up with similar word vectors because they frequently appear in interchangeable contexts
  • Documents can also be embedded in the same language vector space, i.e. language itself is embedded into the space
  • Universal Sentence Encoder Paper - https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46808.pdf
  • Create embeddings first, then compute semantic similarity on top of them
  • Text-to-image generation (StackGAN) - involves projecting text embeddings into image space
  • Related papers: Hierarchical Neural Story Generation, Poetry Generation, Multilingual Unsupervised or Supervised Word Embeddings
  • Language translation as a seq2seq problem (map an input sequence of words to an output sequence of words)
  • LSA (Latent Semantic Analysis) - similar to PCA, applied to the term-document matrix (see the sketch after this list)
  • LDA (Latent Dirichlet Allocation) - technique for embedding text as topic proportions in a feature space
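The skip-gram and CBOW items above can be made concrete with gensim's Word2Vec; the library choice and the toy corpus are my assumptions, since the notes don't name a tool:

    from gensim.models import Word2Vec

    # Toy corpus: each sentence is a list of tokens (illustrative data only)
    sentences = [
        ["the", "cat", "sat", "on", "the", "mat"],
        ["the", "dog", "sat", "on", "the", "rug"],
        ["a", "cat", "and", "a", "dog", "played"],
    ]

    # sg=1 -> skip-gram (predict context words from the centre word)
    # sg=0 -> CBOW (predict the centre word from its context)
    skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=200)
    cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0, epochs=200)

    # Words used in interchangeable contexts ("cat"/"dog") end up with nearby vectors
    print(skipgram.wv.most_similar("cat", topn=3))
    print(cbow.wv.similarity("cat", "dog"))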
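And a minimal sketch of the LSA and LDA items, assuming scikit-learn as the tooling: LSA here is TruncatedSVD (PCA-like) on a TF-IDF term-document matrix, while LDA gives per-document topic proportions:

    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD, LatentDirichletAllocation

    docs = [
        "the cat sat on the mat",
        "the dog sat on the rug",
        "stock markets rose sharply today",
    ]

    # LSA: SVD on the TF-IDF term-document matrix gives low-dimensional document vectors
    tfidf = TfidfVectorizer().fit_transform(docs)
    lsa_docs = TruncatedSVD(n_components=2).fit_transform(tfidf)
    print(lsa_docs.shape)  # (3, 2): each document embedded as a 2-d vector

    # LDA: each document becomes a mixture over n_components topics
    counts = CountVectorizer().fit_transform(docs)
    lda_docs = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)
    print(lda_docs.shape)  # (3, 2): per-document topic proportions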


Happy Mastering DL!!!
