"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

September 05, 2016

Day #30 - Machine Learning Fundamentals

Supervised Learning
  • Classification and Regression problems
  • Past data + Past outputs leveraged
  • Regression - Continuous Values
  • Classification - Discrete Labels
Unsupervised
  • Clustering - Discrete Labels
  • Dimensionality reduction - Continuous Values
Classifiers
  • SVM (Linear way of approximations)
  • KNN (Lazy learner)
  • Decision Tree (Rule based approach, Set of Rules)
  • Naive Bayes (Pick class with maximum probability)
Evaluation Methods
  • K-Fold Validation
  • Cross Validation
  • Ranking / Search - Relevance
  • Clustering - Intra-cluster and inter-cluster distances
  • Regression - Mean Square Error
  • ROC Curve 
Bagging
  • Build classifier with 30% of data
  • Again partition and build another classifier with next 30% of data
  • Random Forests - Random combination of Trees
  • Randomly decide and split on attributes
Boosting
  • Multiple weak classifiers build strong classifier
  • Sample with replacement
  • Adaboost - Adaptive boosting
Stacking
  • Use Output from one classifier as input for another classifier
  • Knn -> O/P -> SVM
Happy Learning!!!

No comments: