"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

October 31, 2016

Day #39 - Useful Tool MyMediaLite for Recommendations

This post is based on learnings from an assignment: link1, link2

The input is a user-item file, as listed below.
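A minimal sketch of what such a file might look like (the IDs are hypothetical; as far as I recall, MyMediaLite's item recommendation tool reads one user-item interaction per line, whitespace- or comma-separated):

20 1001
20 1050
21 1001
22 2003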


Sample Execution Command
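A sketch of how the command might look (flag names are as I recall them from the MyMediaLite command-line documentation; the file names and the recommender choice are assumptions):

item_recommendation --training-file=user_items.txt --recommender=MostPopular --test-users=user20.txt --prediction-file=recommendations.txt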


We supply the user ID 20 in user20.txt to identify recommendations for user 20. The recommender type is specified in the --recommender parameter.

Happy Learning!!!

October 10, 2016

Day #36 - Pandas DataFrame Learnings
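A minimal sketch of basic pandas DataFrame operations, with hypothetical data (my own illustration, assuming pandas is installed):

import pandas as pd

# Build a small DataFrame from a dictionary (hypothetical data)
df = pd.DataFrame({"user": [20, 21, 22], "rating": [4.0, 3.5, 5.0]})

print(df.head())             # first rows
print(df.describe())         # summary statistics
print(df[df["rating"] > 4])  # filter rows by a condition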

Happy Learning!!!

Day #35 - Bias Vs Variance


These are frequently occurring terms when describing a model's performance on the training and test data sets.

Classification error can be decomposed into a bias component and a variance component (plus irreducible noise).

Bias (Under-fitting)
  • Bias is high if the concept class cannot model the true data distribution well; bias does not depend on training set size.
  • High Bias will lead to under-fitting
How to identify High Bias
  • Training Error will be high
  • Cross Validation error also will be high (Both will be nearly the same)
Variance (Over-fitting)
  • High Variance will lead to over-fitting
How to identify High Variance
  • Training Error will be low
  • Cross Validation error will be much higher than the training error
How to fix?
Variance decreases with more training data, and increases with more complicated classifiers
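A small sketch of how to check this in practice (my own illustration, assuming scikit-learn; a depth-1 tree and a fully grown tree stand in for a high-bias and a high-variance model):

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# max_depth=1 is a very simple model (prone to high bias);
# max_depth=None grows a full tree (prone to high variance)
for depth in (1, None):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    train_error = 1 - model.fit(X, y).score(X, y)
    cv_error = 1 - cross_val_score(model, X, y, cv=5).mean()
    print(f"max_depth={depth}: train error={train_error:.2f}, cv error={cv_error:.2f}")

# High bias: both errors are high and close to each other.
# High variance: training error is low, cross-validation error is much higher.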

Happy Learning!!!

October 08, 2016

Day #34 - What is the difference between Logistic Regression and Naive Bayes

Both are probabilistic models.
Logistic Regression
  • Discriminative (the entire approach is purely discriminative)
  • Models P(Y|X) directly
  • The output value lies between 0 and 1
  • Formula: exp(w0 + w1x) / (exp(w0 + w1x) + 1)
  • This can also be expressed as 1 / (1 + exp(-(w0 + w1x))) (see the numeric check below)
Binary Logistic Regression - 2 classes
Multinomial Logistic Regression - more than 2 classes
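The two forms of the sigmoid above are algebraically identical; a quick numeric check with hypothetical weights (my own sketch, not from the original post):

import math

w0, w1, x = 0.5, -2.0, 1.3               # hypothetical weights and input
z = w0 + w1 * x
form1 = math.exp(z) / (math.exp(z) + 1)  # exp(w0 + w1x) / (exp(w0 + w1x) + 1)
form2 = 1 / (1 + math.exp(-z))           # 1 / (1 + exp(-(w0 + w1x)))
print(form1, form2)                      # both print the same probability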

Example - Link




Link - Ref
Logistic Regression
  • Classification Model
  • The probability of success is a sigmoid function of a linear combination of features
  • y belongs to {0, 1} - a 2-class problem
  • p(yi = 1) = 1 / (1 + e^-(w1x1 + w2x2))
  • Linear combination of features - w1x1 + w2x2
  • w can be found with a maximum likelihood estimate
Naive Bayes
  • Generative Model
  • Models P(X | Y); the Naive Bayes assumption is that the features are conditionally independent given Y
  • Fits a distribution for each class
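A short sketch (assuming scikit-learn) contrasting the two on the same synthetic data: LogisticRegression models P(Y|X) directly, while GaussianNB fits a per-class Gaussian for P(X|Y):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

log_reg = LogisticRegression().fit(X_train, y_train)  # discriminative: models P(Y|X)
nb = GaussianNB().fit(X_train, y_train)               # generative: per-class Gaussian for P(X|Y)
print("Logistic Regression accuracy:", log_reg.score(X_test, y_test))
print("Gaussian Naive Bayes accuracy:", nb.score(X_test, y_test))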
Happy Learning

October 02, 2016

Good Data Science Course Links


AI Lectures

Introduction to Machine Learning

Happy Learning!!!

Short Analytics Concept Videos



  • Descriptive Analytics (analysis of existing data, trends and patterns)
  • Diagnostic Analytics (reasons / patterns behind events)
  • Predictive Analytics (what the future will look like)
  • Prescriptive Analytics (how to be prepared for / handle the future)

Great Compilation, Keep Learning!!!

October 01, 2016

Day #32 - Regularization in Machine Learning


A large coefficient will result in overfitting; to avoid this, we perform regularization, which penalizes large coefficients.
  • L1 - sum of absolute values (Lasso - Least Absolute Shrinkage and Selection Operator). The L1 constraint region has corners on the coordinate axes, so the solution often lands where one or more dimensions are exactly zero. This results in variable elimination: features that contribute minimally are ignored.
  • L2 - sum of squares of values (Ridge). The L2 constraint region is circle-shaped; it shrinks all coefficients towards zero but eliminates none (see the sketch after this list).
  • Discriminative - In SVM we use a hyperplane to separate the classes; this is an example of the discriminative approach.
  • Probabilistic - Assumes the data is generated by a Gaussian distribution. This is again based on the Central Limit Theorem: with enough points the data fits a normal distribution, so we apply a Gaussian model.
  • Max Likelihood - The probability that a point p belongs to a given distribution; parameters are chosen to maximize this likelihood over the data.
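A sketch of the elimination-vs-shrinkage contrast (assuming scikit-learn; Lasso applies the L1 penalty, Ridge the L2 penalty, and the alpha value and synthetic data are my own assumptions):

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data where only 3 of the 10 features are informative
X, y = make_regression(n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0)

lasso = Lasso(alpha=1.0).fit(X, y)  # L1 penalty: uninformative coefficients driven to exactly zero
ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty: all coefficients shrink, none become zero
print("Lasso coefficients:", lasso.coef_.round(2))
print("Ridge coefficients:", ridge.coef_.round(2))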
Good Read for L2 - Indeed, using the L2 loss comes from the assumption that the data is drawn from a Gaussian distribution

Another Read -

  • The L1 loss function minimizes the absolute differences between the estimated values and the target values. The L1 loss function is more robust and is generally less affected by outliers.
  • The L2 loss function minimizes the squared differences between the estimated and target values. The L2 error will be much larger in the case of outliers (see the sketch below).
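A small numeric sketch of why (my own illustration): the constant prediction that minimizes L2 loss is the mean, while the L1 minimizer is the median, and a single outlier moves the mean far more than the median:

import numpy as np

values = np.array([1.0, 2.0, 2.0, 3.0, 3.0])
with_outlier = np.append(values, 100.0)  # add one extreme value

# The constant that minimizes L2 loss is the mean; for L1 loss it is the median.
print("L2 minimizer (mean):  ", values.mean(), "->", with_outlier.mean())           # shifts a lot
print("L1 minimizer (median):", np.median(values), "->", np.median(with_outlier))   # barely moves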

Happy Learning!!!