"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

May 01, 2016

Day #18 - Linear Regression , K Nearest Neighbours

Linear Regression
  • Fitting straight line to set of data points
  • Create line to predict new values based on previous observations
  • Uses OLS (Ordinary Least Squares). Minimize squared error between each point and line
  • Maximum likelihood estimation
  • R Squared - Fraction of total variation in Y explained by the model
  • R Squared close to 0 - Poor fit
  • R Squared close to 1 - Good fit
  • High R Squared indicates a good fit
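The OLS fit and R Squared above can be sketched in plain Python; the data points here are made up for illustration:

```python
import statistics

# Hypothetical sample: hours studied (x) vs. exam score (y)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

mean_x, mean_y = statistics.mean(x), statistics.mean(y)

# OLS closed form: slope m and intercept b that minimise squared error
m = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
    sum((xi - mean_x) ** 2 for xi in x)
b = mean_y - m * mean_x

# R Squared: fraction of total variation in y explained by the line
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_error = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ss_error / ss_total
```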

K Nearest Neighbours
  • Supervised Machine Learning Technique
  • New data points are classified based on distance to existing points
  • Choice of K - small enough to pick only nearby neighbours, large enough to smooth out noise
  • Determine value of K based on trial tests
  • Plot K nearest neighbours on a scatter plot to identify neighbours
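A minimal KNN sketch in plain Python, with a made-up 2-D dataset:

```python
import math
from collections import Counter

# Hypothetical 2-D training data: (point, class label)
training = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
            ((6.0, 6.0), "B"), ((6.5, 5.5), "B"), ((7.0, 6.5), "B")]

def knn_classify(point, data, k=3):
    """Label a new point by majority vote among its k nearest neighbours."""
    by_distance = sorted(data, key=lambda item: math.dist(point, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_classify((2.0, 2.0), training))  # nearest neighbours are all "A"
```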

Related Read
Recommendation Algo Analysis

Happy Learning!!!

April 22, 2016

Day #17 - Python Basics

Happy Learning!!!

Neural Networks Basics

Notes from Session
  • Neurons and synapses - model the brain at a high level
  • Machine Learning - algorithms for classification and prediction
  • Mimic brain structure in technology
  • Recommender engines use neural networks
  • With more data we can increase accuracy of models
  • Linear Regression, y = mx + b - fit the data set with as little error as possible
Neural Network
  • Equation starts from neuron
  • Multiply weights to inputs (Weights are coefficients)
  • Apply activation function (Depends on problem being solved)
Basic Structure
  • Input Layer
  • Hidden Layer (possibly multiple hidden layers) - computation is done at the hidden layers
  • Output Layer
  • Supervised learning (Train & Test)
  • Loss function measures how far predictions are from actual values
  • Deep Learning - Automatic Feature Detection
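The neuron equation above (weights times inputs, then an activation) can be sketched in plain Python; the weights here are made up, not trained:

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))  # sigmoid squashes output to (0, 1)

# One hidden layer of two neurons feeding a single output neuron
def forward(inputs):
    hidden = [neuron(inputs, [0.5, -0.4], 0.1),
              neuron(inputs, [0.3, 0.8], -0.2)]
    return neuron(hidden, [1.0, -1.0], 0.0)

output = forward([1.0, 2.0])
```

The choice of activation depends on the problem; sigmoid is used here only as one common example.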

Happy Learning!!!

April 14, 2016


Good Reading from link

Key Notes
  • Allow non-linear decision boundaries
  • SVM - Out of box supervised learning technique
  • Feature Space - Finite dimensional vector space
  • Each dimension represents feature
  • Goal of SVM - Train a model that assigns unseen objects to a particular category
  • Creates linear partition of feature space
  • Based on features, it places objects above or below the separating hyperplane
  • No stochastic element involved (no dependence on any previous state)
  • Support vector classifiers or soft margin classifiers - allow some observations to be on the incorrect side of the hyperplane, giving a soft margin
  • High Dimensionality, Memory Efficiency, Versatility
  • Non probabilistic
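Once trained, a linear SVM reduces to a hyperplane (w, b); classification is just checking which side of that hyperplane a point falls on. A sketch with hypothetical, already-learned weights:

```python
def svm_predict(point, w, b):
    """Classify by the sign of the signed distance to the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, point)) + b
    return "positive" if score >= 0 else "negative"

# Hypothetical learned weights for a 2-D feature space
w, b = [1.0, -1.0], 0.0
print(svm_predict([3.0, 1.0], w, b))  # above the line x1 = x2
```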
More Reads

Happy Learning!!!

Day #16 - Python Basics

Happy Learning!!!

April 10, 2016

Probability Tips

  • Discrete random variables are things we count
  • A discrete variable is a variable which can only take a countable number of values
  • Probability mass function (pmf) is a function that gives the probability that a discrete random variable is exactly equal to some value.
  • Continuous random variables are things we measure
  • A continuous random variable is a random variable where the data can take infinitely many values.
  • Probability density function (PDF), or density of a continuous random variable, is a function that describes the relative likelihood for this random variable to take on a given value
  • Bernoulli process is a finite or infinite sequence of independent binary random variables
  • Markov Chain - stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event
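The Bernoulli process and Markov chain definitions above can be simulated in plain Python; the transition probabilities are made up for illustration:

```python
import random

random.seed(42)

# Bernoulli process: a sequence of independent binary trials
coin_flips = [1 if random.random() < 0.5 else 0 for _ in range(1000)]

# Markov chain: the next state depends only on the current state
transitions = {"sunny": {"sunny": 0.8, "rainy": 0.2},
               "rainy": {"sunny": 0.4, "rainy": 0.6}}

def next_state(state):
    """Sample the next state from the current state's transition row."""
    r, cumulative = random.random(), 0.0
    for target, prob in transitions[state].items():
        cumulative += prob
        if r < cumulative:
            return target
    return state

state, path = "sunny", []
for _ in range(10):
    state = next_state(state)
    path.append(state)
```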
Let's Continue Learning!!!

Day #15 - Data Science - Mathematics Basics

Sets Basics
  • Cardinality - Number of distinct elements in Set (For a Finite Set)
  • For real numbers, cardinality is infinite
Rational Numbers - Made of a ratio of two integers
Fibonacci series was introduced in 1201 - Amazing :)

Functions Basics
  • Represents relationship between mathematical variables
  • Function that maps from A to B - A is referred to as the domain, B as the co-domain
  • Spread of all possible outputs is called the range
Matrix Basics
  • Rows and columns define a matrix - a 2D array of numbers
  • Eigen values are scalars and eigen vectors are vectors - a special set of values associated with a matrix M
  • Eigen vectors - directions that remain unchanged by the action of matrix M
  • Trace - sum of diagonal elements; can be computed only for a square matrix
  • Rank of a matrix - number of linearly independent rows or columns
  • Vectors have magnitude (length) and direction
  • Magnitude and the cosine of the angle give you the direction
  • Cross product is non-commutative
  • Dot product is commutative
  • A set of vectors is linearly independent if none of them can be written as a weighted sum of the others
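A few of the definitions above (trace, eigen vectors, dot product commutativity) checked in plain Python on a small made-up matrix whose eigenpairs are known:

```python
# A 2x2 diagonal matrix: its eigen vectors are the axes, eigen values 2 and 3
M = [[2, 0],
     [0, 3]]

def mat_vec(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

trace = M[0][0] + M[1][1]   # sum of diagonal elements
v = [1, 0]                  # an eigen vector of M
Mv = mat_vec(M, v)          # equals 2 * v: the direction is unchanged

# Dot product is commutative: a . b == b . a
a, b = [1, 2], [3, 4]
assert dot(a, b) == dot(b, a)
```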

 Happy Learning!!!

April 09, 2016

Day #14 - R Working Tips

Happy Learning!!!

April 07, 2016

Day #13 - Maths and Data Science

  • Recommender Systems - Pure matrix decomposition problem
  • Deep Learning - Matrix Calculus
  • Google Search - Page Rank, Social Media Graph Analysis - Eigen Decomposition
Happy Learning!!!

April 04, 2016


Ensemble Methods
  • Combine many predictors and provide a weighted average
  • Use a single kind of learner but multiple instances (bagging)
  • Collection of "ok" predictors combined to make them powerful
  • Stacking - learn predictors and combine them using another new model
  • One layer of predictors provides features for the next layer
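The weighted-average idea above can be sketched in plain Python; the three predictors and their weights are made up for illustration:

```python
# Three hypothetical "ok" predictors for the same input
def predictor_a(x):
    return 2.0 * x

def predictor_b(x):
    return 2.1 * x - 0.5

def predictor_c(x):
    return 1.9 * x + 0.4

def ensemble(x, weights=(0.5, 0.3, 0.2)):
    """Combine weak predictors with a weighted average."""
    preds = [predictor_a(x), predictor_b(x), predictor_c(x)]
    return sum(w * p for w, p in zip(weights, preds))

print(ensemble(10))
```

In a stacking setup, the weights would themselves be learned by another model instead of being fixed by hand.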
Happy Learning!!!

March 31, 2016

Data Science - Good Reads

Good Reads!!!

March 29, 2016

Good Data Science Tech Talk

Today I spent some time with a tech talk on predictive modelling. Good coverage of fundamentals; needs revision again.

Read William Chen's answer to What are the most common mistakes made by aspiring data scientists? on Quora

Happy Learning!!!!

March 28, 2016

Data Science Day #12 - Text Processing in R

Today's post covers the basics of text processing in R. We look at removing stop words, numbers, and punctuation, plus lower-case conversion.

Happy Learning!!!

March 27, 2016

Day #11 - Data Science Learning Notes - Evaluating a model

R Square - Goodness of Fit Test
  • R Square = 1 - (Sum of Squares of Error / Sum of Squares Total)
  • SST - Total variance of the dependent variable
  • SSE - Variance of actual vs predicted values
Adjusted R Square 
  • Adjusted R Square = 1 - (1 - R Square) * ((n - 1) / (n - p - 1))
  • p - Number of independent variables
  • n - Number of records in dataset
RMSE (Root mean square error)
  • For every record, compute the prediction error
  • Square it, find the mean, then take the square root
  • RMSE should be similar for training and testing datasets
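The three metrics above can be computed in plain Python; the actual/predicted values and p = 1 independent variable are made up for illustration:

```python
import math
import statistics

actual    = [3.0, 5.0, 7.0, 9.0, 11.0]
predicted = [2.8, 5.3, 6.9, 9.4, 10.6]
n, p = len(actual), 1  # n records, p independent variables (assumed)

mean_y = statistics.mean(actual)
sse = sum((a - f) ** 2 for a, f in zip(actual, predicted))  # error variance
sst = sum((a - mean_y) ** 2 for a in actual)                # total variance

r_squared = 1 - sse / sst
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
rmse = math.sqrt(sse / n)
```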
Bias (Underfit)
  • Model can't explain the dataset
  • R Square value is very low
  • Fix - Add more independent variables
Variance (Overfit)
  • RMSE high for test dataset, RMSE low for training dataset
  • Fix - Cut down independent variables
Collinearity Problem
  • Check p-values of the coefficients to test whether the null hypothesis (coefficient is zero) holds
Next Pending Reads
  • Subset Selection Technique
  • Cross Validation Technique
  • Z test / P Test
Happy Learning!!!

March 22, 2016

Day #10 Data Science Learning - Correlations

  • If variables are correlated, you can use machine learning to predict one from the other
  • Correlation - mutual relationship or connection between two or more things
  • Correlation shows the interdependence between two variables
  • Measure - how much does one change when the other changes?
  • Popularly Used - Pearson Correlation coefficient
  • Value ranges from -1 to +1
  • Negative correlation (Closer to -1) - One value goes up other goes down
  • Closer to Zero (No Correlation)
  • Closer to 1 (Positive Correlation)
Correlation - Relationship between two values
Causation - The reason for a change in value (e.g., cholesterol vs weight, dress size vs cholesterol). Check whether an observed correlation is merely incidental.
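The Pearson correlation coefficient above can be computed in plain Python; the data points are made up for illustration:

```python
import statistics

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

# Pearson correlation: covariance scaled by both standard deviations,
# so the result always lands in [-1, +1]
mean_x, mean_y = statistics.mean(x), statistics.mean(y)
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
var_x = sum((a - mean_x) ** 2 for a in x)
var_y = sum((b - mean_y) ** 2 for b in y)
r = cov / (var_x * var_y) ** 0.5
```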

Happy Learning!!!

March 21, 2016

March 20, 2016

Data Science Tip Day #8 - Dealing with Skewness for Error histograms after Linear Regression

In previous posts we have seen histogram-based validation of errors. When a left- or right-skewed distribution is observed, some transformation techniques to apply are:
  • Right Skewed - Apply Log
  • Slightly Right - Square root
  • Left Skewed - Exponential
  • Slightly Left - Square root, Cube
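A sketch of the right-skew transforms in plain Python, with a made-up right-skewed sample and a simple moment-based skewness check:

```python
import math

# Right-skewed hypothetical data (a few large values pull the tail right)
data = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]

log_transformed  = [math.log(x) for x in data]   # right skew -> log
sqrt_transformed = [math.sqrt(x) for x in data]  # slight right skew -> sqrt

def skewness(values):
    """Population skewness: third standardized moment."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return sum(((v - mean) / std) ** 3 for v in values) / n

print(skewness(data), skewness(log_transformed))
```

The log transform should pull the skewness noticeably closer to zero.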
Happy Learning!!!

March 19, 2016

Data Science Tip Day#7 - Interaction Variables

This post is about using interaction variables while performing linear regression.

For illustration purposes, let's construct a dataset with three vectors (y, x, z).
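A sketch of what an interaction variable is, in plain Python with made-up coefficients: the interaction is simply a new feature column equal to x * z, added alongside x and z before fitting the regression.

```python
import random

random.seed(1)

# Hypothetical data: y depends on x, z, AND their product (the interaction)
x = [random.uniform(0, 10) for _ in range(100)]
z = [random.uniform(0, 10) for _ in range(100)]
y = [2 * xi + 3 * zi + 0.5 * xi * zi for xi, zi in zip(x, z)]

# The interaction variable is a new column equal to x * z; the model is
# then fit on (x, z, x*z) instead of (x, z) alone
interaction = [xi * zi for xi, zi in zip(x, z)]
design_rows = [(xi, zi, ii) for xi, zi, ii in zip(x, z, interaction)]
```

A purely additive model on (x, z) alone could never capture the 0.5 * x * z term, which is exactly why the interaction column is needed.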

Happy Learning!!!