## June 15, 2016

### Day #26 - R - Moving Weighted Average

Example code from a two-day workshop on the Azure ML module: a simple example of storing and accessing data from an Azure workspace.

Happy Learning!!!

Labels:
Data Science Tips

## June 01, 2016

### Day #25 - Data Transformations in R

This post is on performing data transformations in R, as part of feature modelling. Advanced PCA will be covered at a later stage.

Data Normalization in Python

**Happy Learning!!!**
Labels:
Data Science Tips

## May 20, 2016

### Day #24 - Python Code Examples

Examples for - for loop, while loop, dictionary, function examples and plotting graphs
Happy Learning!!

Labels:
Data Science Tips

### Day #23 - Newton Raphson - Gradient Descent

**Newton Raphson**

- Optimization technique
- Newton's method tries to find a point x satisfying f'(x) = 0
- Stop iterating when the difference between two successive approximations x(n+1) and x(n) is close to zero

**Formula**

**x(n+1) = x(n) - f(x(n))/f'(x(n))**

- Choose a suitable starting value x0

**Gradient Descent**

- Works for convex function
**x(n+1) = x(n) - af'(x)****a - learning rate**- Gradient descent tries to find such a minimum x by using information from the first derivative of f
- Both gradient and netwon raphson are similar the update rule is different
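A quick sketch of both update rules in Python (my own illustration, not from the original notes; the function f(x) = (x-3)^2 is an arbitrary convex example):

```python
def newton_raphson(f, f_prime, x0, tol=1e-10, max_iter=100):
    """Root finding: iterate x(n+1) = x(n) - f(x(n))/f'(x(n))."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / f_prime(x)
        if abs(x_new - x) < tol:  # stop when successive iterates are close
            return x_new
        x = x_new
    return x

def gradient_descent(f_prime, x0, lr=0.1, tol=1e-10, max_iter=10000):
    """Minimization: iterate x(n+1) = x(n) - a * f'(x(n))."""
    x = x0
    for _ in range(max_iter):
        x_new = x - lr * f_prime(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Minimize f(x) = (x - 3)^2, so f'(x) = 2(x - 3), f''(x) = 2.
# Newton-Raphson finds the root of f'(x); gradient descent minimizes f(x).
root = newton_raphson(lambda x: 2 * (x - 3), lambda x: 2.0, x0=0.0)
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(root, minimum)  # both converge to 3
```

Same answer, different update rule: Newton uses second-order information (the derivative of f'), gradient descent only the first derivative and a learning rate.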

**Happy Learning!!!**

Labels:
Data Science Tips

## May 14, 2016

### Day #22 - Data science - Maths Basics

**Eigen Vector** - Vector whose direction does not change under the transformation

**Eigen Value** - Amount of scaling; the scale factor associated with an eigenvector

**Eigen Value Decomposition** - Can be performed only on square matrices

**Trace** - Sum of the eigenvalues

**Rank of A** - Number of non-zero eigenvalues

**SVD - Singular Value Decomposition**

- The Swiss Army Knife of Linear Algebra
- SVD for stock market prediction
- SVD for data compression
- SVD to model sentiments
- SVD is the greatest gift of Linear Algebra to Data Science
- The singular values of A are the square roots of the eigenvalues of AᵀA (A transpose A)
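A small numpy sketch of these facts (my own illustration; the 2x2 matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Eigen decomposition (only defined for square matrices)
eigvals, eigvecs = np.linalg.eig(A)
print("eigenvalues:", eigvals)            # 5 and 2 for this matrix
print("trace:", np.trace(A))              # equals the sum of eigenvalues (7)
print("rank:", np.linalg.matrix_rank(A))  # number of non-zero eigenvalues

# Singular values of A = square roots of the eigenvalues of A^T A
singular = np.linalg.svd(A, compute_uv=False)
ata_eigvals = np.linalg.eigvalsh(A.T @ A)
print(np.sort(singular), np.sort(np.sqrt(ata_eigvals)))  # identical
```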

*Happy Learning!!! (Revise - Relearn - Practice)*
Labels:
Data Science Tips

## May 09, 2016

### Day #21 - Data Science - Maths Basics - Vectors and Matrices

**Matrix** - Combination of rows and columns

Check for linear dependence using row operations, e.g. R2 = R2 - 2R1

**When one of the rows becomes all zeros, the rows are linearly dependent**

**Span** - Set of all linear combinations of a set of vectors

**Rank** - Number of vectors in the largest linearly independent set


**Vector Space** - A space of vectors; a collection of many vectors

If V and W belong to the space, then **V+W also belongs to the space, and any scalar multiple of a vector stays in the space (e.g. in R²)**

If the **determinant is non-zero, the vectors are linearly independent**; otherwise they are linearly dependent

**Vector space properties**

- Commutative: x+y = y+x
- Associative: (x+y)+z = x+(y+z)
- Origin vector - the vector with all zeros, 0+x = x+0 = x
- Additive inverse - for every x there exists -x such that x+(-x) = 0
- Distributivity over scalar sum: (r+s)x = rx+sx
- Distributivity over vector sum: r(x+y) = rx+ry
- Identity multiplication: 1*x = x

**Subspace**

Given a vector space V and a subset W of V, W may be a subspace of V.

**Properties**

W is subspace in following conditions

- The zero vector belongs to W
- If u and v are vectors in W, u+v is in W (closure under addition)
- If v is any vector in W and c is any real number, c·v is in W (closure under scalar multiplication)

A vector v is in the span of a set S when v = r1v1 + r2v2 + ... + rkvk, where v1, ..., vk are distinct vectors from S and each ri belongs to R.

**Basis** - **A linearly independent spanning set: a set is a basis if every vector in the vector space is a linear combination of the set.** All bases for a vector space V have the same cardinality.

**Null Space, Row Space, Column Space**

Let A be an m x n matrix.

**Null Space** - The set of all solutions to Ax = 0; the null space of A, denoted Null A, is the set of all homogeneous solutions of Ax = 0

**Row Space** - The subspace of R^n spanned by the row vectors of A

**Column Space** - The subspace of R^m spanned by the column vectors of A

**Norms - Measure of length and magnitude**

- L1 norm - sum of absolute values: for (1,-1,2), L1 = 1+1+2 = 4
- L2 norm - Euclidean length: for (5,2), L2 = sqrt(5*5 + 2*2) = sqrt(29)
- L-infinity norm - maximum absolute component: for (5,2), L-infinity = 5

**Orthogonal**- Dot product equals Zero

**Orthogonality** - Orthogonal (perpendicular) vectors are linearly independent

**Orthogonal matrix will always have determinant +/-1**
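A numpy sketch of the norms and orthogonality facts above (my own illustration; the rotation matrix Q is an arbitrary orthogonal example):

```python
import numpy as np

v = np.array([1.0, -1.0, 2.0])
l1 = np.linalg.norm(v, 1)                    # |1| + |-1| + |2| = 4
l2 = np.linalg.norm(np.array([5.0, 2.0]))    # sqrt(5*5 + 2*2) = sqrt(29)
linf = np.linalg.norm(np.array([5.0, 2.0]), np.inf)  # max(|5|, |2|) = 5

# Orthogonal vectors: dot product is zero
a, b = np.array([1.0, 0.0]), np.array([0.0, 3.0])

# An orthogonal matrix (here a 90-degree rotation) has determinant +/-1
Q = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(l1, l2, linf, a @ b, np.linalg.det(Q))
```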

**Happy Learning!!!**

Labels:
Data Science Tips

## May 08, 2016

### Day #20 - PCA basics

Machine learning algorithms adjust themselves based on the input data set, very different from traditional rule-based / logic-based systems. The ability to tune itself and adapt to a changing data set makes it a self-learning / self-updating system. The inputs / updated data are, of course, supplied by humans.

**Basics**

- A line is 1D, a square is 2D, a cube is 3D
- Fundamentally, shapes are just sets of points
- An N-dimensional space is represented by an N-dimensional hypercube

**Feature Extraction**

- Converting a feature vector from a higher to a lower dimension

**PCA (Principal Component Analysis)**

- Input is a large number of correlated variables. We perform an orthogonal transformation to convert them into uncorrelated variables, and identify principal components based on the highest variation
- Orthogonal vectors - dot product equals zero; the components are perpendicular to each other
- This is achieved using SVD (Singular Value Decomposition)
- SVD internally solves the matrix and identifies the eigenvectors
- An eigenvector does not change direction when the linear transformation is applied
- PCA is used to explain variation in data: find the principal component with the largest variation, then the direction with the next highest variation (orthogonal to the first)
- Rotation or reflection is referred to as an orthogonal transformation
- PCA - use the components with the highest variation
- SVD - express the data as a matrix
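The steps above can be sketched with numpy's SVD (my own illustration on made-up correlated data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: the second variable is a noisy copy of the first
x = rng.normal(size=200)
X = np.column_stack([x, x + 0.1 * rng.normal(size=200)])

Xc = X - X.mean(axis=0)               # center the data before PCA
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                       # rows = principal directions (orthogonal)
explained = S**2 / (S**2).sum()       # fraction of variance per component
scores = Xc @ Vt.T                    # data expressed in the uncorrelated basis

print(explained)  # the first component carries almost all the variance
```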


**Happy Learning!!!**

Labels:
Data Science Tips

## May 03, 2016

### Day #19 - Probability Basics

**Concepts**

- **Events** - Subset of the sample space
- **Sample Space** - Set of all possible outcomes
- **Random Variable** - Captures the outcome of an experiment
- **Permutation** - Ordering matters
- **Combination** - Ordering does not matter
- **Binomial** - Only two outcomes per trial
- **Poisson** - Events that occur over and over again; the rate of events is denoted by lambda
- **Geometric** - Suppose you'd like to figure out how many attempts at **something are necessary until the first success occurs**, the probability of success is the same for each trial, and the trials are independent of each other; then you'd want the geometric distribution
- **Conditional Probability** - P(A given B) = probability that A will occur, assuming B has already occurred
- **Normal Distribution** - Appears because of the central limit theorem (Gaussian and normal distribution are the same)

**From Quora -**

"Consider a binomial distribution with parameters n and p. The distribution is underlined by only two outcomes in the run of an independent trial- success and failure. A binomial distribution converges to a Poisson distribution when the parameter n tends to infinity and the probability of success p tends to zero. These extreme behaviours of the two parameters make the mean constant i.e. n*p = mean of Poisson distribution "

Read Michael Lamar's answer to "Probability (statistics): What is the difference between binomial, Poisson and normal distributions?" on Quora.
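The binomial-to-Poisson convergence quoted above is easy to check numerically (my own sketch; lambda = 3 and k = 2 are arbitrary choices):

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials of probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(exactly k events when events occur at rate lambda)."""
    return lam**k * exp(-lam) / factorial(k)

lam = 3.0  # keep the mean n*p fixed while n grows and p shrinks
for n in (10, 100, 10000):
    p = lam / n
    print(n, binomial_pmf(2, n, p), poisson_pmf(2, lam))
```

As n increases with n*p held constant, the binomial probability approaches the Poisson probability.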

**Happy Learning!!!!**
Labels:
Data Science Tips

## May 01, 2016

### Day #18 - Linear Regression, K Nearest Neighbours

**Linear Regression**

- Fitting a straight line to a set of data points
- The line predicts new values based on previous observations
- Uses OLS (Ordinary Least Squares): minimize the squared error between each point and the line
- Parameters can also be found by maximum likelihood estimation
- R squared - fraction of the total variation in Y explained by the model
- R squared near 0 - poor fit
- R squared near 1 - good fit
- A high R squared indicates a good fit
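A minimal OLS sketch in plain Python (my own illustration; the data points are made up, roughly y = 2x):

```python
# Closed-form OLS for y = m*x + b, minimizing squared error,
# plus R squared as the fraction of variance explained.
def ols_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - m * mx
    return m, b

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]          # roughly y = 2x
m, b = ols_fit(xs, ys)

preds = [m * x + b for x in xs]
ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
ss_tot = sum((y - sum(ys) / len(ys)) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot         # close to 1 => good fit
print(m, b, r_squared)
```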

**KNN**

- Supervised machine learning technique
- A new data point is classified based on its distance from existing points
- Choice of K - small enough to pick only nearby neighbours
- Determine the value of K through trial tests
- Plot the K nearest neighbours on a scatter plot to identify the neighbours
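A small KNN classifier sketch (my own illustration; the two clusters and labels are made up):

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Classify query by majority vote among the k nearest training points."""
    neighbours = sorted(train, key=lambda pt: dist(pt[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

train = [((1.0, 1.0), "red"), ((1.2, 0.8), "red"), ((0.9, 1.1), "red"),
         ((5.0, 5.0), "blue"), ((5.2, 4.9), "blue"), ((4.8, 5.1), "blue")]

print(knn_predict(train, (1.1, 0.9)))  # "red"
print(knn_predict(train, (5.1, 5.0)))  # "blue"
```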

**Related Read**

Recommendation Algo Analysis

Linear Regression

**Happy Learning!!!**

Labels:
Data Science Tips

## April 22, 2016

### Neural Networks Basics

**Notes from Session**

- Neurons and synapses; models the brain at a high level
- Machine learning - algorithms for classification and prediction
- Mimics brain structure in technology
- Recommender engines use neural networks
- With more data we can increase the accuracy of models
- Linear regression: y = mx + b; fit the data set with as little error as possible

**Neural Network**

- The equation starts from the neuron
- Multiply weights by the inputs (weights are the coefficients)
- Apply an activation function (depends on the problem being solved)

**Basic Structure**

- Input layer
- Hidden layer (possibly multiple hidden layers) - computation is done at the hidden layers
- Output layer
- Supervised learning (train & test)
- The loss function quantifies the prediction error
- Deep learning - automatic feature detection
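The structure above, as a single forward pass in numpy (my own sketch; the layer sizes, random weights, and sigmoid activation are arbitrary choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer: input -> (weights, activation) -> output
rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)  # hidden layer: 2 inputs -> 3 units
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)  # output layer: 3 units -> 1 output

x = np.array([0.5, -1.0])            # input layer
hidden = sigmoid(W1 @ x + b1)        # computation done at the hidden layer
output = sigmoid(W2 @ hidden + b2)   # output layer

# A squared-error loss against a target, as used in supervised training
loss = float((output[0] - 1.0) ** 2)
print(output, loss)
```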

**Happy Learning!!!**

Labels:
Data Science

## April 14, 2016

### Basics - SUPPORT VECTOR MACHINES


**Key Notes**

- Allows non-linear decision boundaries
- SVM - out-of-the-box supervised learning technique
- Feature space - finite-dimensional vector space
- Each dimension represents a feature
- Goal of SVM - train a model that assigns unseen objects to a particular category
- Creates a linear partition of the feature space
- Based on its features, an object is placed above or below the separating hyperplane
- No stochastic element involved (no dependence on any previous state)
- Support vector classifiers, or soft margin classifiers, allow some observations to be on the incorrect side of the hyperplane, giving a soft margin

**Advantages** - High dimensionality, memory efficiency, versatility

**Disadvantages**- Non probabilistic
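A from-scratch sketch of a linear soft-margin SVM (my own illustration, trained by sub-gradient descent on the hinge loss rather than a library solver; the clusters, learning rate, and regularization strength are all made-up choices):

```python
import numpy as np

def train_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize hinge loss + lam*||w||^2 by per-sample sub-gradient steps."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:       # inside the margin: hinge active
                w += lr * (yi * xi - 2 * lam * w)
                b += lr * yi
            else:                           # correct side: only regularize
                w -= lr * 2 * lam * w
    return w, b

X = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.2],
              [5.0, 5.0], [6.0, 5.5], [5.5, 6.0]])
y = np.array([-1, -1, -1, 1, 1, 1])         # labels in {-1, +1}

w, b = train_svm(X, y)
predict = lambda p: int(np.sign(p @ w + b))  # which side of the hyperplane
print(predict(np.array([0.3, 0.3])), predict(np.array([5.5, 5.5])))
```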


**Happy Learning!!!**

Labels:
Data Science

## April 10, 2016

### Probability Tips

- **Discrete random variables are things we count** - a discrete variable can take only a countable number of values
- **Probability mass function (pmf)** - a function that gives the probability that a discrete random variable is exactly equal to some value
- **Continuous random variables are things we measure** - a **continuous random variable** is a random variable whose data can take **infinitely** many values
- **Probability density function (PDF)**, or density of a **continuous random variable** - a function that describes the relative likelihood of this random variable taking on a given value
- **Bernoulli process** - a finite or infinite sequence of binary random variables
- **Markov Chain** - a stochastic model describing a sequence of possible events in which the **probability of each event depends only on the state attained in the previous event**
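The Markov property is easy to demonstrate with a tiny two-state chain (my own sketch; the states and transition probabilities are made up):

```python
import random

# Two-state Markov chain: the next state depends only on the current state
transition = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def step(state, rng):
    probs = transition[state]
    return rng.choices(list(probs), weights=list(probs.values()))[0]

rng = random.Random(7)
state, counts = "sunny", {"sunny": 0, "rainy": 0}
for _ in range(100000):
    state = step(state, rng)
    counts[state] += 1

# Long-run fraction approaches the stationary distribution:
# pi = pi * P gives pi(sunny) = 2/3 for these probabilities
print(counts["sunny"] / 100000)
```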

**Let's Continue Learning!!!**

Labels:
Data Science Tips

### Day #15 - Data Science - Maths Basics


**Sets Basics**

- Cardinality - number of distinct elements in a set (for a finite set)
- For the real numbers, cardinality is infinite

The Fibonacci series was introduced in 1202 - Amazing :)

**Functions**

- Represents a relationship between mathematical variables
- The spread of all possible outputs is called the range
- For a function that maps from A to B, A is referred to as the domain and B as the co-domain

**Matrix**

- Rows and columns define a matrix
- A 2D array of numbers
- Eigenvalues are scalars and eigenvectors are vectors: a special set of values associated with a matrix M
- Eigenvectors - directions that remain unchanged by the action of matrix M
- Trace - sum of the diagonal elements
- Rank of a matrix - number of linearly independent vectors

**Determinant**

- Can be computed only for square matrix

**Vector**

- Vectors have magnitude (length) and direction
- The magnitude and the cosine of the angle give you the direction
- The vector (cross) product is non-commutative
- The dot product is commutative
- Vectors are linearly independent if none of them can be written as a sum of multiples of the other vectors
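A numpy sketch of the dot/cross product properties above (my own illustration; the unit vectors and the (3,4,0) example are arbitrary):

```python
import numpy as np

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])

# The dot product is commutative; the vector (cross) product is not
print(u @ v, v @ u)                      # equal
print(np.cross(u, v), np.cross(v, u))    # negatives of each other

# Magnitude (length), and direction via the cosine of the angle
w = np.array([3.0, 4.0, 0.0])
length = np.linalg.norm(w)                           # 5.0
cos_theta = (u @ w) / (np.linalg.norm(u) * length)   # 0.6
print(length, cos_theta)
```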

**Happy Learning!!!**

Labels:
Data Science Tips

## April 07, 2016

### Day #13 - Maths and Data Science

- Recommender systems - a pure matrix decomposition problem
- Deep learning - matrix calculus
- Google search - PageRank; social media graph analysis - eigen decomposition

**Happy Learning!!!**

Labels:
Data Science Tips

## April 04, 2016

### Ensemble

- Combine many predictors and take a weighted average
- May use a single kind of learner but multiple instances
- A collection of "OK" predictors, combined, can be powerful
- Learn predictors and combine them using another, new model
- One layer of predictors provides features for the next layer
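The weighted-average idea can be sketched in a few lines (my own illustration; the "predictors" are just the true value plus made-up noise):

```python
import random

random.seed(1)
true_value = 10.0

# Ten "OK" predictors: each gives the true value plus its own error
predictions = [true_value + random.uniform(-2.0, 2.0) for _ in range(10)]

# Equal-weight ensemble: the weighted average of the individual predictions
weights = [1.0 / len(predictions)] * len(predictions)
ensemble = sum(w * p for w, p in zip(weights, predictions))

errors = [abs(p - true_value) for p in predictions]
print(abs(ensemble - true_value), sum(errors) / len(errors))
```

Individual errors partly cancel in the average, which is why combining many mediocre predictors tends to beat any single one.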

**Happy Learning!!!**

Labels:
Data Science
