- Performs classification by finding the optimal separating hyperplane: the hyperplane that separates the two classes while maximizing the distance to the closest point of either class, a distance called the margin
- Training involves non-linear optimization
- Objective function is convex
- So, the solution to the optimization problem is relatively straightforward
- Ridge (L2-penalized regression; also a convex objective)
- Lasso (L1-penalized regression; also convex; see the sketch below)
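As a rough, non-authoritative sketch of the points above, the snippet below fits a maximum-margin linear SVM and then Ridge and Lasso with scikit-learn; the synthetic blobs, the regression target, and all parameter values (C, alpha) are assumptions made only for illustration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import Ridge, Lasso

# Toy two-class data: two blobs that are (mostly) linearly separable.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=-2, scale=1, size=(50, 2)),
               rng.normal(loc=+2, scale=1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Linear SVM: finds the separating hyperplane w.x + b = 0 that
# maximizes the margin (distance to the closest points of each class).
svm = SVC(kernel="linear", C=1.0).fit(X, y)
w, b = svm.coef_[0], svm.intercept_[0]
margin_width = 2.0 / np.linalg.norm(w)     # width of the margin
print("hyperplane:", w, b, "margin width:", margin_width)
print("number of support vectors (closest points):", len(svm.support_))

# Ridge (L2) and Lasso (L1) are also convex objectives, so their
# optimization is likewise straightforward.
target = X @ np.array([1.5, -0.5]) + rng.normal(scale=0.1, size=100)
print("ridge coefs:", Ridge(alpha=1.0).fit(X, target).coef_)
print("lasso coefs:", Lasso(alpha=0.1).fit(X, target).coef_)
```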
- Standard gradient descent over the full training set is also called batch gradient descent
- Stochastic gradient descent (SGD): process one example at a time and update the weights immediately
- Cheaper computation per update
- Randomization helps escape shallow valleys and poor ("silly") local minima
- Simplest possible optimization
- SGD is widely applied in training neural networks (see the sketch below)
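A minimal sketch of the contrast between per-example (stochastic) updates and a full-batch gradient step, using least-squares linear regression; the data, learning rates, and epoch counts are illustrative assumptions, not values from the original notes.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

def sgd(X, y, lr=0.01, epochs=20):
    """Stochastic gradient descent: update after each single example."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):      # randomized order each epoch
            grad_i = (X[i] @ w - y[i]) * X[i]  # gradient of one example's squared error
            w -= lr * grad_i                   # cheap update, one example at a time
    return w

def batch_gd(X, y, lr=0.1, steps=200):
    """Batch gradient descent: each update uses the gradient over all examples."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

print("SGD estimate:  ", sgd(X, y))
print("Batch estimate:", batch_gd(X, y))
print("True weights:  ", true_w)
```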
Tip #4 - Gradient Descent
- Meant to minimize a (possibly non-linear) error function
- Works best when the error measure is a convex function
- Finds a local minimum
- Procedure: initialize -> iterate until termination -> adjust by the learning rate -> terminate at a local minimum (see the sketch after this list)
- Return the weights
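The loop below is one way to realize those steps (initialize, iterate, adjust by the learning rate, terminate at a local minimum, return the weights); the quadratic example function and all constants are placeholders chosen for the sketch.

```python
import numpy as np

def gradient_descent(grad, w0, lr=0.1, tol=1e-6, max_iters=10_000):
    """Generic gradient descent: initialize, iterate, terminate, return weights."""
    w = np.asarray(w0, dtype=float)        # initialize
    for _ in range(max_iters):             # iterate until termination
        step = lr * grad(w)                # adjust by the learning rate
        w = w - step
        if np.linalg.norm(step) < tol:     # terminate near a (local) minimum
            break
    return w                               # return weights

# Example: minimize the convex error f(w) = (w0 - 3)^2 + (w1 + 1)^2,
# whose gradient is (2(w0 - 3), 2(w1 + 1)); the minimum is at (3, -1).
grad_f = lambda w: np.array([2 * (w[0] - 3), 2 * (w[1] + 1)])
print(gradient_descent(grad_f, w0=[0.0, 0.0]))   # approximately [3., -1.]
```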
Tip #5 - Bias and Variance
- Models with too few parameters may underfit, leading to high bias
- Models with too many parameters may be inaccurate due to large variance (overfitting), as the sketch below illustrates
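To make the bias/variance trade-off concrete, this hedged sketch compares an underparameterized and an overparameterized polynomial fit on noisy data; the degrees, noise level, and train/test split are assumptions chosen just to show the effect.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 40))[:, None]
y = np.sin(2 * np.pi * x).ravel() + rng.normal(scale=0.2, size=40)
x_train, y_train = x[::2], y[::2]          # half for training
x_test, y_test = x[1::2], y[1::2]          # half for testing

for degree in (1, 4, 15):                  # too few, moderate, too many parameters
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    print(f"degree {degree:2d}: "
          f"train MSE = {mean_squared_error(y_train, model.predict(x_train)):.3f}, "
          f"test MSE = {mean_squared_error(y_test, model.predict(x_test)):.3f}")
# Degree 1 underfits (high bias); degree 15 chases the training noise (high variance).
```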