"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

January 27, 2018

Day #100 - Ensemble Methods

It took more than a year to reach 100 posts. This is a significant milestone. Hoping to reach 200 soon.
  • Combining different machine learning models for more powerful predictions
  • Averaging or blending
  • Weighted averaging
  • Conditional averaging
  • Bagging
  • Boosting
  • Stacking
  • StackNet
Averaging ensemble methods
  • Combine two results with simple averaging
  • (model1+model2)/2
  • Considerable improvements can be achieved with simple averaging
  • Models can perform better combined than they do individually
  • Weighted average - (model1*0.7 + model2*0.3)
  • Conditional average - (if the prediction is < 50 use model1, else model2) - see the sketch below
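
A minimal sketch of these three averaging schemes, assuming model1 and model2 are already-fitted regressors and X is a feature matrix (the names, the 0.7/0.3 weights, and the threshold of 50 are just illustrative):

import numpy as np

def ensemble_predictions(model1, model2, X, threshold=50):
    p1 = model1.predict(X)                               # predictions of the first model
    p2 = model2.predict(X)                               # predictions of the second model
    simple_avg = (p1 + p2) / 2                           # plain average
    weighted_avg = 0.7 * p1 + 0.3 * p2                   # weighted average
    conditional_avg = np.where(p1 < threshold, p1, p2)   # if model1's prediction is < 50 use it, else use model2's
    return simple_avg, weighted_avg, conditional_avg
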
Bagging
  • Averaging slightly different versions of the same model to improve accuracy
  • Example - Random Forest
  • Underfitting - error due to bias
  • Overfitting - error due to variance
  • Parameters that control bagging - seed, subsampling (bootstrapping), shuffling, column subsampling, model-specific parameters, number of bags (more bags generally give better results), parallelism
  • BaggingClassifier and BaggingRegressor from sklearn
  • Bags are independent of each other, so they can be trained in parallel - a small example follows below
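
A small sketch with sklearn's BaggingClassifier on synthetic data; the parameter values are illustrative, and the default base estimator is a decision tree:

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=42)

bag = BaggingClassifier(
    n_estimators=50,      # number of bags; more bags usually give better results
    max_samples=0.8,      # row subsampling / bootstrapping
    max_features=0.8,     # column subsampling
    random_state=42,      # seed
    n_jobs=-1)            # bags are independent, so they can be built in parallel

print(cross_val_score(bag, X, y, cv=5).mean())
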
Boosting
  • Weight based boosting
  • A form of weighted averaging of models where each model is built sequentially, taking into account the performance of the previous models
  • Models are added sequentially so that each one corrects for how well the previous models have done
Weight based boosting
  • Weights can be represented as the number of times a certain row appears in the data
  • After each model, recalculate the weights based on each row's contribution to the error
  • Parameters - learning rate (shrinkage; a small value means trusting many models), number of estimators
  • Implementations - AdaBoost (sklearn, Python), LogitBoost (Weka, Java) - see the sketch below
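
A minimal AdaBoost sketch with sklearn on synthetic data (parameter values are illustrative):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=42)

ada = AdaBoostClassifier(
    n_estimators=200,    # number of sequentially built models
    learning_rate=0.1,   # shrinkage; a small value means trusting many models
    random_state=42)

print(cross_val_score(ada, X, y, cv=5).mean())
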
Residual based boosting
  • The dominant form of boosting in practice
  • Calculate the error (residual) of the predictions and its direction
  • Make the error the new target variable for the next model
  • Parameters - learning rate (also called shrinkage or eta)
  • Number of estimators
  • Row subsampling
  • Column subsampling
  • Sub-boosting type - fully gradient based, DART
  • XGBoost
  • LightGBM
  • H2O GBM (handles categorical variables out of the box)
  • CatBoost
  • sklearn's GBM (a toy residual-fitting sketch follows below)
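
A toy sketch of the residual idea using plain decision trees, just to show the "error becomes the new target" loop; real libraries such as XGBoost and LightGBM add many refinements on top of this:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, noise=10.0, random_state=42)

eta = 0.1                                  # learning rate / shrinkage / eta
pred = np.zeros(len(y))
trees = []
for _ in range(100):                       # number of estimators
    residual = y - pred                    # the error of the current predictions is the new target
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    trees.append(tree)
    pred += eta * tree.predict(X)          # move the predictions in the direction of the error

print(np.mean((y - pred) ** 2))            # training MSE shrinks as trees are added
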
Stacking
  • Make predictions with a number of models on a hold-out set, then train a different meta-model on those predictions
  • Steps for stacking predictions
  • Split the training set into two disjoint sets
  • Train several base learners on the first part
  • Make predictions with the base learners on the second (validation) part
  • Use the predictions from step (3) as the input to train a higher-level learner
  • Example - A and B are the two disjoint parts of the training set, C is the test set
  • Train Algo 0 on A, make predictions for B and C, and save them to B1 and C1
  • Train Algo 1 on A, make predictions for B and C, and save them to B1 and C1
  • Train Algo 2 on A, make predictions for B and C, and save them to B1 and C1
  • Train Algorithm 3 (the meta-model) on B1 and make predictions for C1 - see the sketch below
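
A sketch of the A/B/C recipe above on synthetic data; the two base models and the logistic-regression meta-model are illustrative choices:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)   # hold out C (the test set)
A_X, B_X, A_y, B_y = train_test_split(X_train, y_train, random_state=42)     # split the training set into A and B

base_models = [RandomForestClassifier(random_state=42),
               GradientBoostingClassifier(random_state=42)]

B1, C1 = [], []
for model in base_models:                         # Algo 0, Algo 1, ... trained on A
    model.fit(A_X, A_y)
    B1.append(model.predict_proba(B_X)[:, 1])     # predictions on B saved to B1
    C1.append(model.predict_proba(X_test)[:, 1])  # predictions on C saved to C1
B1, C1 = np.column_stack(B1), np.column_stack(C1)

meta = LogisticRegression().fit(B1, B_y)          # higher-level learner trained on B1
print(accuracy_score(y_test, meta.predict(C1)))   # its predictions for C1, scored against the test labels
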

Happy Learning!!!
