"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

January 24, 2018

Day #97 - Hyperparameter tuning

How to tune hyperparameters?
  • Identify which parameters affect the results most
  • Observe the impact of changing each parameter's value
  • Examine the results and iterate
Automatic hyperparameter tuning libraries (a minimal Hyperopt sketch follows the list)
  • Hyperopt
  • Scikit-optimize
  • Spearmint
  • GPyOpt
  • RoBO
  • SMAC3
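
To make the list concrete, here is a minimal Hyperopt sketch (not from the original post): it tunes a RandomForest on a synthetic dataset. The search space, dataset, and evaluation setup are illustrative assumptions, not recommendations.

# Minimal Hyperopt sketch: tune a RandomForest on a synthetic dataset.
# Assumes hyperopt and scikit-learn are installed; ranges are illustrative.
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=42)

def objective(params):
    model = RandomForestClassifier(
        n_estimators=int(params["n_estimators"]),  # quniform returns floats
        max_depth=int(params["max_depth"]),
        random_state=42,
    )
    score = cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()
    return {"loss": -score, "status": STATUS_OK}  # Hyperopt minimizes the loss

space = {
    "n_estimators": hp.quniform("n_estimators", 50, 300, 25),
    "max_depth": hp.quniform("max_depth", 2, 12, 1),
}

trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=20, trials=trials)
print("Best parameters:", best)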
Hyperparameter tuning by model family
  • Tree Based Models (Gradient Boosted Decision Trees - XGBoost, LightGBM, CatBoost)
  • RandomForest / ExtraTrees
Neural Nets
  • PyTorch, TensorFlow, Keras
Linear Models
  • SVM, Logistic Regression
  • Vowpal Wabbit, FTRL
Approach
  • Define a function that trains and evaluates the model (sketched after this list)
  • Specify the range of each hyperparameter
  • Keep the search range adequate: wide enough to cover good values, narrow enough to search efficiently
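
A sketch of this three-step approach, here using scikit-learn's RandomizedSearchCV as the search driver: (1) a model to run, (2) hyperparameter ranges, (3) a bounded search. The model, dataset, and ranges are illustrative assumptions.

# The three-step approach with RandomizedSearchCV; ranges are illustrative.
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, random_state=0)

param_distributions = {
    "n_estimators": randint(50, 500),
    "max_depth": randint(2, 8),
    "learning_rate": uniform(0.01, 0.3),  # samples from [0.01, 0.31)
    "subsample": uniform(0.5, 0.5),       # samples from [0.5, 1.0)
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions,
    n_iter=20,          # bounded search budget
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)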
Results (a quick train-vs-validation diagnostic is sketched after this list)
  • Underfitting
  • Overfitting
  • Good fit and generalization
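
One common way to tell these three outcomes apart is to compare the training score with the validation score. The sketch below uses illustrative thresholds that would need adjusting per problem.

# Diagnosing fit by comparing train vs. validation accuracy.
# Thresholds below are illustrative assumptions, not fixed rules.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
train_acc = model.score(X_tr, y_tr)
val_acc = model.score(X_val, y_val)

if train_acc < 0.8:                  # low score even on the training data
    print("Underfitting: increase model capacity")
elif train_acc - val_acc > 0.1:      # large train/validation gap
    print("Overfitting: add regularization or reduce capacity")
else:
    print("Good fit and generalization")
print(f"train={train_acc:.3f} val={val_acc:.3f}")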
Tree-based Models
  • GBDT - XGBoost, LightGBM, CatBoost
  • RandomForest, ExtraTrees - Scikit-learn
  • Others - RGF (Regularized Greedy Forest, baidu/fast_rgf)
GBDT (an XGBoost usage sketch follows the list)
  • XGBoost - max_depth, subsample, colsample_bytree, colsample_bylevel, min_child_weight, lambda, alpha, eta, num_round, seed
  • LightGBM - max_depth / num_leaves, bagging_fraction, feature_fraction, min_data_in_leaf, lambda_l1, lambda_l2, learning_rate, num_iterations, seed
  • sklearn.RandomForest/ExtraTrees - n_estimators, max_depth, max_features, min_samples_leaf, n_jobs, random_state
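
For reference, a minimal sketch of how the XGBoost knobs listed above map onto the core training API; the values and synthetic dataset are illustrative assumptions. Note that num_round corresponds to num_boost_round in xgb.train.

# Sketch of the XGBoost parameters listed above, with illustrative values.
# Assumes the xgboost package is installed.
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=7)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=7)

params = {
    "objective": "binary:logistic",
    "max_depth": 6,            # deeper trees fit more, risk overfitting
    "subsample": 0.8,          # row sampling per tree
    "colsample_bytree": 0.8,   # column sampling per tree
    "colsample_bylevel": 0.8,  # column sampling per tree level
    "min_child_weight": 1,     # minimum sum of instance weights in a leaf
    "lambda": 1.0,             # L2 regularization
    "alpha": 0.0,              # L1 regularization
    "eta": 0.1,                # learning rate
    "seed": 7,
}

dtrain = xgb.DMatrix(X_tr, label=y_tr)
dval = xgb.DMatrix(X_val, label=y_val)
booster = xgb.train(params, dtrain, num_boost_round=200,  # num_round
                    evals=[(dval, "validation")],
                    early_stopping_rounds=20, verbose_eval=False)
print("Best iteration:", booster.best_iteration)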

Happy Learning!!!
