"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

February 18, 2020

Day #323 - Tech Talk Series - Data Science

Session #1 - Feature Engineering for Tabular Data

Key Notes
  • Column Aggregates
  • Independent Columns
  • Derive New Features
  • Target Encoding (Categorical Features)
  • Global Feature Encoding (Categorical Features)
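To make the target encoding note concrete, here is a minimal sketch in plain Python. The function name `target_encode`, the `smoothing` parameter, and the toy data are all assumptions made for illustration, not something shown in the talk. Each category is replaced by a smoothed mean of the target, so rare categories are pulled toward the global mean.

```python
from collections import defaultdict

def target_encode(categories, targets, smoothing=1.0):
    """Map each category to a smoothed mean of the target.

    `smoothing` is a hypothetical knob: larger values pull rare
    categories closer to the global target mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }

# Toy data (made up): a categorical column and a binary target.
cities = ["A", "A", "B", "B", "B", "C"]
bought = [1, 0, 1, 1, 1, 0]

encoding = target_encode(cities, bought, smoothing=1.0)
encoded_column = [encoding[c] for c in cities]
```

In practice the encoding should be fitted on training rows only (or out-of-fold), otherwise the target leaks into the feature.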
Derive Features
  • Time - Months, Years, Days, WeekDays, Periods, Distance
  • Missing or Not Missing
  • Numerical Feature - Scale change, log, exp (Feature Transformations)
  • Integer Part, Decimal Part, Mod, Quotient
  • Categorical - Merge, One Hot Encoding
  • Group Features - multiply them, divide them
  • Ratio Conversions
  • Binning Columns
  • Remove Outliers
  • Cluster Data and Perform Regression on it
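A few of the derived features listed above can be sketched on a single record as follows. The column names (`order_date`, `signup_date`, `amount`, `items`), the bin width of 50, and the sample row are assumptions picked for illustration only.

```python
import math
from datetime import date

def derive_features(row):
    """Return new features derived from one raw record (a dict)."""
    d = row["order_date"]
    amount = row["amount"]
    features = {
        # Time parts and a time distance
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "weekday": d.weekday(),  # Monday = 0
        "days_since_signup": (d - row["signup_date"]).days,
        # Missing-or-not indicator
        "amount_missing": int(amount is None),
    }
    # Numeric features are only derived when the value is present.
    if amount is not None:
        integer_part, decimal_part = divmod(amount, 1)
        features.update({
            "amount_log": math.log1p(amount),          # log transform
            "amount_int": int(integer_part),           # integer part
            "amount_decimal": round(decimal_part, 2),  # decimal part
            "amount_per_item": amount / row["items"],  # ratio, assumes items >= 1
            "amount_bin": min(int(amount // 50), 3),   # bins of width 50, last bin open-ended
        })
    return features

# Made-up sample record.
row = {
    "order_date": date(2020, 2, 18),
    "signup_date": date(2020, 1, 1),
    "amount": 129.99,
    "items": 3,
}
features = derive_features(row)
```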

Session #2 - ML for Optimization Problems

Key Lessons
  •  Maximise something (reward), minimise something (cost)
Solution Approach
  • Linear Optimization
  • Linear Objective and set of Linear constraints
  • Dynamic Optimization (Reinforcement Learning)
  • Non-Linear Optimization (Genetic Algorithms, Simulations)
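As a small illustration of a linear objective with a set of linear constraints, the sketch below solves a two-variable problem by checking every corner of the feasible region. The helper `solve_lp_2d` and all coefficients are made up for this example. It is a teaching sketch that relies on the fact that a bounded linear program attains its optimum at a vertex; it is not a substitute for a real solver.

```python
from itertools import combinations

def solve_lp_2d(objective, constraints, eps=1e-9):
    """Maximise objective[0]*x + objective[1]*y subject to a*x + b*y <= c.

    Only for two variables and a bounded, non-empty feasible region:
    every intersection of two constraint boundaries is tested as a
    candidate vertex.
    """
    best_point, best_value = None, float("-inf")
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:  # parallel boundaries do not give a vertex
            continue
        x = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        feasible = all(a * x + b * y <= c + eps for a, b, c in constraints)
        value = objective[0] * x + objective[1] * y
        if feasible and value > best_value:
            best_point, best_value = (x, y), value
    return best_point, best_value

# Illustrative problem: maximise reward 3x + 5y under capacity limits.
constraints = [
    (1, 0, 4),    # x <= 4
    (0, 2, 12),   # 2y <= 12
    (3, 2, 18),   # 3x + 2y <= 18
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]
point, value = solve_lp_2d((3, 5), constraints)
print(point, value)  # optimum at x = 2, y = 6 with value 36
```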
Simulation Optimization
  •  Build simulation of real life problem
ML
  • Simulate with decision variables
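The idea of simulating with decision variables can be sketched with a toy stocking problem: the order quantity is the decision variable, demand is random, and the simulation scores each candidate. The prices, the demand distribution, and the function `simulate_profit` are assumptions for illustration. The search here is a plain grid search over candidates, not an ML or reinforcement learning method; a learned model could take the place of the grid.

```python
import random

def simulate_profit(order_qty, n_days=2000, seed=0):
    """Average daily profit for a given order quantity (the decision variable)."""
    rng = random.Random(seed)  # same seed, so every candidate sees the same demand
    unit_cost, unit_price = 4.0, 10.0  # made-up economics
    total = 0.0
    for _ in range(n_days):
        demand = max(0, round(rng.gauss(100, 20)))  # assumed demand model
        sold = min(order_qty, demand)
        # Reward from sales minus cost of everything ordered.
        total += unit_price * sold - unit_cost * order_qty
    return total / n_days

# Evaluate each candidate decision in the simulation and keep the best one.
candidates = range(60, 141, 5)
best_qty = max(candidates, key=simulate_profit)
print(best_qty, round(simulate_profit(best_qty), 2))
```

Reusing the same random seed for every candidate keeps the comparison fair, since differences in profit then come from the decision and not from luck in the demand draw.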

Happy Learning!!!
