- Compact transformation of categorical variables
- Powerful basis of feature engineering
- More stats - Percentiles, std, distribution bins
- Introducing new information from one vs all classifiers in multi-class tasks (N Different encodings)
- User to Apps relationships
- Row for user-app relationship
- Vector for each app
- Presence of mean for prev day, prev week
- Based on data create more complicated features
- model structure, analyzing trees
- Extract from decision trees (If they are in neighboring nodes)
- xgboost, row features
- Use split points to identify new features
- Manually add more mean encoded interactions
- Evaluate interactions involving categorical variables
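A minimal sketch of a manually added mean-encoded interaction between two categorical variables. The column names (`city`, `device`, `target`) and the toy data are hypothetical, and the encoding here is deliberately unregularized, so it only illustrates the mechanics:

```python
import pandas as pd

# Toy data; all column names are assumptions made for illustration.
df = pd.DataFrame({
    "city":   ["a", "a", "b", "b", "a", "b"],
    "device": ["x", "y", "x", "x", "x", "y"],
    "target": [1, 0, 1, 0, 1, 1],
})

# Combine the two categorical variables into a single interaction key.
df["city_device"] = df["city"] + "_" + df["device"]

# Mean-encode the interaction: average target per (city, device) pair.
# Note: computed on the same rows it is applied to, so it leaks the target;
# in practice it would be regularized before being fed to a model.
df["city_device_mean"] = df.groupby("city_device")["target"].transform("mean")

print(df[["city_device", "city_device_mean"]])
```

Other per-group statistics (for example `"std"` or a quantile) could be swapped in for `"mean"` in the same `groupby` call.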
Local experiments
- Estimate encodings on X_tr
- Map them to X_tr and X_val
- Regularize on X_tr
- Validate model on X_tr / X_val split
- Estimate encodings on whole Train data
- Map them to Train and Test
- Regularize on Train
- Fit on Train
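The local-experiment steps above can be sketched roughly as follows. This assumes out-of-fold (K-fold) means as the regularization scheme, which is one common choice and not necessarily the one intended in these notes. The helper names `fit_encoding` and `kfold_encode`, the column names, and the random toy data are all hypothetical:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

def fit_encoding(frame, col, target):
    """Estimate per-category target means on `frame`."""
    return frame.groupby(col)[target].mean()

def kfold_encode(frame, col, target, n_splits=5, seed=0):
    """Out-of-fold mean encoding: each row gets means estimated on the other folds."""
    global_mean = frame[target].mean()
    encoded = pd.Series(np.nan, index=frame.index, dtype=float)
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, enc_idx in kf.split(frame):
        means = fit_encoding(frame.iloc[fit_idx], col, target)
        encoded.iloc[enc_idx] = frame[col].iloc[enc_idx].map(means).to_numpy()
    # Categories unseen in the fitting folds fall back to the global mean.
    return encoded.fillna(global_mean)

# Random toy data standing in for the real Train set.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "cat": rng.choice(list("abcd"), size=200),
    "y": rng.integers(0, 2, size=200),
})
X_tr, X_val = data.iloc[:150].copy(), data.iloc[150:].copy()

# Estimate encodings on X_tr and map them to X_val.
means_tr = fit_encoding(X_tr, "cat", "y")
X_val["cat_enc"] = X_val["cat"].map(means_tr).fillna(X_tr["y"].mean())

# Regularize on X_tr: use out-of-fold means instead of the raw ones.
X_tr["cat_enc"] = kfold_encode(X_tr, "cat", "y")

# A model would then be fit on X_tr and validated on X_val (omitted here).
```

The final stage reuses the same two helpers, with the whole Train set in place of `X_tr` and Test in place of `X_val`.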