Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): Day #96 - Mean Encoding

January 22, 2018

Day #96 - Mean Encoding - Extensions and Generalizations

Compact transformation of categorical variables
Powerful basis of feature engineering

Using the target variable in different tasks: regression, multi-class classification

Regression: more statistics than the mean (percentiles, std, distribution bins)
Multi-class: N different encodings, one per class, introduce new information from one-vs-all classifiers
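The two ideas above can be sketched with pandas. This is a minimal illustration, not a reference implementation: the column names (`city`, `price`, `label`) and the toy values are hypothetical, and regularization is left out for brevity.

```python
import pandas as pd

# Hypothetical toy data: a categorical feature and a continuous target.
reg = pd.DataFrame({
    "city": ["a", "a", "a", "b", "b", "b"],
    "price": [10.0, 20.0, 30.0, 100.0, 100.0, 130.0],
})

# Regression: several per-category statistics of the target, not only the mean.
reg_stats = reg.groupby("city")["price"].agg(
    price_mean="mean",
    price_std="std",
    price_median="median",
    price_p90=lambda s: s.quantile(0.9),
)
reg_encoded = reg.join(reg_stats, on="city")

# Multi-class: one encoding per class, each the mean of a one-vs-all indicator.
clf = pd.DataFrame({
    "city": ["a", "a", "a", "b", "b", "b"],
    "label": ["x", "y", "x", "z", "z", "y"],
})
indicators = pd.get_dummies(clf["label"], prefix="is").astype(float)
class_means = indicators.groupby(clf["city"]).mean()
clf_encoded = clf.join(class_means, on="city")
```

With N classes the categorical column turns into N numeric columns (`is_x`, `is_y`, `is_z` here), so each one-vs-all model also sees how the category relates to the other classes.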

Domains with many-to-many relationships

User to Apps relationships
Row for user-app relationship
Vector for each app
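One way to read the user-to-apps case, as a hedged sketch: put the data in long format (one row per user-app pair), mean-encode each app, then collapse each user's vector of app encodings into a few statistics. All names and values below are hypothetical, and the encoding is left unregularized to keep the example short.

```python
import pandas as pd

# Hypothetical data: one row per user, with the target on the user level.
users = pd.DataFrame({"user_id": [1, 2, 3, 4], "target": [1, 0, 1, 0]})

# Long format: one row per user-app relationship.
user_apps = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 4],
    "app": ["maps", "chat", "chat", "game", "maps", "game"],
})

# Attach the user's target to every user-app row, then mean-encode each app.
pairs = user_apps.merge(users, on="user_id")
app_mean = pairs.groupby("app")["target"].mean().rename("app_target_mean")
pairs = pairs.join(app_mean, on="app")

# Each user now has a vector of app encodings; collapse it with summary stats.
user_features = (
    pairs.groupby("user_id")["app_target_mean"]
    .agg(["mean", "min", "max"])
    .add_prefix("app_enc_")
    .reset_index()
)
users_encoded = users.merge(user_features, on="user_id", how="left")
```

The choice of mean, min, and max as summaries is an assumption; any statistic over the per-user vector would fit the same pattern.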

Time-series

Mean of the target over the previous day, previous week
Create more complicated features based on the data
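For time-ordered data, the encoding for a given day can be built only from earlier days, which limits target leakage by construction. The sketch below assumes a hypothetical `shop` category with one aggregated row per calendar day and no gaps; the column names and the 7-day window are illustrative choices.

```python
import pandas as pd

# Hypothetical daily log: several rows per day for one category.
ts = pd.DataFrame({
    "date": pd.to_datetime(
        ["2018-01-01", "2018-01-01", "2018-01-02", "2018-01-03", "2018-01-03"]
    ),
    "shop": ["s1"] * 5,
    "target": [1.0, 3.0, 4.0, 0.0, 2.0],
})

# Mean target per shop and day.
daily = ts.groupby(["shop", "date"], as_index=False)["target"].mean()
daily = daily.sort_values(["shop", "date"])

# Shift by one row per shop so each day only sees earlier days.
# This assumes one row per calendar day with no gaps.
daily["mean_prev_day"] = daily.groupby("shop")["target"].shift(1)

# Mean of the daily means over the previous 7 days (fewer at the start),
# excluding the current day.
daily["mean_prev_week"] = daily.groupby("shop")["target"].transform(
    lambda s: s.shift(1).rolling(window=7, min_periods=1).mean()
)

ts_encoded = ts.merge(
    daily[["shop", "date", "mean_prev_day", "mean_prev_week"]],
    on=["shop", "date"],
    how="left",
)
```

The first day of each shop has no history, so its encodings stay missing; how to fill them (for example with a global mean) is left open here.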

Encoding interactions and numerical features

Analyze the model structure by inspecting its trees
Extract interactions from decision trees (two features interact if they are split on in neighboring nodes)
xgboost, row features
Use split points to identify new features
Manually add more mean-encoded interactions
Evaluate interactions that involve categorical variables
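A rough sketch of the neighboring-nodes idea: count how often two different features are split on in a parent node and its child, then mean-encode the most frequent pair as one combined category. The notes mention xgboost; a single scikit-learn `DecisionTreeClassifier` is swapped in here as a simpler stand-in, and the helper `neighbor_pairs` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data: the target is 1 only when a == 1 and b == 1; c is noise.
rows = list(product([0, 1], repeat=3)) * 5
df = pd.DataFrame(rows, columns=["a", "b", "c"])
df["target"] = (df["a"] & df["b"]).astype(int)
features = ["a", "b", "c"]

tree = DecisionTreeClassifier(random_state=0).fit(df[features], df["target"])


def neighbor_pairs(fitted_tree, names):
    """Count pairs of different features split on in parent and child nodes."""
    t = fitted_tree.tree_
    pairs = Counter()
    for parent in range(t.node_count):
        if t.children_left[parent] == -1:  # leaf: no split here
            continue
        for child in (t.children_left[parent], t.children_right[parent]):
            if t.children_left[child] == -1:  # child is a leaf
                continue
            f1, f2 = names[t.feature[parent]], names[t.feature[child]]
            if f1 != f2:
                pairs[tuple(sorted((f1, f2)))] += 1
    return pairs


pair_counts = neighbor_pairs(tree, features)
(top_a, top_b), _ = pair_counts.most_common(1)[0]

# Mean-encode the most frequent interaction as one combined category.
df["interaction"] = df[top_a].astype(str) + "_" + df[top_b].astype(str)
df["interaction_mean"] = df.groupby("interaction")["target"].transform("mean")
```

For a boosted ensemble the same count would be summed over all trees; the encoding in the last two lines is unregularized and only shows the mechanics.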

Correct validation reminder
Local experiments

Estimate encodings on X_tr
Map them to X_tr and X_val
Regularize on X_tr
Validate the model on the X_tr / X_val split
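The local-experiment steps above can be sketched as follows. Additive smoothing is used as the regularizer purely as an example; the notes do not say which regularization is meant, and `alpha`, the column names, and the toy data are hypothetical.

```python
import pandas as pd

# Hypothetical toy data with one categorical feature.
df = pd.DataFrame({
    "cat": ["a", "a", "a", "b", "b", "c", "a", "b", "c", "d"],
    "target": [1, 1, 0, 0, 0, 1, 1, 1, 0, 1],
})

# Split first, so the validation target never touches the encoding.
X_tr, X_val = df.iloc[:6].copy(), df.iloc[6:].copy()

global_mean = X_tr["target"].mean()
stats = X_tr.groupby("cat")["target"].agg(["sum", "count"])

# Regularize with additive smoothing; alpha is a hypothetical tuning knob.
alpha = 2.0
smoothed = (stats["sum"] + alpha * global_mean) / (stats["count"] + alpha)

# Map the train-only encodings to both parts; unseen categories get the prior.
X_tr["cat_enc"] = X_tr["cat"].map(smoothed)
X_val["cat_enc"] = X_val["cat"].map(smoothed).fillna(global_mean)
```

The model would then be fit on `X_tr` and scored on `X_val`; category `d` appears only in validation, so it falls back to the global mean.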

Submission

Estimate encodings on the whole Train data
Map them to Train and Test
Regularize on Train
Fit on Train
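For the submission stage, a hedged sketch of the same flow on the full data: test rows receive encodings estimated on all of Train, while Train rows receive out-of-fold encodings as one possible regularizer. The fold scheme, fold count, and data are hypothetical choices, and the final model fit is omitted.

```python
import numpy as np
import pandas as pd

# Hypothetical train and test sets; test has no target.
train = pd.DataFrame({
    "cat": ["a", "a", "b", "b", "a", "b"],
    "target": [1, 0, 0, 0, 1, 1],
})
test = pd.DataFrame({"cat": ["a", "b", "c"]})

global_mean = train["target"].mean()

# Test rows: encodings estimated on the whole train set.
full_means = train.groupby("cat")["target"].mean()
test["cat_enc"] = test["cat"].map(full_means).fillna(global_mean)

# Train rows: out-of-fold encodings as the regularizer, so no row is
# encoded with its own target. The fold count is a hypothetical choice.
n_folds = 3
fold = np.arange(len(train)) % n_folds
train["cat_enc"] = np.nan
for k in range(n_folds):
    in_fold = fold == k
    means_k = train.loc[~in_fold].groupby("cat")["target"].mean()
    train.loc[in_fold, "cat_enc"] = train.loc[in_fold, "cat"].map(means_k)
train["cat_enc"] = train["cat_enc"].fillna(global_mean)
```

On real data the folds would normally be shuffled or stratified rather than assigned by row position.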

Happy Learning!!!
