EDA is an art, and visualizations are its tools. It usually takes several different plots to support or reject a hypothesis.
Visualization Tools
- Histograms (split the values into bins and count how many points fall in each bin; vary the number of bins) - plt.hist(x)
- XGBoost will benefit from explicit missing values
- Plots - index versus value, plt.plot(x, '.'), to check whether values are random over the indices
- Statistics
- Scatter Plots (draw one feature vs. another); plotting train and test sets together checks whether they are distributed the same way
- Correlation Plots (run K-means clustering and reorder the features) - show how similar features are
- Plot (index vs feature statistics)
- Generate new features based on groups
- Scatter plot, scatter matrix
- Correlation Plot (Corrplot)
- Corrplot + Clustering
- Plot (Index vs feature statistics)
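
The plots listed above can be tried out in one small script. This is only a rough sketch under stated assumptions: the data is synthetic, the names `X`, `kmeans_labels` and `order` are made up for illustration, and the tiny k-means helper is a stand-in for a proper library implementation (for example scikit-learn's `KMeans`).

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

# Synthetic data (an assumption for illustration): features 0 and 2 share one
# signal, features 1 and 3 share another, so they form two correlated groups.
rng = np.random.default_rng(0)
n = 500
base_a = rng.normal(size=n)
base_b = rng.normal(size=n)
X = np.column_stack([
    base_a + 0.1 * rng.normal(size=n),
    base_b + 0.1 * rng.normal(size=n),
    base_a + 0.1 * rng.normal(size=n),
    base_b + 0.1 * rng.normal(size=n),
])


def kmeans_labels(points, k, n_iter=10):
    """Tiny k-means for illustration only (hypothetical helper, not a library API)."""
    # Deterministic farthest-point initialisation keeps the sketch reproducible.
    centers = [points[0]]
    for _ in range(k - 1):
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[dist.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each point to its nearest center, then recompute the centers.
        sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Histogram: try several values of `bins`, since one setting can hide structure.
axes[0, 0].hist(X[:, 0], bins=30)
axes[0, 0].set_title("Histogram of feature 0")

# Index vs. value: visible bands or trends hint that the rows are not shuffled.
axes[0, 1].plot(X[:, 0], ".")
axes[0, 1].set_title("Index vs. value")

# Scatter plot: one feature against another.
axes[1, 0].scatter(X[:, 0], X[:, 2], s=5)
axes[1, 0].set_title("Feature 0 vs. feature 2")

# Correlation plot + clustering: cluster the rows of the correlation matrix,
# then reorder the features so similar ones sit next to each other.
corr = np.corrcoef(X, rowvar=False)
labels = kmeans_labels(corr, k=2)
order = np.argsort(labels, kind="stable")
im = axes[1, 1].imshow(corr[np.ix_(order, order)], vmin=-1, vmax=1, cmap="coolwarm")
axes[1, 1].set_xticks(range(len(order)))
axes[1, 1].set_xticklabels(order)
axes[1, 1].set_yticks(range(len(order)))
axes[1, 1].set_yticklabels(order)
axes[1, 1].set_title("Correlation, features reordered by cluster")
fig.colorbar(im, ax=axes[1, 1])

fig.tight_layout()
```

To look at the result, call `fig.savefig(...)` with a file name of your choice, or drop the `matplotlib.use("Agg")` line and call `plt.show()`. On real data the number of clusters `k` is a judgment call; 2 only fits this toy example.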
More Read (Link)
Happy Learning and Coding!!!