"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

August 30, 2019

Day #272 - Clustering to find Data Insights

For different kinds of data, we need to pick the right columns to surface the right insights. Most of these columns can be identified with domain knowledge before taking a clustering perspective on the analysis.
  • For sales data, the clustering intent was to find sales insights. The input columns were CustomerId, NumberOfOrders, and TotalOrderValue. Clustering on these grouped customers into order-value buckets: High Value, Medium, and Low Value.
  • For loss in a retail store, the clustering intent was to find loss patterns. The input columns were SkuId, LossCount, and LossValue. Clustering grouped SKUs into High Loss, Medium, and Low Loss buckets.
This helps to address the key loss items and focus on proactive measures to prevent further loss; a minimal sketch of this bucketing in R follows below. After a long time I picked up R again and had forgotten the execution shortcut Ctrl+A, Ctrl+Enter. Every editor, tool, and language has its own patterns and coding conventions.
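The sketch below shows one way the loss bucketing can be done in base R with kmeans. The column names (SkuId, LossCount, LossValue) match the ones listed above, but the simulated values and the bucket-naming logic are illustrative assumptions, not the original implementation.

# Minimal sketch: k-means bucketing of SKU loss data (values simulated for illustration)
set.seed(42)
loss_data <- data.frame(
  SkuId     = 1:100,
  LossCount = rpois(100, lambda = 5),
  LossValue = round(runif(100, min = 10, max = 500), 2)
)

# Scale both features so they contribute equally to the distance metric
features <- scale(loss_data[, c("LossCount", "LossValue")])

# Three clusters for the High / Medium / Low loss buckets
km <- kmeans(features, centers = 3, nstart = 25)
loss_data$Cluster <- km$cluster

# Name the buckets by ranking clusters on their mean loss value
bucket_order <- order(tapply(loss_data$LossValue, loss_data$Cluster, mean), decreasing = TRUE)
bucket_names <- c("High Loss", "Medium Loss", "Low Loss")
loss_data$Bucket <- bucket_names[match(loss_data$Cluster, bucket_order)]

table(loss_data$Bucket)

Selecting everything and running it with Ctrl+A, Ctrl+Enter in RStudio executes the whole script; swapping in CustomerId, NumberOfOrders, and TotalOrderValue gives the sales buckets the same way.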

This is the same idea as the 'Recency, Frequency and Monetary value' (RFM) model. I came to know about this today (7/6/2020). Sometimes what you have already implemented turns out to be an established pattern :)
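As a rough illustration of the RFM idea, the sketch below derives Recency, Frequency, and Monetary values per customer from a small invented orders table; the column names and snapshot date are assumptions for the example, not data from the original work.

# Minimal sketch: computing RFM values per customer (orders table is invented)
orders <- data.frame(
  CustomerId = c(1, 1, 2, 3, 3, 3),
  OrderDate  = as.Date(c("2019-06-01", "2019-08-15", "2019-05-20",
                         "2019-07-01", "2019-08-01", "2019-08-25")),
  OrderValue = c(120, 80, 45, 300, 150, 90)
)

snapshot_date <- as.Date("2019-08-30")   # reference date for recency

last_order <- aggregate(OrderDate ~ CustomerId, data = orders, FUN = max)
frequency  <- aggregate(OrderValue ~ CustomerId, data = orders, FUN = length)
monetary   <- aggregate(OrderValue ~ CustomerId, data = orders, FUN = sum)

rfm <- data.frame(
  CustomerId = last_order$CustomerId,
  Recency    = as.numeric(snapshot_date - last_order$OrderDate),  # days since last order
  Frequency  = frequency$OrderValue,                              # number of orders
  Monetary   = monetary$OrderValue                                # total order value
)
rfm

Clustering or quantile-scoring these three columns gives the same kind of high/medium/low customer buckets as the sales example above.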

Recency, Frequency, Monetary Model with Python — and how Sephora uses it to optimize their Google and Facebook Ads
Find Your Best Customers with Customer Segmentation in Python
Introduction to Customer Segmentation in Python

Happy Learning!!!
