"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

March 03, 2020

My Observations in Data Science Work


  • Code Credit - I have observed people take code from git/forums and change headers/rewrite methods without providing due credit. I call out explicitly the reference/approach from paper/presentation/tech talks. The original work would have taken months to arrive at it. It is better to reference it for anyone to understand the reasons behind the design
  • Domain Knowledge - Ignoring the value of domain knowledge - To find a possible loss item, there is a mix of techniques in Retail employed. Implementing the same as events/actions may be the dumbest way for someone who undervalues domain knowledge :)
  • Code Management - Treating ML code as Enterprise Software - Models are deployed without source code on edge devices. We use disparate sources of data. My recommendation is to build a high accuracy model and then consider merging the code into a common code / common class.
  • Resellers - Do not get carried away with third-party tools, frameworks. Try with open source, build your datasets, better models. Buying from the third party AI software and re-selling it we cannot call it as 'Analytics Company' :)
  • ROI - Set realistic expectations, target the low hanging fruits, understand ML as 'Preparedness'. ML cannot be quantified in terms of revenue unless it is a clear business case like Market Basket Analysis, Sold together, Bought together

Keep Thinking!!!

No comments: