"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

May 09, 2021

Weekend lessons - Bias and Fairness

Key Lessons
  • What is Algo Bias
  • How we can identify Bias / Mitigate Bias

  • Appreciate and recognize the severity of the problem
  • Image - Watermelon


  • Based on culture, we may have inbuilt perceptions
  • Categorize, simplify, general representations
  • Sources of Algo Bias
  • Facial Bias across demographics
  • Age Detection - Performed worst on darker-skinned females
  • Different cultures different interpretations

  • Object recognition


  • Bias correlation with Income and Geography
  • World population vs dataset distribution



  • Types of Bias in Deep Learning Systems
  • Data does not include all representations
  • Data does not reflect real-world scenarios
  • General conclusions

Interpretation Driven
  • Trends in two variables
  • CS PhD graduates trend
  • unrelated correlations
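A minimal sketch of how two unrelated quantities can look strongly correlated just because both trend over time. The two series here are made-up numbers (hypothetical, not from the lecture), and `pearson` is a small helper written for this note.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

rng = random.Random(0)
years = range(20)

# Assumption: two synthetic series generated independently of each other.
# They share nothing except an upward trend over time.
series_a = [100 + 5.0 * t + rng.gauss(0, 3.0) for t in years]
series_b = [20 + 0.8 * t + rng.gauss(0, 0.5) for t in years]

r = pearson(series_a, series_b)
print(f"correlation between unrelated trending series: {r:.2f}")
```

The correlation comes out high even though neither series drives the other, which is the "does not capture the fundamental driving force" point.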

  • Does not capture fundamental driving force
  • Overgeneralization
  • Different perspectives

  • An improved dataset that accounts for the distribution

  • Procuring data of only certain situations
  • Not covering 100% of the options
Class / Feature Imbalances in Data
  • Real-world distribution vs Model distribution
  • Frequency in dataset vs real world
  • Binary classification class
  • Moving decision boundary
  • Decision boundary shifts due to class imbalance

  • Cancer from medical images MRI Scan

Mitigation Techniques
  1. Select and feed in class-balanced batches
  2. During learning, the model sees equal class distributions
  3. This gives a reasonable decision boundary
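The three steps above can be sketched roughly like this. It is a toy example under assumptions: `balanced_batches` is a hypothetical helper, the labels are made up (95 negatives, 5 positives), and rare classes are sampled with replacement so every batch can be filled.

```python
import random
from collections import defaultdict

def balanced_batches(labels, batch_size, num_batches, seed=0):
    """Yield lists of example indices with an equal count per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = sorted(by_class)
    per_class = batch_size // len(classes)
    for _ in range(num_batches):
        batch = []
        for c in classes:
            # Sample with replacement so a rare class can still fill its share.
            batch.extend(rng.choices(by_class[c], k=per_class))
        rng.shuffle(batch)
        yield batch

# Assumption: toy imbalanced dataset, 95 examples of class 0 and 5 of class 1.
labels = [0] * 95 + [1] * 5
batches = list(balanced_batches(labels, batch_size=8, num_batches=3))
first_batch_labels = [labels[i] for i in batches[0]]
print(first_batch_labels.count(0), first_batch_labels.count(1))  # prints "4 4"
```

The trade-off is that the few rare examples get repeated often, so the model can overfit to them.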


  • Weight the likelihood of selecting individual data points for training
  • More frequent - lower weight
  • Less frequent - Higher weight
  • Inverse of frequency 
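A small sketch of inverse-frequency weighting. The class names and counts are made up for illustration, and the normalization `n / (k * count)` is one common choice (an assumption here, not something stated in the lecture); it keeps the average weight over the dataset at 1.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return per-example weights and the per-class weight table."""
    counts = Counter(labels)
    n = len(labels)   # total number of examples
    k = len(counts)   # number of classes
    # More frequent class -> lower weight, less frequent class -> higher weight.
    class_weight = {c: n / (k * cnt) for c, cnt in counts.items()}
    return [class_weight[label] for label in labels], class_weight

# Assumption: hypothetical labels, 90 benign scans and 10 malignant scans.
labels = ["benign"] * 90 + ["malignant"] * 10
weights, class_weight = inverse_frequency_weights(labels)
print(class_weight)
```

These weights can be used either to scale the loss per example or as sampling probabilities when drawing training examples.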

  • Lack of diversity in feature spaces
  • Hair color of images
  1. Ground truth distribution of hair color
  2. Ground truth distribution of lipstick
  3. Ground truth distribution of face type
  4. Ground truth distribution of skin color


  • Bias exists in commercial-grade systems

Improve Fairness
  • Bias Mitigation
  • Bias model dataset learning pipeline

  • Evaluate Bias / Fairness
  • A model is fair with respect to a variable when its output does not change when conditioned on that variable
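One simple way to evaluate this is to break a metric down by group and compare. A minimal sketch, assuming made-up predictions and a hypothetical group attribute with values "A" and "B":

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each value of the group attribute."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}

# Assumption: toy labels and predictions, four examples per group.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 1, 1]
groups = ["A"] * 4 + ["B"] * 4

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)  # prints "{'A': 1.0, 'B': 0.5} 0.5"
```

A large gap between groups is a signal that the overall accuracy number is hiding a bias.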


  • Multitask learning / Adversarial Training
  • Start by specifying the sensitive attribute
  • Train the model to jointly predict the output and the attribute



  • Skin color, pose, illumination
  • VAE to learn the underlying distribution
  • Find the distribution of latent variables




  • Approximate distribution by histogram
  • Estimated joint distribution
  • Adaptive Adjustment of Resampling probability
  • Distribution of dark vs light skin tones
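A rough sketch of the resampling idea in the bullets above: approximate each latent variable's distribution with a histogram, multiply the per-dimension densities as an estimate of the joint, and sample rare examples more often. Assumptions: the latent values are made up (in practice they would come from the VAE encoder), the dimensions are treated as independent, and `num_bins` and the smoothing term `alpha` are hypothetical parameters.

```python
import random

def resampling_probabilities(latents, num_bins=10, alpha=0.01):
    """latents: list of equal-length tuples, one latent vector per example."""
    n = len(latents)
    dims = len(latents[0])
    scores = [1.0] * n
    for d in range(dims):
        values = [z[d] for z in latents]
        lo, hi = min(values), max(values)
        width = (hi - lo) / num_bins or 1.0  # avoid zero width if all values match
        bins = [min(int((v - lo) / width), num_bins - 1) for v in values]
        counts = [0] * num_bins
        for b in bins:
            counts[b] += 1
        for i, b in enumerate(bins):
            density = counts[b] / n  # histogram estimate for this dimension
            # Low estimated density -> high score -> sampled more often.
            scores[i] *= 1.0 / (density + alpha)
    total = sum(scores)
    return [s / total for s in scores]

# Assumption: nine examples clustered together, one example far away (under-represented).
latents = [(0.01 * i,) for i in range(9)] + [(1.0,)]
probs = resampling_probabilities(latents)

rng = random.Random(0)
resampled = rng.choices(range(len(latents)), weights=probs, k=1000)
print(f"rare example share after resampling: {resampled.count(9) / 1000:.2f}")
```

A smaller `alpha` pushes harder toward rare examples; a larger one keeps sampling closer to uniform.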


Happy Learning!!!

May 05, 2021

How Amazon Delivers On One-Day Shipping

 

What they may do behind the scenes?

  • The key to success is Inventory management
  • Quarterly / Yearly / Weekly demand updates are the key
  • Understanding buying patterns/trends, updating the numbers, and mastering forecast accuracy
  • Demand planning, forecasts run ahead of time, stock built up early
  • Overstock / Before every quarter, keep the anticipated high-frequency products in each category stocked up in all DCs
  • Not all products/categories may make money but the experience will make customers come again / buy again
  • Buy in Bulk and Stock it
  • Robotics arms, Robotics package movers
  • Order early and keep good margins
  • Ship between countries / Leverage different countries to optimize unsold inventory
  • They have their own Aircraft / Carriers / Own Transportation network
  • Building own logistics network
  • Marketplace / Third party sales 
    • Path1 - Seller - Warehouse - Customer
    • Path2 - Seller - Fulfilment - Customer
  • Pickers are measured by pick rates / only a 30-minute break / demanding work for pickers
  • Amazon is good at customer service but not sure how they treat their own Employees :)

Keep Exploring!!!

Digital Transformation of Last-Mile Delivery

Key Notes
  • Grouping Deliveries
  • Every day, a set of deliveries goes to a subset of customers
  • Delivery constraints - size of truck, time SLA
  • 200 Customers, 10 Trucks - many different combinations
  • From an academic point of view, checking every combination is not possible
  • We will never be able to go through all possibilities
  • Route Optimizer - Combine computed possibilities with human ideas
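To get a feel for why all possibilities can never be checked, here is a quick count for the 200 customers and 10 trucks case, followed by a simple greedy heuristic. The nearest-neighbor routine is a swapped-in illustration of "good enough" routing, not the optimizer described in the talk, and the coordinates are made up.

```python
import math

customers, trucks = 200, 10

# Each customer can go on any of the 10 trucks: 10**200 assignments,
# i.e. a 1 followed by 200 zeros, before even ordering the stops.
assignments = trucks ** customers
print(len(str(assignments)) - 1)  # prints "200"

# Ordering the 20 stops of a single truck adds 20! possibilities on top.
orderings_per_truck = math.factorial(20)
print(orderings_per_truck)  # prints "2432902008176640000"

def nearest_neighbor_route(depot, stops):
    """Greedy heuristic: always drive to the closest unvisited stop."""
    route, current, remaining = [], depot, list(stops)
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    return route

# Assumption: hypothetical x/y coordinates for a depot and three stops.
route = nearest_neighbor_route((0, 0), [(5, 5), (1, 0), (2, 1)])
print(route)  # prints "[(1, 0), (2, 1), (5, 5)]"
```

The heuristic gives a usable route quickly but not necessarily the best one, which is where planner knowledge still helps.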

  • Hard problem in delivery
  • Changing traffic patterns
  • Decreased parking availability
  • Smaller deliveries
  • Increasing cost per delivery
  • Challenges meeting same day delivery

Challenges
  • Too much emphasis on pieces
  • Extensive manual effort to generate route plans
  • Map on wall vs Map on computer vs real-time conditions
  • Assuming no variability and uncertainty
  • Continuous improvement is needed
  • Mobile Device = Data Capture Device
  • Data Science - Be smart dealing with data
  • Not an easy thing to figure out whether the data is right
  • With the cloud, store large volumes of customer data
  • Spot problems/inefficiencies
  • A data hub is easy to set up with current tech
  • Lat / Long in the ballpark
  • Different Delivery problems
  • Ecommerce has created a headache for the delivery business
  • Handling off-cycle orders
  • Planning for truck drivers
  • Spread out
  • Keep track of master routes
  • Returns are expensive
  • Master Routes
  • Day Balance
  • Deliver similar items on same truck
  • Assign delivery day for efficient routes
  • Data requirements
  • Product data
  • Volume estimations
  • Truck data
  • Stop time = fixed time + line items * variable time per item
  • Every customer has a distribution time
  • Drive time
  • Google, Bing, OpenStreetMap
  • With lots of customers, route optimization is difficult
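A small sketch of the stop time formula from the notes, combined with drive times into a route total. The numbers (5 minutes fixed, 0.5 minutes per line item, the drive legs) are assumptions for illustration; in practice drive times would come from a mapping service.

```python
def stop_time_minutes(line_items, fixed_minutes=5.0, minutes_per_item=0.5):
    """Stop time = fixed time + line items * variable time per item."""
    return fixed_minutes + line_items * minutes_per_item

def route_time_minutes(line_items_per_stop, drive_minutes_per_leg):
    """Total route time = sum of stop times + sum of drive times."""
    stop_total = sum(stop_time_minutes(n) for n in line_items_per_stop)
    return stop_total + sum(drive_minutes_per_leg)

# Assumption: three stops with 4, 10 and 2 line items, and three drive legs.
line_items_per_stop = [4, 10, 2]
drive_minutes_per_leg = [12, 8, 15]

total = route_time_minutes(line_items_per_stop, drive_minutes_per_leg)
print(total)  # prints "58.0"
```

The fixed and variable times could be fitted per customer from the data captured on the drivers' mobile devices.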



  • Easier to build mobile platforms





Collect drop point samples



Keep Learning!!!