"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

May 09, 2021

Weekend lessons - Bias and Fairness

Key Lessons
  • What algorithmic bias is
  • How we can identify and mitigate bias

  • Appreciate and recognize the severity of bias
  • Image - Watermelon


  • Based on culture, we may have inbuilt perceptions
  • Categorize, simplify, general representations
  • Sources of Algo Bias
  • Facial Bias across demographics
  • Age Detection - Performed worst on darker-skinned females
  • Different cultures, different interpretations

  • Object recognition


  • Bias correlation with Income and Geography
  • World population vs dataset distribution



  • Types of Bias in Deep Learning Systems
  • Data does not include all representations
  • Data does not reflect real-world scenarios
  • General conclusions

Interpretation Driven
  • Trends in two variables
  • CS graduates PhD trend
  • Unrelated correlations

  • Does not capture fundamental driving force
  • Overgeneralization
  • Different perspectives

  • An improved dataset that accounts for the distribution

  • Procuring data of only certain situations
  • Not covering 100% of the options

Class / Feature Imbalances in Data
  • Real-world distribution vs Model distribution
  • Frequency in dataset vs real world
  • Binary classification class
  • Moving decision boundary
  • Decision boundary shifts due to class imbalance

  • Cancer from medical images MRI Scan

Mitigation Techniques
  1. Select and feed in class-balanced batches
  2. During learning, the model sees equal class distributions
  3. This yields a reasonable decision boundary
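
The steps above can be sketched in a few lines of NumPy. This is only a minimal illustration under assumptions: the helper name `balanced_batch_indices`, the toy 95/5 label split, and the choice to sample rare classes with replacement are not from the lecture.

```python
import numpy as np

def balanced_batch_indices(labels, batch_size, rng):
    """Return indices for one batch with an equal number of samples per class.

    Hypothetical helper for illustration only.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    per_class = batch_size // len(classes)
    picks = []
    for c in classes:
        pool = np.flatnonzero(labels == c)
        # Sample with replacement so a rare class can still fill its share.
        picks.append(rng.choice(pool, size=per_class, replace=True))
    batch = np.concatenate(picks)
    rng.shuffle(batch)  # avoid ordering the batch by class
    return batch

rng = np.random.default_rng(0)
labels = np.array([0] * 95 + [1] * 5)  # toy data: 95% vs 5% imbalance
batch = balanced_batch_indices(labels, batch_size=32, rng=rng)
counts = np.bincount(labels[batch], minlength=2)
print(counts)  # each class contributes 16 of the 32 samples
```

Because the rare class is drawn with replacement, its few examples repeat often, so this approach is usually paired with data augmentation.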


  • Weight the likelihood of individual data points during training
  • More frequent - lower weight
  • Less frequent - higher weight
  • Weight is the inverse of frequency
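
A small sketch of inverse-frequency weighting, assuming class labels are the only thing being balanced. The function name `inverse_frequency_weights` and the normalization to a mean weight of 1 are illustrative choices, not something the lecture prescribes.

```python
import numpy as np

def inverse_frequency_weights(labels):
    """Per-sample weights proportional to 1 / (class frequency).

    Hypothetical helper; weights are rescaled so their mean is 1.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    freq = counts / counts.sum()
    weight_of = dict(zip(classes.tolist(), (1.0 / freq).tolist()))
    w = np.array([weight_of[y] for y in labels.tolist()])
    return w / w.mean()

labels = np.array([0] * 90 + [1] * 10)  # toy data: 90% vs 10%
weights = inverse_frequency_weights(labels)
print(weights[0], weights[-1])  # frequent class gets the smaller weight
```

With these weights, each class contributes the same total weight to the loss, even though one class has nine times as many samples.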

  • Lack of diversity in feature spaces
  • Hair color of images
  1. Ground truth distribution of hair color
  2. Ground truth distribution of lipstick
  3. Ground truth distribution of face type
  4. Ground truth distribution of skin color


  • Bias exists in commercial-grade systems

Improve Fairness
  • Bias Mitigation
  • Bias model dataset learning pipeline

  • Evaluate Bias / Fairness
  • A classifier is fair with respect to a set of variables if its output does not change when conditioned on them
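
One simple way to check this in practice is to compare metrics per group of a sensitive attribute. The sketch below is only an assumption-laden illustration: the function name `rates_by_group`, the toy predictions, and the choice of accuracy and positive rate as metrics are all hypothetical.

```python
import numpy as np

def rates_by_group(y_true, y_pred, group):
    """Accuracy and positive-prediction rate, conditioned on each group value."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    report = {}
    for g in np.unique(group):
        mask = group == g
        report[str(g)] = {
            "accuracy": float(np.mean(y_pred[mask] == y_true[mask])),
            "positive_rate": float(np.mean(y_pred[mask] == 1)),
        }
    return report

# Toy data: the model is perfect on group "a" but not on group "b".
y_true = np.array([1, 0, 1, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 1])
group = np.array(["a"] * 4 + ["b"] * 4)

report = rates_by_group(y_true, y_pred, group)
gap = abs(report["a"]["accuracy"] - report["b"]["accuracy"])
print(report, gap)
```

A large gap between groups suggests the output does depend on the sensitive attribute, which is the signal to look for before applying mitigation.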


  • Multitask learning / Adversarial Training
  • Start by specifying the sensitive attribute
  • Train the model to jointly predict the output and that attribute



  • Skin color, pose, illumination
  • Use a VAE to learn the underlying distribution
  • Find the distribution of latent variables




  • Approximate distribution by histogram
  • Estimated joint distribution
  • Adaptive Adjustment of Resampling probability
  • Distribution of dark vs light skin tones
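
The histogram-based resampling idea can be sketched as follows. This is a rough illustration, not the lecture's implementation: synthetic 2-D points stand in for the latent variables a VAE would produce, and the function name `resampling_probabilities`, the smoothing constant `alpha`, and taking the maximum across latent dimensions are assumptions.

```python
import numpy as np

def resampling_probabilities(latents, bins=10, alpha=0.001):
    """Sampling probability per data point, higher where latent density is low.

    Hypothetical helper: approximates each latent dimension by a histogram
    and weights each sample by the inverse of its (smoothed) bin density.
    """
    latents = np.asarray(latents, dtype=float)
    n, d = latents.shape
    probs = np.zeros(n)
    for j in range(d):
        z = latents[:, j]
        density, edges = np.histogram(z, bins=bins, density=True)
        # Bin index per sample; clip so the maximum value lands in the last bin.
        idx = np.clip(np.digitize(z, edges) - 1, 0, bins - 1)
        smoothed = density + alpha
        smoothed = smoothed / smoothed.sum()
        p = 1.0 / smoothed[idx]
        p = p / p.sum()
        # Keep the largest probability seen across latent dimensions.
        probs = np.maximum(probs, p)
    return probs / probs.sum()

rng = np.random.default_rng(0)
common = rng.normal(0.0, 0.5, size=(950, 2))  # over-represented region
rare = rng.normal(4.0, 0.5, size=(50, 2))     # under-represented region
latents = np.vstack([common, rare])

probs = resampling_probabilities(latents)
print(probs[:950].mean(), probs[950:].mean())  # rare points are sampled more often
```

During training, indices could then be drawn with `rng.choice(len(probs), size=batch_size, p=probs)`, so under-represented regions of the latent space appear more often in each batch.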




Ref - Link
Happy Learning!!!
