"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

April 14, 2022

Federated Learning - How - Why - When

Summary from a Quick 5-Minute Tutorial

  • Clients train on the data available on their devices
  • Data stays decentralized
  • Training starts with a model shared from the server to the clients
  • The model is deployed to clients that have sufficient data
  • Each client trains on its local data and sends the updated model back to the server
  • Weights / biases are shared with the server
  • The server averages all the weights to create the final model (see the sketch after this list)
  • A collaborative and decentralized approach
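
The averaging step above is simple enough to sketch in plain NumPy. This is only an illustrative toy (the function and variable names are mine, not from any framework); it uses the example-weighted average that federated averaging is built on.

import numpy as np

def federated_average(client_weights, client_num_examples):
    """Example-weighted average of per-client model weights.

    client_weights: one list of layer arrays per client.
    client_num_examples: number of samples each client trained on.
    """
    total = sum(client_num_examples)
    averaged = []
    for layer in range(len(client_weights[0])):
        layer_avg = sum(
            weights[layer] * (n / total)
            for weights, n in zip(client_weights, client_num_examples))
        averaged.append(layer_avg)
    return averaged

# Two clients, each holding one weight matrix and one bias vector.
client_a = [np.ones((2, 2)), np.zeros(2)]
client_b = [np.zeros((2, 2)), np.ones(2)]
global_model = federated_average([client_a, client_b], [100, 300])
print(global_model)  # client_b dominates because it saw 3x more data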

Link - Session

Questions / Next Steps

  • Server configuration, tools, and packages required
  • Client configuration, tools, and packages required
  • How to train / run
  • How the model gets updated across multiple clients
  • Similar to data synchronization; need to investigate the infrastructure required to run it

TensorFlow Federated (TFF) is an open-source framework for experimenting with machine learning and other computations on decentralized data. TFF runtimes are expected to become available for the major device platforms.
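
As a quick smoke test that the TFF package and its local runtime are wired up, the tutorials open with a trivial federated computation along these lines; nothing here beyond the standard tff.federated_computation decorator is assumed.

import tensorflow_federated as tff

# A trivial federated computation executed on the local TFF runtime.
@tff.federated_computation
def hello_world():
    return 'Hello, World!'

print(hello_world())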

Code Example - Link

Tensorflow Federated Tutorials

Code - Link

Observations from the Code Walkthrough

  • Federated learning requires a federated data set
  • The TFF repository hosts a few datasets, including a federated version of MNIST
  • The simulation simply samples a random subset of the clients to be involved in each round of training
  • The model is constructed as an instance of tff.learning.Model (a sketch of these steps follows this list)
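
A condensed sketch of those steps, loosely following the TFF image-classification tutorial. The API names have moved between releases (newer versions expose tff.learning.models.from_keras_model and tff.learning.algorithms.build_weighted_fed_avg), and the client count, batch size, and model below are illustrative choices rather than the tutorial's exact values.

import collections
import random

import tensorflow as tf
import tensorflow_federated as tff

# 1. A federated dataset: TFF ships a federated version of (E)MNIST,
#    partitioned by writer, so each writer acts as one client.
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()

# 2. Preprocess one client's tf.data.Dataset into flat (x, y) batches.
def preprocess(dataset):
    def batch_format_fn(element):
        return collections.OrderedDict(
            x=tf.reshape(element['pixels'], [-1, 784]),
            y=tf.reshape(element['label'], [-1, 1]))
    return dataset.repeat(5).shuffle(100, seed=1).batch(20).map(batch_format_fn)

# 3. Sample a random subset of clients to take part in this round.
sampled_ids = random.sample(emnist_train.client_ids, 10)
federated_train_data = [
    preprocess(emnist_train.create_tf_dataset_for_client(cid))
    for cid in sampled_ids
]

# 4. Construct a tff.learning.Model by wrapping a plain Keras model.
def model_fn():
    keras_model = tf.keras.models.Sequential([
        tf.keras.layers.InputLayer(input_shape=(784,)),
        tf.keras.layers.Dense(10, kernel_initializer='zeros'),
        tf.keras.layers.Softmax(),
    ])
    return tff.learning.from_keras_model(
        keras_model,
        input_spec=federated_train_data[0].element_spec,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

# 5. Build the federated averaging process and run a single round.
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0))
state = iterative_process.initialize()
state, metrics = iterative_process.next(state, federated_train_data)
print(metrics)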

Research paper - Communication-Efficient Learning of Deep Networks from Decentralized Data

Key Notes

  • Federated Learning is a decentralized approach to model training.

Ideal problems for federated learning have the following properties: 

  • Training on real-world data from mobile devices provides a distinct advantage over training on proxy data
  • This data is privacy-sensitive or large in size (compared to the size of the model)

Federated optimization has several key properties:

  • Massively distributed; limited communication

There are two primary ways we can add computation: 

1) increased parallelism, where we use more clients working independently between each communication round; and, 

2) increased computation on each client, where rather than performing a simple computation like a gradient calculation, each client performs a more complex calculation between each communication round (a toy sketch of both knobs follows).
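
A toy sketch of both knobs on a synthetic linear-regression task, in plain NumPy (every name here is illustrative): C controls how many clients participate per round (parallelism), E controls how much local computation each client does before communicating.

import numpy as np

rng = np.random.default_rng(0)

def client_update(w, x, y, epochs, lr=0.1, batch_size=10):
    """Run `epochs` local passes of mini-batch SGD on one client's data."""
    w = w.copy()
    for _ in range(epochs):
        order = rng.permutation(len(x))
        for start in range(0, len(x), batch_size):
            batch = order[start:start + batch_size]
            grad = 2 * x[batch].T @ (x[batch] @ w - y[batch]) / len(batch)
            w -= lr * grad
    return w

# Synthetic federated data: 100 clients, each with 50 points of a noisy linear task.
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(100):
    x = rng.normal(size=(50, 2))
    clients.append((x, x @ true_w + 0.1 * rng.normal(size=50)))

w_global = np.zeros(2)
C, E = 10, 5  # 1) clients per round (parallelism), 2) local epochs (computation)
for round_num in range(20):
    chosen = rng.choice(len(clients), size=C, replace=False)
    updates = [client_update(w_global, *clients[i], epochs=E) for i in chosen]
    w_global = np.mean(updates, axis=0)  # unweighted average: clients hold equal data

print(w_global)  # approaches true_w after a few rounds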

Tensor Processing Units (TPUs) are Google's custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads

From Link

  • Differential Privacy - adding noise to ensure privacy (see the sketch after this list)
  • Secure Aggregation - the server only sees aggregated updates
  • Privacy is paramount in Federated Learning
  • IID - Independent and Identically Distributed data
  • Privacy and fairness often pull in opposite directions
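
A tiny illustration of the "adding noise" idea: clip each client update, average, then add Gaussian noise before the server applies it. The clip norm and noise multiplier below are arbitrary placeholders, not calibrated privacy parameters.

import numpy as np

def clip_update(update, clip_norm):
    """Scale an update down so its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / (norm + 1e-12))

def dp_average(client_updates, clip_norm=1.0, noise_multiplier=0.5, seed=None):
    rng = np.random.default_rng(seed)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    mean = np.mean(clipped, axis=0)
    # Noise scaled to the clipped sum, then divided by the number of clients.
    noise_std = noise_multiplier * clip_norm / len(client_updates)
    return mean + rng.normal(scale=noise_std, size=mean.shape)

updates = [np.random.default_rng(i).normal(size=4) for i in range(5)]
print(dp_average(updates, seed=42))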

CPU vs GPU vs TPU

  • CPU - Small models with small, effective batch sizes
  • GPU - Models with a significant number of custom TensorFlow/PyTorch/JAX operations that must run at least partially on CPUs. Medium-to-large models with larger effective batch sizes
  • TPU - Models that train for weeks or months. Large models with large effective batch sizes

Federated Learning in Vision Tasks | Umberto Michieli, PhD@Uni of Padova, Intern@Samsung Research

My Feedback - let's collect minimal data and build models; before we start to run, let's learn to crawl and walk :)

Keep Exploring!!!
