"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

March 19, 2020

Day #333 - Deep Learning Guidelines

CI/CD, DL frameworks, and Buy vs Build are each a different set of challenges. The more you learn, the more you realize how much is left to learn :). Doing, debugging, and testing everything is all part of learning. Keep going!!!

Different challenges call for different levels of learning:
  • Mastering Keras vs PyTorch vs TensorFlow
  • Knowing advanced features of data pipelines / porting to edge devices
  • Building the end-to-end flow of Edge Analytics -> Data Consolidation -> Reporting
  • Deploying this overall end-to-end solution
  • Accuracy / understanding real-world challenges and the next incremental steps
This link provides a good guideline 

The ML tools landscape is very useful



Key Notes
Step #1 - Data
  • Data Storage
  • Data ETL Process (Workflow / Async Process)
  • Data Labelling (Raw Data -> Modelled)
  • Data Versioning

Step #2 - Development / Training
  • DL Frameworks
  • Source code management
  • Store & Retrieve Results
  • Distributed Training

Step #3 - Deployment
  • Build Tools
  • Web Deployments
  • Monitoring predictions
  • Edge Devices / Custom Hardware Deployment
DL Frameworks


Key Notes
  • Caffe - C++ based (Fintech used Caffe)
  • TensorFlow - Google (Mobile, JS, Scalable Deployment) - Abstraction - Computational Graph
  • Keras - High-level API (wrapper) on top of TensorFlow
  • PyTorch - Facebook product
ML Code Management for Training / Deployment / Serving







Key Lessons
  • Training System (Model Development)
  • Production System (Ready to use Model, Setup)
  • Serving System (Web App or anything that serves model)
At all three levels, a set of tests validates each layer: training tests, model tests, and production serving tests.
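To make the three test levels concrete, here is a minimal sketch using a toy one-parameter model in plain Python. Everything in it (`train`, `serve`, the `test_*` functions, the data, and the thresholds) is an assumption made up for illustration; a real project would run similar checks against its own training code, saved model, and web app.

```python
import json


def train(xs, ys, steps=200, lr=0.05):
    """Fit y = w * x with plain gradient descent; return (w, loss history)."""
    w, history = 0.0, []
    n = len(xs)
    for _ in range(steps):
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        history.append(loss)
        w -= lr * grad
    return w, history


def serve(w, request_body: str) -> str:
    """Stand-in for a serving endpoint: JSON request in, JSON response out."""
    x = float(json.loads(request_body)["x"])
    return json.dumps({"prediction": w * x})


# Hypothetical toy data following y = 2x.
TRAIN_X, TRAIN_Y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]


def test_training():
    # Training system: the loss should go down over the run.
    _, history = train(TRAIN_X, TRAIN_Y)
    assert history[-1] < history[0]


def test_model():
    # Production model: prediction quality on a held-out example.
    w, _ = train(TRAIN_X, TRAIN_Y)
    assert abs(w * 4.0 - 8.0) < 0.1


def test_serving():
    # Serving system: the request/response contract still holds.
    w, _ = train(TRAIN_X, TRAIN_Y)
    response = json.loads(serve(w, '{"x": 4}'))
    assert abs(response["prediction"] - 8.0) < 0.1
```

The point of the split is that each test can fail independently: a broken training loop, a model that no longer meets its quality bar, or a serving layer whose input/output format has drifted.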

Infrastructure (Buy vs Build)




Deep Learning Optimization






Data Versioning



Key Lessons
  • Unversioned data (L0) - plain file system
  • Versioned with snapshots (L1) - daily data, data backup tagged with the date
  • Versioned as a mix of assets and code (L2) - JSON or any other labeled storage
  • Specialized solution (L3) - DVC, Pachyderm, Quilt
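As a rough illustration of the snapshot level (L1), the sketch below copies a data folder into a dated backup and adds a content hash so that a change in the data is easy to detect. The function names `dataset_fingerprint` and `snapshot` and the folder naming scheme are assumptions for this example only; specialized tools handle large files, remote storage, and lineage far more thoroughly.

```python
import hashlib
import shutil
from datetime import date
from pathlib import Path


def dataset_fingerprint(data_dir: Path) -> str:
    """Hash relative file names and contents in a stable (sorted) order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(data_dir).as_posix().encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()


def snapshot(data_dir: Path, backup_root: Path, day: date) -> Path:
    """Copy data_dir to backup_root/<YYYY-MM-DD>-<short hash> if not already there."""
    tag = f"{day.isoformat()}-{dataset_fingerprint(data_dir)[:12]}"
    target = backup_root / tag
    if not target.exists():
        shutil.copytree(data_dir, target)
    return target
```

Calling `snapshot(Path("data"), Path("backups"), date.today())` once a day would give one folder per distinct data state; recording the returned folder name next to a training run is a simple way to tie results back to the data that produced them.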
Training Neural Nets: a Hacker’s Perspective
Common Coding Mistakes
  • Incorrect tensor shapes
  • Preprocessing inputs incorrectly
  • Incorrect loss function
  • Numerical computation errors (NaN)
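A few cheap guards catch several of the mistakes above early. The sketch below uses plain Python lists instead of real tensors so it runs without any framework; `check_shape`, `cross_entropy`, and `assert_finite` are hypothetical helper names for this example. The loss uses the log-sum-exp trick, which avoids the overflow a naive `exp(logit)` hits on large logits.

```python
import math
from typing import Sequence


def check_shape(batch: Sequence[Sequence[float]], expected_cols: int) -> None:
    """Fail fast when a 2-D batch does not have the expected width."""
    for i, row in enumerate(batch):
        if len(row) != expected_cols:
            raise ValueError(
                f"row {i} has {len(row)} values, expected {expected_cols}"
            )


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Cross-entropy from raw logits, computed with the log-sum-exp trick."""
    m = max(logits)  # subtracting the max keeps every exp() argument <= 0
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def assert_finite(value: float, name: str = "loss") -> float:
    """Raise as soon as a NaN or infinity shows up instead of training on it."""
    if math.isnan(value) or math.isinf(value):
        raise FloatingPointError(f"{name} is not finite: {value}")
    return value


# Usage: validate the batch, then compute and check the loss per example.
batch = [[1000.0, 0.0], [0.0, 0.0]]
check_shape(batch, expected_cols=2)
losses = [assert_finite(cross_entropy(row, target=0)) for row in batch]
```

In a real framework the same ideas map to asserting on tensor shapes right after loading a batch and checking the loss for NaN/Inf after every step.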
Troubleshooting Deep Neural Networks

Happy Learning!!!
