"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

April 30, 2019

Data story behind Food delivery Apps

I use food delivery apps heavily, both Swiggy and Uber Eats. Here are my views and reflections on the data story, measures, and machine learning use cases behind these applications.

My observations are based on using the applications. Uber Eats highlights the following based on the historical data it collects:
  • Previously ordered restaurants 
  • Previously ordered items highlighted
  • Review based listings 
  • Projecting estimated delivery times
I personally face challenges while trying to shift to a low-calorie diet as recommendations are more tuned for past orders.
  • Recommending a similar item every day from other restaurants based on historical data
  • No option to set preferences for the coming week - Balanced diet customized to need /preferences based on user choices for a week 
  • Food quality issues exist no matter how good the review ratings are
I have worked on real-time systems and reporting, and then moved to AI. We now have the tools to query data in motion, historical data, and forecasts of future data. Together these give a complete end-to-end perspective for understanding the data and the numbers. Some of the metrics / measures below overlap across transactions, historical data, and AI.

Key Metrics / Measures
  • Average Order delivery time at different times (Morning / Lunch / Evening / Holidays / Weekends)
  • Average Order pickup time at different times
  • Order acceptance rate
  • Clicks/ conversions 
  • A/B experiment and conversions 
  • Payments type vs orders
  • Average menu browsing time
  • Frequently searched items across days / restaurants / seasons
  • Predict order delays using Traffic data
  • Peak sellers
  • Top customers 
  • Weekday trends
  • Top trends based on seasonality 
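
As a rough illustration of the first metric, here is a minimal sketch of average delivery time per period of the day. The order records, the period boundaries, and the function names are all assumptions made up for this example, not data or APIs from either app.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical order records: (order placed, order delivered) timestamps.
ORDERS = [
    ("2019-04-29 08:10", "2019-04-29 08:40"),
    ("2019-04-29 12:30", "2019-04-29 13:15"),
    ("2019-04-29 13:05", "2019-04-29 13:40"),
    ("2019-04-29 20:00", "2019-04-29 20:50"),
]


def period_of_day(ts):
    """Bucket a timestamp into an assumed morning / lunch / evening window."""
    if ts.hour < 11:
        return "morning"
    if ts.hour < 16:
        return "lunch"
    return "evening"


def average_delivery_minutes(orders):
    """Return the mean delivery time in minutes for each period of the day."""
    fmt = "%Y-%m-%d %H:%M"
    durations = defaultdict(list)
    for placed, delivered in orders:
        start = datetime.strptime(placed, fmt)
        end = datetime.strptime(delivered, fmt)
        durations[period_of_day(start)].append((end - start).total_seconds() / 60)
    return {period: sum(vals) / len(vals) for period, vals in durations.items()}


averages = average_delivery_minutes(ORDERS)
for period, minutes in averages.items():
    print(period, round(minutes, 1))
```

The same grouping idea extends to weekends and holidays by changing the bucketing function.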
Data science use cases
  • Forecast on volumes of items based on historical data 
  • OCR, Recommendation at User Level / Sold together items
  • Deep learning for automated food classification, tagging
  • Segmenting customers based on Age / Gender / Veg / Non-Veg / Cuisine Choices and providing recommendations
  • Forecast Order Volumes and assign Delivery partners based on Projected numbers to reduce other delays

Everything that is measurable can be managed, monitored, and improved. More quality aspects need to be integrated, since we risk trusting ratings as a proxy for quality. I hope the quality bar keeps improving and the story evolves into another version that customizes based on personal diet plans and choices. Happy finding the data story behind these food delivery apps!!!

April 28, 2019

Day #245 - Exploratory Data Analysis Goals

  • Find Insights, Turn it into Why-questions
  • Seek surprises (sudden peaks and lows), harness them into How-questions
  • Plot data in different dimensions Month / Year / Sales, Find Insights in every perspective
Learn the story behind the numbers!!!
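
One simple way to "seek surprises" is to flag points that sit far from the mean. The sketch below uses a z-score with an assumed threshold of 2; the monthly sales numbers are made up for illustration, and a real analysis would likely account for trend and seasonality first.

```python
from statistics import mean, pstdev


def find_surprises(values, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # a flat series has no peaks or lows
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical monthly sales with one sudden peak in the sixth month.
monthly_sales = [100, 102, 98, 101, 99, 180, 100, 97, 103, 100, 99, 101]
surprises = find_surprises(monthly_sales)
print(surprises)
```

Each flagged point is a candidate for a how-question: how did that month end up so different from the rest?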

Happy Mastering DL!!!

Innovation Session Notes

Today was interesting. Attended Innovation Session. Very inspiring, motivational session. Thanks Akshay Cherian.
  • Communicating to brain
  • Communicating with emotion
Different types of Questions
  • Know How
  • Know When
  • Know What
  • Know Why
Great Learnings / Habits
  • Listening at different levels
  • Insights learnt
  • Find Insights, Turn it into question
  • Seek Surprises, Harness it
  • Insights + States of Flow loop on each other
  • We perform best when the work keeps us excited, not overwhelmed
  • Creativity is presence of constraints
  • Create disproportionate value
  • Solve by doing, Comfortable with failure
  • It will hurt a bit if you are doing something meaningful
Books
Geography of Genius
States of Flow Assessment
Steps
  • List, Cluster, Reorder
  • Start Challenge vs Idea
  • Turn Top barriers into Questions
  • Rate Idea Valuable / Simple
  • Turn Ideas into Steps
  • Review Ideas
  • Create and Share output
Fresher Tips
  • Work for free until they see value
Challenges / Problems / Opportunities
  • Redefinition of problem 
  • Think from Celebration
  • Pissed off is better than passion
  • Treat it as a game
  • Breakthrough tools
  • Breakthrough environment
  • Don't define the problem by Single word
  • Meeting for Evolution not Evaluation of ideas
  • Reduce perceived Risk of Sharing
Futuristic
  • Don't operate from crisis to crisis
  • Anticipate and prepare for next-gen risks
  • Plan a budget for time and money
  • Manage time and invest time differently
  • Learn a combination of skills
Key Lessons
  • Just as we pay monthly bills, how much time and money did you budget for learning, upskilling, or coding? This question felt like a slap in the face.
  • If someone is in a role you want to be in, look at their previous role and see what they did to move into that role
  • If someone is successful, look at the patterns and practices they follow; don't put it down to luck or motivation. Emulate them
Sharing my Earlier Work

Why it fails


Happy Learning!!!

April 27, 2019

Day #244 - Data Annotation Guidelines

  • Quality of Images and capturing significant traits like styles / shapes / colors
  • Object to annotate captured from nearest possible view (Best Possible Angle)
  • Impact of poor background light / night and too far images. Discard low quality / poor miniature of objects (Occurs in edges of image)
  • Handling partial objects
  • When annotating multiple object classes, check for class imbalance between them and fix it before training. Analyze the number of objects and occurrences per class (the distribution) to balance sampling
  • Check for Data set impacts for Daylight / Night and annotate / train / build model accordingly
Guideline - The object being annotated should occupy the center of the image, captured from the nearest and clearest view, so that the side / front is recognizable and a reasonable amount of features like color / style is visible
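
The class imbalance check can be sketched as a simple count over the annotations. The annotation list, class names, and function names below are hypothetical; in practice the pairs would come from your own annotation files.

```python
from collections import Counter

# Hypothetical annotations: one (image file, class label) pair per bounding box.
ANNOTATIONS = [
    ("img_001.jpg", "car"), ("img_001.jpg", "car"), ("img_001.jpg", "person"),
    ("img_002.jpg", "car"), ("img_003.jpg", "car"), ("img_003.jpg", "bicycle"),
]


def class_distribution(annotations):
    """Return the fraction of annotated objects that belong to each class."""
    counts = Counter(label for _, label in annotations)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def imbalance_ratio(annotations):
    """Ratio of the most frequent class count to the least frequent one."""
    counts = Counter(label for _, label in annotations)
    return max(counts.values()) / min(counts.values())


print(class_distribution(ANNOTATIONS))
print(imbalance_ratio(ANNOTATIONS))
```

A high ratio is a hint to collect more examples of the rare classes or to resample before training.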

Good data / quality data is as important as the model / approach we take.

Happy Mastering DL!!!

April 26, 2019

Day #243 - Retail Analytics Opportunities

Notes from Recently Attended Session
Instore Retail
  • Store Managers
  • Associates usage
  • Online / Offline Users
Customer Experience
  • Optimize layout
  • Improve promotion effectiveness
  • Shopper Journey Outcome
Store Performance
  • Segmentation 
  • Forecasting
Store Managers
  • People Person
  • Genuine interest for customers, supervising, nurture ideas
  • People Skills, Sales Skills, Management Skills
Store Manager
  • Stocking
  • Delegate Activity
  • Customer Service
  • Sales projections
  • Readiness
As Required
  • Forecasting
  • Staffing
  • Hiring
  • Associate Performance
Challenges
  • Difficult Customers
  • Personal issues
  • Customer Expectations
  • Not get yelled at
Sale
  • Conversation that ends in a transaction
  • Know better, focus on training people
  • Product knowledge
Apple Selling Philosophy
  • A - Approach (Welcome Approach)
  • P - Probe - Needs
  • P - Provide Solutions
  • L - Listen Concerns / Issues
  • E - End with farewell / Invitation
Power Hours - Most Sale time / peak hours
  • Sales Split by hour
  • Sales Split by Week
  • Labour vs Traffic Approach
  • % of sales at that Hour
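
A minimal sketch of the sales split by hour, using made-up transactions. The data and function name are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical transactions: (hour of day, sale amount).
SALES = [(10, 200.0), (11, 300.0), (18, 900.0), (18, 600.0), (19, 1000.0)]


def sales_share_by_hour(sales):
    """Return each hour's percentage share of total sales."""
    totals = defaultdict(float)
    for hour, amount in sales:
        totals[hour] += amount
    grand_total = sum(totals.values())
    return {hour: 100.0 * amount / grand_total for hour, amount in sorted(totals.items())}


shares = sales_share_by_hour(SALES)
power_hour = max(shares, key=shares.get)  # hour with the largest share of sales
print(shares)
print("Power hour:", power_hour)
```

Comparing this split against staffing per hour gives the labour vs traffic view.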
Types of Retailers
  • Malls - Retailers - Pantaloons / Shoppers Stop
  • Individual Stores - Kirana Stores
  • Online Sales - Amazon / Flipkart
  • Chain of Stores - Pothys / Chennai Silks (Have own sourcing units) - Brand Conscious
Power Centers
  • Maximum Value Proposition
  • Competitive prices
  • Volume & competitive price
Factors to Setup Stores
  • Demographics
  • Income levels
  • Spending group
  • Frequency of spending
  • Family Area
  • Single People
  • Average basket size
  • Average shopper duration time
  • Cultural aspects
Models / Recommendations
  • Model for income group
  • Model for domain
  • Model for age group
Happy Retailing!!!

April 20, 2019

Artificial Intelligence (AI) Podcast - Lex Fridman

Excellent talk with deep tech conversations, principles and thought process. Some of the questions, interesting lines I liked from the podcast
  • AI Assisted driving for a safer and better world
  • Dream of Autopilot - Autonomy revolution
  • Design Choices - Instrument Cluster, Display, Sensor suites
  • Display - Health check on vehicles perception of reality
  • Inputs - Camera, Radar, Ultrasonics, GPS
  • Information rendered into vector space with lane lines, traffic lights
  • Vector space re-rendered on display for people to understand the system
  • Considered parts / Uncertainties - Road Segmentation, Vehicle detection, object detection other techniques underlying
  • Debug Views - Augmented Vision with boxes, labels, Visualizer vector space representation from all sensors
  • Technical Aspects, Neural Network, Data, Hardware to allocate resources
  • Data - Vast amounts (12 ultrasonic sensors, GPS, IMU), 400K cars on the road
  • The massive inflow of data
  • Full self-driving computer development in progress
  • Cameras at FULL frame rate, FULL Resolution 
  • Driving - Learn from Edge Cases
  • Autopilot disengagements - Aspects / Ideas
  • Take over for convenience / Optimal spline for traversing the intersection
  • Navigate complex intersection
  • Lane change based /freeway/highway interchange 
  • Automatically overtake slow cars
  • Exit freeway
  • Full Self Driving Computer in Production
  • Tesla is Appreciating Asset
  • Navigate Parking Lots
  • Metric (Incidents per mile)
  • Assess the probability of a crash, injury, permanent injury, death
  • Video of faces/body
  • Moving from Elevator support to Automatic Elevator
  • Body Pose, Cognitive Load
  • Camera-based driver monitoring
  • If the system is more reliable than a human, Driver Monitoring won't help much
  • Operational Design Domain
  • Instrument Cluster Display, Capabilities
  • Neural Net - Basic Bunch of Matrix Math
  • Learn both on valid and invalid data
  • What is a car
  • What is definitely not a car
  • Key ideas for Artificial General Intelligence
  • Tesla Goal - World's best Self Driving Vehicle
  • AI will convince you to fall in love with it

Happy Mastering DL!!!

April 16, 2019

Day #242 - Working with labelImg

Download labelImg from link

Key things 

  • Open Directory, Create Bounding Box, Label the object, Save the bounding box
  • XML will be generated with the coordinates
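
The generated XML can be read back with the Python standard library. This is a minimal sketch, assuming the Pascal VOC style layout that labelImg writes by default; the sample file content and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the Pascal VOC style layout (assumed labelImg default).
SAMPLE_XML = """
<annotation>
  <filename>car_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>48</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) from one annotation."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"), coords))
    return boxes


print(read_boxes(SAMPLE_XML))
```

For a real file, `ET.parse(path).getroot()` can replace `ET.fromstring`.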




Happy Mastering DL!!!

Day #241 - Tensorflow on CPU - Object Detection






Finally was able to train Custom Object Detection



Notes on Custom Object Detection (Notes - Link )
Step #1 - Define Inputs - Specify files in TFRecord file format
Step #2 - Configure Train_config. Key Values are

  • Model parameter initialization.
  • Input preprocessing.
  • SGD parameters.

Step #3 - fine_tune_checkpoint should provide a path to a pre-existing checkpoint. To speed up training, it is recommended to re-use the feature extractor parameters from a pre-existing image classification or object detection checkpoint
Step #4 - SGD - hyperparameters for gradient descent
Step #5 - Evaluator Config

To get reasonable mAP@IoU scores for object detection API:

1. Try varying the Intersection over Union (IoU) threshold, e.g 0.2-0.5 and see if you get an increase in average precision. You would have to modify matching_iou_threshold parameter in object_detection/utils/object_detection_evaluation.py

2. Try different evaluator classes (the default one is EVAL_DEFAULT_METRIC = 'pascal_voc_detection_metrics'). If you are training on Open Image Dataset it makes sense to use open_images_V2_detection_metrics

3. Check your eval config file and increase the number of examples used in the evaluation set, e.g.

eval_config: {
  num_examples: 20000
  num_visualizations: 16
  min_score_threshold: 0.2
  # Note: The below line limits the evaluation process to 1 evaluation.
  # Remove the below line to evaluate indefinitely.
  max_evals: 1
}

4. Train the object detector for more iterations
5. Check current mAP against reported metrics (e.g. COCO mAP@IoU=0.5)
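
To make the IoU threshold in point 1 concrete, here is a minimal sketch of Intersection over Union for two axis-aligned boxes. The function names are illustrative and are not part of the Object Detection API; boxes are assumed to be (xmin, ymin, xmax, ymax) tuples.

```python
def iou(box_a, box_b):
    """Intersection over Union for two (xmin, ymin, xmax, ymax) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so boxes that do not overlap give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(predicted, ground_truth, threshold=0.5):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return iou(predicted, ground_truth) >= threshold


pred, truth = (0, 0, 10, 10), (5, 5, 15, 15)
print(iou(pred, truth))
print(is_true_positive(pred, truth, threshold=0.5))
print(is_true_positive(pred, truth, threshold=0.1))
```

The same prediction flips from miss to match as the threshold drops, which is why average precision rises when the threshold is lowered.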

Step by Step: Build Your Custom Real-Time Object Detector  - Link
Detectron2 Train an Instance Segmentation Model
Installing the Tensorflow Object Detection API

For custom object training, BMW has shared their open-source framework. It is a packaged version of the complete object detection setup (YOLO / TensorFlow; this is a good set of tools).

I haven't experimented with it, but it is a good candidate to use as a common tool. It was released a few months back. I have been working on my Windows setup for a while.

Happy Mastering DL!!!

April 15, 2019

Day #240 - Setting up Tensorflow GPU on windows 10

This post is based on the session link. I have made a few changes and updates to it.




Happy Mastering DL!!!

Day #239 - Home Depot Retail Data Science Cases

Key Lessons
  • 45% of online orders picked up from stores
  • Data Science for better search, recommendation, personalization
  • Product search - similar product search with images
  • Personalization - bought together, sold together
  • Weather, Seasonality, Trends
  • Segmentation by product, division
  • Crowd behavior in-store (retail store-level analytics)
  • Relevancy of Search Engine










Happy Mastering DL!!!

April 12, 2019

Day #238 - Working with coco dataset

coco - common objects in context

Installation Steps https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md

Packages
pip install Cython
pip install contextlib2
pip install matplotlib
pip install pycocotools
pip install scikit-image
pip install --upgrade scikit-image

Demo Code (Minor Changes)



Happy Mastering DL!!!

April 11, 2019

Day #237 - Working on Car Detection - OpenVino

Detailed steps are mentioned in link






Happy Mastering DL!!!

April 09, 2019

Day #236 - Papers on Person Re-Identification

Paper #1 - Camera Style Adaptation for Person Re-identification

Key Lessons
  • Person Reidentification - Given Query Person, Retrieve person from multiple sources
  • Challenges - Resolution, Environment, Illumination
  • Camera Style Adaptation Approach - unsupervised, camera-invariant property
Techniques
  • Each image in an input pair is partitioned into three overlapping horizontal parts and passed through a Siamese CNN model, which learns their similarity using cosine distance
Paper #2 - Simple Online and Realtime Tracking with a Deep Association Metric
Techniques
  • Kalman filtering in image space and frame by frame
  • Kalman filter with constant velocity motion
Paper #3 - In Defense of the Triplet Loss for Person Re-Identification
Techniques
  • A plain CNN with a triplet loss 
Triplet Loss
Key Lessons
  • Look at Anchor, Distance with Positive Example, Distance with Negative Example
  • 3 Images at a time Anchor, Positive, Negative Image
  • APNN
  • d(A,P) = 0.5 Set Margin to achieve it for positive / negative
  • L(A,P,N) = Max(||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + Alpha, 0)
  • Choosing Triplets Randomly
  • Map Training Set into Triplets
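
A minimal sketch of the triplet loss in its standard hinge form, max(d(A,P) - d(A,N) + alpha, 0), on toy embeddings. The vectors stand in for f(A), f(P), f(N) and the margin value is an arbitrary assumption; a real setup would get embeddings from the CNN.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Hinge-style triplet loss: zero once the negative is far enough away."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + alpha, 0.0)


anchor, positive = [0.0, 0.0], [0.1, 0.0]
easy_negative = [1.0, 0.0]   # far from the anchor, so no loss
hard_negative = [0.2, 0.0]   # almost as close as the positive, so positive loss

print(triplet_loss(anchor, positive, easy_negative))
print(triplet_loss(anchor, positive, hard_negative))
```

Randomly chosen triplets are mostly easy ones like the first case and contribute zero loss, so they teach the network very little.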
Example - Link1, Link2



Happy Mastering DL!!!

Day #235 - PyTorch developer conference part 1

Session #1 - Engineering Practices for Software 2.0
Key Lessons
  • New Programming Paradigm for Neural Networks
  • SGD writes code in weights of neural network
  • Tune Dataset, Tune model architecture, Tune the optimization
  • NN in Tesla for Autopilot
Best Practices for 2.0 Stack
  • Test Driven Development Workflow - Test set manually created, clean, Carefully curated test set
  • CI Workflow - Automate build - Unit Tests - Automate Deployment
  • Dataset is part of code - Automate Neural Network Training Jobs - Compile into Weights - Automate Deployments
  • Timestamp your data
  • Mono-repos in practice
Session #2 - Applied Deep Learning
Key Lessons
  • Many Research Projects use PyTorch
  • PyTorch - Simple, Extensible, Fast
Projects
  • Deep Learning SuperSampling - New GPU, Realtime better graphics
  • NN for super resolution
  • DL for real time graphics
  • Inpainting. http://research.nvidia.com/inpainting
  • Image and Video Synthesis - https://github.com/NVIDIA/vid2vid, Create videos with temporal consistency
  • Frame prediction, Optical flow, Historical data, Predict Sampling Kernel
  • Wavenet - Model for generating audio samples
  • PyTorch extension Apex for mixed-precision training
Session #3 - NLP Transfer Learning
Key Lessons
  • Making more general NLP Systems
  • Related tasks tend to help each other
  • Decanlp.com
NLP Projects
  • Question Answering
  • Machine Translation
  • Summarization
  • Sentiment Classification
  • Semantic Role Labeling
  • Semantic Parsing
  • Commonsense Reasoning
Techniques
  • Transfer Learning
  • Weight Sharing
  • Zero Shot Learning
  • Data Augmentation
  • Domain Adaptation
  • Multi-task learning
Approach
  • Seq2seq model
  • Classification, Extraction, Generation
  • Domain Adaptation
  • Some ZeroShot
Session #4 - Deep Universal Probabilistic Programming
Key Lessons
  • Pyro - Probabilistic Programming Language
  • Modern Bayesian ML methods
  • NN for modelling and inference
  • Universal, Scalable, Flexible and minimal
  • 3 Layer Architecture with Probabilistic Programming interface
  • Inference Algo on top of library
  • Stochastic Variational Inference 
To be continued from 00:55:00 rest of Session


Happy Mastering DL!!!

April 06, 2019

How I evaluate data science candidates

  • Different business problems solved and their ML lessons learned, Deep Dive on Implementation, Algo used, Features Evaluated
  • Data pipeline set up and challenges faced
  • How do you keep track of new papers / evaluating and learning different frameworks
  • How much do you code on a daily basis for work / personal learning
  • Ability to bring different perspective/techniques solving problems
The field is evolving on a daily basis. We need passionate, curious learners with an experimentation mindset!!!

April 05, 2019

Day #236 - Save Keras in Tensorflow pb format

This project was useful for conversion from Keras to Tensorflow pb format

Command
python keras_to_tensorflow.py --input_model="path/to/keras/model.h5" --output_model="path/to/save/model.pb"

Example
python keras_to_tensorflow.py --input_model="D:\\classification_3.h5" --output_model="D:\\classification_3.model.pb"

Happy Learning!!!

2.0 Lifestyle Skills

To survive we need a newer set of skills and a better awareness of ourselves
  • Building Culture of Learning
  • Training and Experimenting Mindset
  • Emotional and Communication Skills
  • Fail and Learn Mindset
  • Balance Attitude dealing with Depression, Life Struggles
Happy Finding Yourself!!!

Finding Great Candidates

  • Communicate at the simplest level
  • Value end-to-end experiments over certifications
  • Rely on passion, consistent learning, and good team players
  • Look for people who intend to make a change; consistent performance matters
  • Move away from puzzles and programs. A project or prototype that requires reasonable design, code, use cases, and an end-to-end implementation matters
  • Puzzles and programs can find a good coder, but end-to-end projects require more skills than just coding
  • People who share what they learn can change a culture more than people who work in silos
  • Great skills take years; passion for technology and for seeing how it evolves matters
Happy Learning!!!

April 04, 2019

April 03, 2019

Day #234 - NLP with Deep Learning | Winter 2019 | Lecture 1

Key Lessons
  • Get better in finding words that make them feel less alone
  • Writing is ability to communicate knowledge, Knowledge sent to places
  • Writing is 5000 years old
  • Meaning - Expression for Idea, Art, Writing
  • Use NLTK for Synonyms and Hypernyms
  • WordNet makes fine distinctions between senses of a word
  • Words represented as one hot vectors
  • Building word similarities tables to map to similar words
  • Dense Vector - Word Embeddings Representation
Word2vec

  • Framework for learning word vectors
  • Every word represented by vector
  • c - center word, o - context outside word
  • Calculate the probability
  • Similarity between words the orange part
  • Exp turns any positive or negative number into a positive number
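
The probability calculation can be sketched as a softmax over dot products: P(o | c) = exp(u_o . v_c) / sum over w of exp(u_w . v_c). The two-dimensional vectors and the tiny vocabulary below are made up for illustration; real word vectors are learned and have hundreds of dimensions.

```python
import math


def dot(u, v):
    """Vector dot product, used as the similarity between two word vectors."""
    return sum(a * b for a, b in zip(u, v))


def context_probabilities(center_vector, outside_vectors):
    """Softmax over dot products: P(o | c) for every outside word o."""
    # exp makes every score positive, so the values can be normalized.
    scores = {w: math.exp(dot(u, center_vector)) for w, u in outside_vectors.items()}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}


# Hypothetical vectors: v_c for the center word, u_w for each outside word.
v_center = [1.0, 0.0]
u_outside = {"juice": [2.0, 0.0], "the": [0.0, 1.0], "car": [-1.0, 0.0]}

probs = context_probabilities(v_center, u_outside)
print(probs)
```

The outside word whose vector points in the same direction as the center word gets most of the probability mass.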
Maths
  • Calculus Chain Rule
  • Vector Dot product
  • Multivariate calculus


Happy Mastering DL!!!

April 02, 2019

Day #233 - Tensorflow 2.0 notes

Summary of Notes
  • Adopted Keras for high level API, tf.keras
  • Common Pieces for - layers, models, optimizers
  • Keras - Pythonic and Easy to learn
  • For Larger Scale data, Estimators used - For Fault Tolerance
  • Estimators are powerful machines, All estimators moved to keras
  • 1.x - Session-based graph execution; 2.0 - no Session, eager mode by default
  • Graphs even in eager context
  • Eager execution is a way to train a Keras model without building a graph
  • One set of Optimizers, Fully Serializable
  • Losses consolidated into single set
  • RNN layers update in Tensorflow, Unified RNN layers
  • Tensorboard for Performance profiling, Model performance
  • tf.distribute.Strategy API - Designed to handle many distribution architectures (Multi-Gpu)
To Update
pip install -q tensorflow==2.0.0-alpha0



Happy Mastering DL!!!

Day #233 - Pytorch Examples

Happy Mastering DL!!!

Day #232 - Kafka + Spark Integration - Big Data Setup - Part I

Experimenting with Kafka and Spark using Pyspark

Example 1 - Kafka Publish - Consume
Example 2 - Kafka Publish - Spark Consume

Happy Learning!!!

April 01, 2019

Day #231 - Evaluating Existing Pytorch - ReId - Models

On Ubuntu System
  • Download Market 1501 Dataset - Link
  • Download Code from Link
  • Comment out CUDA references



  • Run the code



Happy Mastering DL!!!