"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

April 30, 2019

Data Story Behind Food Delivery Apps

I use food delivery apps heavily, both Swiggy and UberEats. Here are my views and reflections on the data story, measures, and machine learning use cases behind these applications.

Based on my usage, UberEats highlights the following activities from the historical data it collects:
  • Previously ordered restaurants
  • Previously ordered items
  • Review-based listings
  • Projected estimated delivery times
I personally face challenges when trying to shift to a low-calorie diet, since the recommendations are tuned to past orders:
  • A similar item is recommended every day from other restaurants based on historical data
  • No option to set preferences for the coming week - a balanced diet customized to needs / preferences based on user choices for a week
  • Food quality issues persist no matter how good the review ratings are
I have worked on real-time systems and reporting, and have since moved to AI. We now have the tools to query data in motion, historical data, and future forecasts. This provides a complete end-to-end perspective for understanding the data and the numbers. Some of the metrics / measures below overlap across transactional data, historical data, and AI.

Key Metrics / Measures
  • Average Order delivery time at different times (Morning / Lunch / Evening / Holidays / Weekends)
  • Average order pickup time at different times
  • Order acceptance rate
  • Clicks/ conversions 
  • A/B experiment and conversions 
  • Payments type vs orders
  • Average menu browsing time
  • Frequently searched items across days / restaurants / seasons
  • Predict order delays using Traffic data
  • Peak sellers
  • Top customers 
  • Weekday trends
  • Top trends based on seasonality 
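
Most of these metrics are slices of an order fact table over time. A minimal pandas sketch for the first one - average delivery time by daypart - where the file and column names are assumptions for illustration:

import pandas as pd

# orders.csv is a hypothetical export with one row per order;
# the column names below are placeholders, not a real schema.
orders = pd.read_csv("orders.csv", parse_dates=["order_time", "delivered_time"])
orders["delivery_minutes"] = (orders["delivered_time"] - orders["order_time"]).dt.total_seconds() / 60

# Bucket each order into a daypart, then average delivery time per bucket
dayparts = pd.cut(orders["order_time"].dt.hour,
                  bins=[0, 6, 11, 16, 21, 24],
                  labels=["Night", "Morning", "Lunch", "Evening", "Dinner"],
                  right=False)
print(orders.groupby(dayparts)["delivery_minutes"].mean())
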
Data science use cases
  • Forecast on volumes of items based on historical data 
  • OCR, Recommendation at User Level / Sold together items
  • Deep learning for automated food classification, tagging
  • Segmenting customers based on Age / Gender / Veg / Non-Veg / Cuisine choices and providing recommendations
  • Forecast order volumes and assign delivery partners based on projected numbers to reduce delays
Everything that is measurable can be managed, monitored, and improved. More quality signals need to be integrated, since we take a risk in trusting ratings as a proxy for quality. I hope the quality bar keeps improving and this story evolves into a version that customizes around personal diet plans and choices. Happy finding the data story behind these food delivery apps!!!

April 28, 2019

Day #245 - Exploratory Data Analysis Goals

  • Find insights, turn them into "why" questions
  • Seek surprises (sudden peaks, lows), harness them into "how" questions
  • Plot data across different dimensions (Month / Year / Sales); find insights in every perspective
Learn the story behind the numbers!!!

Happy Mastering DL!!!

Innovation Session Notes

Today was interesting. I attended an innovation session - very inspiring and motivational. Thanks, Akshay Cherian.
  • Communicating to brain
  • Communicating with emotion
Different types of Questions
  • Know How
  • Know When
  • Know What
  • Know Why
Great Learnings / Habits
  • Listening at different levels
  • Insights learnt
  • Find Insights, Turn it into question
  • Seek surprises, harness them
  • Insights + States of Flow loop on each other
  • We perform when the work keeps us excited, not overwhelmed
  • Creativity comes from the presence of constraints
  • Create disproportionate value
  • Solve by doing, Comfortable with failure
  • It will hurt a bit if you are doing something meaningful
Books
Geography of Genius
States of Flow Assessment
Steps
  • List, Cluster, Reorder
  • Start Challenge vs Idea
  • Turn Top barriers into Questions
  • Rate Idea Valuable / Simple
  • Turn Ideas into Steps
  • Review Ideas
  • Create and Share output
Fresher Tips
  • Work for free until they see value
Challenges / Problems / Opportunities
  • Redefinition of problem 
  • Think from Celebration
  • Pissed off is better than passion
  • Treat it as a game
  • Breakthrough tools
  • Breakthrough environment
  • Don't define the problem with a single word
  • Meeting for Evolution not Evaluation of ideas
  • Reduce perceived Risk of Sharing
Futuristic
  • Don't operate from crisis to crisis
  • Anticipate and prepare for next-gen risks
  • Plan a budget for time and money
  • Manage time and invest time differently
  • Learn a combination of skills
Key Lessons
  • Just as we pay monthly bills, how much have you budgeted your time and money for learning, upskilling, or coding? This question felt like a slap in the face.
  • If someone is in a role you want to be in, look at their previous role and see what they did to move into it
  • If someone is successful, see the patterns and practices they follow; don't put it down to luck or motivation. Emulate them
Sharing my Earlier Work

Why it fails


Happy Learning!!!

April 27, 2019

Day #244 - Data Annotation Guidelines

  • Quality of Images and capturing significant traits like styles / shapes / colors
  • Object to annotate captured from nearest possible view (Best Possible Angle)
  • Consider the impact of poor background light / night shots and images taken from too far away. Discard low-quality images and tiny renditions of objects (these occur at the edges of images)
  • Handling partial objects
  • When annotating multiple classes, account for the class imbalance between them and fix it before training. Analyze the number of objects and their occurrence distribution to balance sampling (see the sketch after these guidelines)
  • Check how the dataset is impacted by daylight / night conditions and annotate / train / build the model accordingly
Guideline - The object being trained on should occupy the center of the frame, captured from the nearest and best possible view, so that the side / front or a reasonably good set of features like color / style is visible.
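
A minimal sketch for the class-imbalance check above, assuming labelImg-style Pascal VOC XML annotation files in a folder (the path is a placeholder):

import glob
import xml.etree.ElementTree as ET
from collections import Counter

# Count object occurrences per class across all annotation files.
# 'annotations/' is a placeholder; VOC XMLs store each box under
# an <object> tag with a <name> child.
counts = Counter()
for path in glob.glob("annotations/*.xml"):
    root = ET.parse(path).getroot()
    for obj in root.findall("object"):
        counts[obj.find("name").text] += 1

for label, n in counts.most_common():
    print(label, n)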

Good, quality data is as important as the model / approach we take.

Happy Mastering DL!!!

April 26, 2019

Day #243 - Retail Analytics Opportunities

Notes from Recently Attended Session
Instore Retail
  • Store Managers
  • Associates usage
  • Online / Offline Users
Customer Experience
  • Optimize layout
  • Improve promotion effectiveness
  • Shopper Journey Outcome
Store Performance
  • Segmentation 
  • Forecasting
Store Managers
  • People Person
  • Genuine interest in customers, supervising, nurturing ideas
  • People Skills, Sales Skills, Management Skills
Store Manager
  • Stocking
  • Delegate Activity
  • Customer Service
  • Sales projections
  • Readiness
As Required
  • Forecasting
  • Staffing
  • Hiring
  • Associate Performance
Challenges
  • Difficult Customers
  • Personal issues
  • Customer Expectations
  • Not get yelled at
Sale
  • Conversation that ends in a transaction
  • Know better, focus on training people
  • Product knowledge
Apple Selling Philosophy
  • A - Approach (Welcome Approach)
  • P - Probe - Needs
  • P - Provide Solutions
  • L - Listen Concerns / Issues
  • E - End with farewell / Invitation
Power Hours - highest-sales times / peak hours
  • Sales Split by hour
  • Sales Split by Week
  • Labour vs Traffic Approach
  • % of sales in that hour
Types of Retailers
  • Malls - Retailers - Pantaloon / Shoppers Shop
  • Individual Stores - Kirana Stores
  • Online Sales - Amazon / Flipkart
  • Chain of Stores - Pothys / Chennai Silks (Have own sourcing units) - Brand Conscious
Power Centers
  • Maximum Value Proposition
  • Competitive prices
  • Volume & competitive price
Factors to Setup Stores
  • Demographics
  • Income levels
  • Spending group
  • Frequency of spending
  • Family Area
  • Single People
  • Average basket size
  • Average shopper duration time
  • Cultural aspects
Models / Recommendations
  • Model for income group
  • Model for domain
  • Model for age group
Happy Retailing!!!

April 20, 2019

Artificial Intelligence (AI) Podcast - Lex Fridman

Excellent talk with deep technical conversation, principles, and thought process. Here are some of the questions and interesting lines I liked from the podcast:
  • AI Assisted driving for a safer and better world
  • Dream of Autopilot - Autonomy revolution
  • Design Choices - Instrument Cluster, Display, Sensor suites
  • Display - Health check on the vehicle's perception of reality
  • Inputs - Camera, Radar, Ultrasonics, GPS
  • Information rendered into vector space with lane lines, traffic lights
  • Vector space re-rendered on display for people to understand the system
  • Considered parts / uncertainties - road segmentation, vehicle detection, object detection, and other underlying techniques
  • Debug Views - Augmented Vision with boxes, labels, Visualizer vector space representation from all sensors
  • Technical Aspects, Neural Network, Data, Hardware to allocate resources
  • Data - Vast amounts (12 ultrasonic sensors, GPS, IMU), 400K cars on the road
  • The massive inflow of data
  • Full self-driving computer development in progress
  • Cameras at FULL frame rate, FULL Resolution 
  • Driving - Learn from Edge Cases
  • Autopilot disengagements - Aspects / Ideas
  • Take over for convenience / Optimal spline for traversing the intersection
  • Navigate complex intersection
  • Lane change based /freeway/highway interchange 
  • Automatically overtake slow cars
  • Exit freeway
  • Full Self Driving Computer in Production
  • Tesla is an appreciating asset
  • Navigate Parking Lots
  • Metric (Incidents per mile)
  • Assess the probability of a crash, injury, permanent injury, death
  • Video of faces/body
  • Moving from Elevator support to Automatic Elevator
  • Body Pose, Cognitive Load
  • Camera-based driver monitoring
  • Once the system is more reliable than a human, driver monitoring won't help much
  • Operational Design Domain
  • Instrument Cluster Display, Capabilities
  • Neural Net - Basic Bunch of Matrix Math
  • Learn both on valid and invalid data
  • What is a car
  • What is definitely not a car
  • Key ideas for Artificial General Intelligence
  • Tesla Goal - World's best Self Driving Vehicle
  • AI will convince you to fall in love with it

Happy Mastering DL!!!

April 16, 2019

Day #242 - Working with labelImg

Download labelImg from link

Key things 

  • Open Directory, Create Bounding Box, Label the object, Save the bounding box
  • XML will be generated with the coordinates
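
A minimal sketch of reading one generated XML back in Python - labelImg writes Pascal VOC format; the filename below is a placeholder:

import xml.etree.ElementTree as ET

# labelImg saves one Pascal VOC XML per image; 'image1.xml' is a placeholder.
root = ET.parse("image1.xml").getroot()
for obj in root.findall("object"):
    name = obj.find("name").text
    box = obj.find("bndbox")
    xmin, ymin = int(box.find("xmin").text), int(box.find("ymin").text)
    xmax, ymax = int(box.find("xmax").text), int(box.find("ymax").text)
    print(name, (xmin, ymin, xmax, ymax))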




Happy Mastering DL!!!

Day #241 - Tensorflow on CPU - Object Detection


Tensorflow on CPU
===================
Follow all steps in previous article,
#CUDA is not needed
Step 2 - Cleanup Tensorflow
=============================
#Had to manually go to the folder and remove all packages named tf, tensorflow, tensorboard
#C:\Users\XXXXXX\AppData\Local\Continuum\anaconda3\envs\tflow\Lib\site-packages
pip install --ignore-installed --upgrade tensorflow
conda install jupyter
conda install scipy
Step 3 - Custom Training
=========================
Go to https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
Download http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz
C:\Tensorflow1\models\research\object_detection\faster_rcnn_inception_v2_coco_2018_01_28
Download code from https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10
Extract to C:\Tensorflow1\models\research\object_detection\TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10-master
Replace in C:\Tensorflow1\models\research\object_detection
Delete file in C:\Tensorflow1\models\research\object_detection\training, C:\Tensorflow1\models\research\object_detection\inference_graph, C:\Tensorflow1\models\research\object_detection\images (test and training label xls files)
Step#4 - Command
==================
cd C:\Tensorflow1\models\research
protoc --python_out=. .\object_detection\protos\anchor_generator.proto .\object_detection\protos\argmax_matcher.proto .\object_detection\protos\bipartite_matcher.proto .\object_detection\protos\box_coder.proto .\object_detection\protos\box_predictor.proto .\object_detection\protos\eval.proto .\object_detection\protos\faster_rcnn.proto .\object_detection\protos\faster_rcnn_box_coder.proto .\object_detection\protos\grid_anchor_generator.proto .\object_detection\protos\hyperparams.proto .\object_detection\protos\image_resizer.proto .\object_detection\protos\input_reader.proto .\object_detection\protos\losses.proto .\object_detection\protos\matcher.proto .\object_detection\protos\mean_stddev_box_coder.proto .\object_detection\protos\model.proto .\object_detection\protos\optimizer.proto .\object_detection\protos\pipeline.proto .\object_detection\protos\post_processing.proto .\object_detection\protos\preprocessor.proto .\object_detection\protos\region_similarity_calculator.proto .\object_detection\protos\square_box_coder.proto .\object_detection\protos\ssd.proto .\object_detection\protos\ssd_anchor_generator.proto .\object_detection\protos\string_int_label_map.proto .\object_detection\protos\train.proto .\object_detection\protos\keypoint_box_coder.proto .\object_detection\protos\multiscale_anchor_generator.proto .\object_detection\protos\graph_rewriter.proto
python setup.py build
python setup.py install
cd C:\Tensorflow1\models\research\object_detection
jupyter notebook object_detection_tutorial.ipynb




Finally, I was able to train custom object detection.



Notes on Custom Object Detection (Notes - Link)
Step #1 - Define inputs - specify files in TFRecord file format
Step #2 - Configure train_config. Key values are:

  • Model parameter initialization.
  • Input preprocessing.
  • SGD parameters.

Step #3 - fine_tune_checkpoint should provide a path to a pre-existing checkpoint. To speed up training, it is recommended to re-use the feature extractor parameters from a pre-existing image classification or object detection checkpoint
Step #4 - SGD - hyperparameters for gradient descent
Step #5 - Evaluator config
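
For reference, a minimal sketch of the train_config block covered by Steps #2-#4, in the same pipeline format as the eval_config below. The paths and values are placeholders, not the exact tutorial config:

train_config: {
  batch_size: 1
  # Step #3 - re-use feature extractor weights from the downloaded checkpoint
  fine_tune_checkpoint: "C:/Tensorflow1/models/research/object_detection/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"
  # Step #4 - SGD hyperparameters
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
        }
      }
      momentum_optimizer_value: 0.9
    }
  }
  num_steps: 200000
}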

To get reasonable mAP@IoU scores for object detection API:

1. Try varying the Intersection over Union (IoU) threshold, e.g. 0.2-0.5, and see if you get an increase in average precision. You would have to modify the matching_iou_threshold parameter in object_detection/utils/object_detection_evaluation.py

2. Try different evaluator classes (the default one is EVAL_DEFAULT_METRIC = 'pascal_voc_detection_metrics'). If you are training on Open Image Dataset it makes sense to use open_images_V2_detection_metrics

3. Check your eval config file and increase the number of examples used in the evaluation set, e.g.

eval_config: {
  num_examples: 20000
  num_visualizations: 16
  min_score_threshold: 0.2
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 1
}

4. Train the object detector for more iterations
5. Check current mAP against reported metrics (e.g. COCO mAP@IoU=0.5)

Step by Step: Build Your Custom Real-Time Object Detector  - Link
Detectron2: Train an Instance Segmentation Model
Installing the Tensorflow Object Detection API

For custom object training, BMW has shared their open-source framework. It is a packaged version of the complete object detection setup (a good set of tools for YOLO / TensorFlow).


I haven't experimented with it yet; it looks like a good way to leverage the setup as a common tool. It was released a few months back, and I have been working in my Windows setup for a while.

Happy Mastering DL!!!

April 15, 2019

Day #240 - Setting up Tensorflow GPU on Windows 10

This post is based on the session link. I have made a few changes and updates to it.

Reference links - https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md
Step #1 Environment Creation
==============================
conda create -n tensorflowgpu python=3.6.3 anaconda
Step #2 - Activation
=====================
activate tensorflowgpu
Deactivation
==============
deactivate tensorflowgpu
Getting rid of it
===================
conda remove -n tensorflowgpu --all
conda env remove --name tensorflowgpu
conda info --envs
Step 3 - Package Setup
========================
pip install Cython
pip install contextlib2
pip install pillow
pip install lxml
pip install jupyter
pip install jupyter notebook
pip install matplotlib
pip install protobuf
pip install pycocotools
conda install git
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
pip install pandas
pip install opencv-python
pip install jupyter
conda install spyder
conda install tensorflow-gpu
pip install tensorboard==1.12.2
Step 4 - CUDA Setup
=====================
Install Visual Studio required components
https://docs.openvinotoolkit.org/latest/_docs_install_guides_installing_openvino_windows.html
Reinstall NSight for Visual Studio 2017
Step 5 - Model Setup
=======================
Download https://github.com/tensorflow/models and unzip to C:\Tensorflow1\models
cd C:\Tensorflow1\models\research
Step 6 - Setting up Object Detection
=======================================
for /f %i in ('dir /b object_detection\protos\*.proto') do protoc object_detection\protos\%i --python_out=.
Step 7 - Setting Config Paths
==================================
SET PYTHONPATH=C:\Tensorflow1\models\research\slim;C:\Tensorflow1\models;C:\Tensorflow1\models\research
SET PATH=%PATH%;%PYTHONPATH%
ECHO %PATH%
Command
cd C:\Tensorflow1\models\research
Execute Command
protoc --python_out=. .\object_detection\protos\anchor_generator.proto .\object_detection\protos\argmax_matcher.proto .\object_detection\protos\bipartite_matcher.proto .\object_detection\protos\box_coder.proto .\object_detection\protos\box_predictor.proto .\object_detection\protos\eval.proto .\object_detection\protos\faster_rcnn.proto .\object_detection\protos\faster_rcnn_box_coder.proto .\object_detection\protos\grid_anchor_generator.proto .\object_detection\protos\hyperparams.proto .\object_detection\protos\image_resizer.proto .\object_detection\protos\input_reader.proto .\object_detection\protos\losses.proto .\object_detection\protos\matcher.proto .\object_detection\protos\mean_stddev_box_coder.proto .\object_detection\protos\model.proto .\object_detection\protos\optimizer.proto .\object_detection\protos\pipeline.proto .\object_detection\protos\post_processing.proto .\object_detection\protos\preprocessor.proto .\object_detection\protos\region_similarity_calculator.proto .\object_detection\protos\square_box_coder.proto .\object_detection\protos\ssd.proto .\object_detection\protos\ssd_anchor_generator.proto .\object_detection\protos\string_int_label_map.proto .\object_detection\protos\train.proto .\object_detection\protos\keypoint_box_coder.proto .\object_detection\protos\multiscale_anchor_generator.proto .\object_detection\protos\graph_rewriter.proto
Step 8 - Demo
==============
cd C:\Tensorflow1\models\research\object_detection
jupyter notebook object_detection_tutorial.ipynb



Happy Mastering DL!!!

Day #239 - Home Depot Retail Data Science Cases

Key Lessons
  • 45% of online orders are picked up from stores
  • Data Science for better search, recommendation, personalization
  • Product search - similar product search with images
  • Personalization - bought together, sold together
  • Weather, Seasonality, Trends
  • Segmentation by product, division
  • Crowd behavior in-store (retail store-level analytics)
  • Relevancy of Search Engine

Happy Mastering DL!!!

April 12, 2019

Day #238 - Working with coco dataset

COCO - Common Objects in Context

Installation Steps https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md

Packages
pip install Cython
pip install contextlib2
pip install matplotlib
pip install pycocotools
pip install scikit-image
pip install --upgrade scikit-image

Demo Code (Minor Changes)
#https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoDemo.ipynb
#Windows
#Python 3+ Environment
import matplotlib.pyplot as plt
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
import numpy as np
from skimage import io
anntype = ['segm','bbox','keypoints']
anntype = anntype[1]
prefix = 'person_keypoints' if anntype =='keypoints' else 'instances'
print('Demo for %s'%anntype)
#Download Annotations from http://images.cocodataset.org/annotations/annotations_trainval2014.zip
#Download to location C:\datadir
#Initialize coco ground truth API
datadir = r'C:\datadir\annotations_trainval2014'
datatype = 'val2014'
annfile = '%s/annotations/%s_%s.json'%(datadir,prefix,datatype)
coco = COCO(annfile)
#display categories
cats = coco.loadCats(coco.getCatIds())
nms = [cat['name'] for cat in cats]
print('COCO categories: \n{}\n'.format(' '.join(nms)))
nms = [cat['supercategory'] for cat in cats]
print('COCO super categories: \n{}\n'.format(' '.join(nms)))
catids = coco.getCatIds(catNms=['person','dog','skateboard'])
imgids = coco.getImgIds(catIds=catids)
imgids = coco.getImgIds(imgIds=[324158])
img = coco.loadImgs(imgids[np.random.randint(0,len(imgids))])[0]
I = io.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()
plt.imshow(I)
plt.axis('off')
annids = coco.getAnnIds(imgIds=img['id'],catIds=catids,iscrowd=None)
anns = coco.loadAnns(annids)
coco.showAnns(anns)
plt.show()



Happy Mastering DL!!!

April 11, 2019

Day #237 - Working on Car Detection - OpenVino

Detailed steps are mentioned in https://github.com/intel-iot-devkit/smart-video-workshop/tree/master/object-detection
Part I
========
sudo mkdir -p /opt/intel/workshop/
sudo chown ubuntu.ubuntu -R /opt/intel/workshop/
cd /opt/intel/workshop/
git clone https://github.com/intel-iot-devkit/smart-video-workshop.git
export CV=/opt/intel/workshop/smart-video-workshop/
Part II
========
mkdir -p mobilenet-ssd/FP32
cd /opt/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer
cd /opt/intel/workshop/smart-video-workshop/object-detection
sudo ./downloader.py --name mobilenet-ssd
cd /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/
sudo python3 mo.py --input_model /opt/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_downloader/object_detection/common/mobilenet-ssd/caffe/mobilenet-ssd.caffemodel -o /opt/intel/workshop/smart-video-workshop/object-detection --scale 256 --mean_values [127,127,127]
cd /opt/intel/workshop/smart-video-workshop/object-detection
sudo apt-get install libgflags-dev
/home/ubuntu/code/open_model_zoo-master/demos/build/intel64/Release/lib
eval echo "~$ubuntu"
/opt/intel/computer_vision_sdk/inference_engine
Part III
=========
#Edit MakeFile Contents
all:
g++ -fPIE -O3 -o tutorial1 --std=c++11 main.cpp -I. \
-I/opt/intel/openvino/opencv/include/ \
-I/opt/intel/computer_vision_sdk/inference_engine/include/ \
-I/opt/intel/computer_vision_sdk/inference_engine/include/cpp \
-L/opt/intel/computer_vision_sdk/inference_engine/lib/intel64 -linference_engine -ldl -lpthread \
-L/opt/intel/openvino/computer_vision_sdk/opencv/lib -lopencv_core -lopencv_imgcodecs -lopencv_imgproc -lopencv_highgui -lopencv_videoio -lopencv_video -lgflags -I/opt/intel/computer_vision_sdk/inference_engine/include -I/opt/intel/computer_vision_sdk/inference_engine/samples/ -I./ -I/opt/intel/computer_vision_sdk/inference_engine/samples/common/format_reader/ -I/opt/intel/openvino/computer_vision_sdk/opencv/include -I/usr/local/include -I/opt/intel/computer_vision_sdk/inference_engine/samples/thirdparty/gflags/include -I/opt/intel/computer_vision_sdk/opencv/include -I/opt/intel/computer_vision_sdk/opencv/include/cpp -I/opt/intel/computer_vision_sdk/inference_engine/samples/extension -L/opt/intel/computer_vision_sdk/inference_engine/bin/intel64/Release/lib -L/opt/intel/computer_vision_sdk/inference_engine/lib/ubuntu_16.04/intel64 -L/opt/intel/workshop/smart-video-workshop/object-detection/lib -L/opt/intel/computer_vision_sdk/opencv/lib -ldl -linference_engine -lopencv_highgui -lopencv_core -lopencv_imgproc -lopencv_videoio -lopencv_imgcodecs -L/opt/intel/workshop/smart-video-workshop/object-detection/lib
make
./tutorial1 -i /home/ubuntu/code/Cars.mp4 -m /opt/intel/workshop/smart-video-workshop/object-detection/mobilenet-ssd.xml





Happy Mastering DL!!!

April 09, 2019

Day #236 - Papers on Person Re-Identification

Paper #1 - Camera Style Adaptation for Person Re-identification

Key Lessons
  • Person re-identification - given a query person, retrieve that person from multiple camera sources
  • Challenges - Resolution, Environment, Illumination
  • Camera Style Adaptation Approach - unsupervised, camera-invariant property
Techniques
  • Input image pairs are partitioned into three overlapping horizontal parts, which pass through a Siamese CNN model that learns their similarity using cosine distance
Paper #2 - Simple Online and Realtime Tracking with a Deep Association Metric
Techniques
  • Kalman filtering in image space and frame by frame
  • Kalman filter with constant velocity motion
Paper #3 - In Defense of the Triplet Loss for Person Re-Identification
Techniques
  • A plain CNN with a triplet loss 
Triplet Loss
Key Lessons
  • Look at the anchor, its distance to a positive example, and its distance to a negative example
  • 3 images at a time: Anchor, Positive, Negative
  • A / P / N triplets
  • d(A,P) + alpha <= d(A,N) - set a margin alpha to separate positive / negative pairs
  • L(A,P,N) = max(||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + alpha, 0)
  • Choosing triplets randomly
  • Map the training set into triplets
Example - Link1, Link2
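
A minimal PyTorch sketch of the triplet loss above (batch size, embedding size, and margin are illustrative):

import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, alpha=0.2):
    # L(A,P,N) = max(||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + alpha, 0)
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + alpha).mean()

# Illustrative embeddings: batch of 4, embedding size 128
a, p, n = torch.randn(4, 128), torch.randn(4, 128), torch.randn(4, 128)
print(triplet_loss(a, p, n))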



Happy Mastering DL!!!

Day #235 - PyTorch developer conference part 1

Session #1 - Engineering Practices for Software 2.0
Key Lessons
  • New Programming Paradigm for Neural Networks
  • SGD writes code in weights of neural network
  • Tune Dataset, Tune model architecture, Tune the optimization
  • NN in Tesla for Autopilot
Best Practices for 2.0 Stack
  • Test Driven Development Workflow - Test set manually created, clean, Carefully curated test set
  • CI Workflow - Automate build - Unit Tests - Automate Deployment
  • Dataset is part of code - Automate Neural Network Training Jobs - Compile into Weights - Automate Deployments
  • Timestamp your data
  • Mono-repos in practice
Session #2 - Applied Deep Learning
Key Lessons
  • Many Research Projects use PyTorch
  • Pytorch - Simple, Extensible, Fast
Projects
  • Deep Learning SuperSampling - New GPU, Realtime better graphics
  • NN for super resolution
  • DL for real time graphics
  • Inpainting. http://research.nvidia.com/inpainting
  • Image and Video Synthesis - https://github.com/NVIDIA/vid2vid, Create videos with temporal consistency
  • Frame prediction, Optical flow, Historical data, Predict Sampling Kernel
  • Wavenet - Model for generating audio samples
  • Pytorch extension Apex for mix precision training
Session #3 - NLP Transfer Learning
Key Lessons
  • Making more general NLP Systems
  • Related tasks tend to help each other
  • Decanlp.com
NLP Projects
  • Question Answering
  • Machine Translation
  • Summarization
  • Sentiment Classification
  • Semantic Role Labeling
  • Semantic Parsing
  • Commonsense Reasoning
Techniques
  • Transfer Learning
  • Weight Sharing
  • Zero Shot Learning
  • Data Augmentation
  • Domain Adaptation
  • Multi-task learning
Approach
  • Seq2seq model
  • Classification, Extraction, Generation
  • Domain Adaptation
  • Some ZeroShot
Session #4 - Deep Universal Probabilistic Programming
Key Lessons
  • Pyro - Probabilistic Programming Language
  • Modern Bayesian ML methods
  • NN for modelling and inference
  • Universal, Scalable, Flexible and minimal
  • 3-Layer Architecture with a Probabilistic Programming interface
  • Inference Algo on top of library
  • Stochastic Variational Inference 
To be continued from 00:55:00 for the rest of the session


Happy Mastering DL!!!

April 06, 2019

How I evaluate data science candidates

  • Different business problems solved and their ML lessons learned; deep dive on implementation, algorithms used, features evaluated
  • Data pipeline setup and challenges faced
  • How do you keep track of new papers, and how do you evaluate and learn different frameworks?
  • How much do you code on a daily basis for work / personal learning?
  • Ability to bring different perspectives / techniques to solving problems
The field is evolving on a daily basis. We need passionate, curious learners with an experimentation mindset!!!

April 05, 2019

Day #236 - Save a Keras model in Tensorflow pb format

This project was useful for conversion from Keras to Tensorflow pb format

Command
python keras_to_tensorflow.py --input_model="path/to/keras/model.h5" --output_model="path/to/save/model.pb"

Example
python keras_to_tensorflow.py --input_model="D:\\classification_3.h5" --output_model="D:\\classification_3.model.pb"
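
To sanity-check the exported .pb, here is a minimal loading sketch using the TF 1.x graph API (the path reuses the example above):

import tensorflow as tf

# Load the frozen graph and list a few of its operations (TF 1.x API)
with tf.gfile.GFile("D:\\classification_3.model.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
    for op in graph.get_operations()[:5]:
        print(op.name)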

Happy Learning!!!

2.0 Lifestyle Skills

To survive, we need a newer set of skills and better self-awareness:
  • Building Culture of Learning
  • Training and Experimenting Mindset
  • Emotional and Communication Skills
  • Fail and Learn Mindset
  • Balance Attitude dealing with Depression, Life Struggles
Happy Finding Yourself!!!

Finding Great Candidates

  • Communicate at the simplest level
  • Create end to end experiments than certifications
  • Rely on passion, Consistent learning and good team players
  • Look for people who intend to make a change, consistent performance matters 
  • Move beyond puzzles and programs; a project or prototype that requires reasonable design, code, use cases, and an end-to-end implementation matters
  • Puzzles and programs can find a good coder, but end-to-end projects require more skills than just coding
  • People who share what they learn can drive a change in culture more than people who work in silos
  • Great skills take years; being passionate about technology and watching how it evolves matters
Happy Learning!!!

April 04, 2019

Day #235 - Audio Analysis

#pip install librosa
#pip install python_speech_features
#librosa with python_speech_analysis
#Credits - https://github.com/librosa/librosa/issues/573
import librosa
import python_speech_features
from scipy.signal.windows import hann
n_mfcc = 13
n_mels = 40
n_fft = 512 # in librosa, win_length is assumed to be equal to n_fft implicitly
hop_length = 160
fmin = 0
fmax = None
#https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html
# y - Audio Time series
# sr - Sampling Rate
y, sr = librosa.load(r'E:\Audio_Analytics\test_data\1_street_music.wav')
#sr = 16000 # fake sample rate just to make the point
# librosa
#n_mfcc: int > 0 [scalar], number of MFCCs to return
mfcc_librosa = librosa.feature.mfcc(y=y, sr=sr, n_fft=n_fft,
                                    n_mfcc=n_mfcc, n_mels=n_mels,
                                    hop_length=hop_length,
                                    fmin=fmin, fmax=fmax)
#https://python-speech-features.readthedocs.io/en/latest/
# python_speech_features
# no preemph nor ceplifter in librosa, so setting to zero
# librosa default stft window is hann
#winlen – the length of the analysis window in seconds. Default is 0.025s (25 milliseconds)
#winstep – the step between successive windows in seconds. Default is 0.01s (10 milliseconds)
#nfilt – the number of filters in the filterbank, default 26.
#Returns: A numpy array of size (NUMFRAMES by numcep) containing features. Each row holds 1 feature vector.
mfcc_speech = python_speech_features.mfcc(signal=y, samplerate=sr, winlen=n_fft / sr, winstep=hop_length / sr,
                                          numcep=n_mfcc, nfilt=n_mels, nfft=n_fft, lowfreq=fmin, highfreq=fmax,
                                          preemph=0, ceplifter=0, appendEnergy=False, winfunc=hann)
print(list(mfcc_librosa[:, 0]))
print(list(mfcc_speech[0, :]))

Happy Mastering DL!!!

April 03, 2019

Day #234 - NLP with Deep Learning | Winter 2019 | Lecture 1

Key Lessons
  • Get better in finding words that make them feel less alone
  • Writing is the ability to communicate knowledge; knowledge can be sent to other places and times
  • Writing is 5000 years old
  • Meaning - Expression for Idea, Art, Writing
  • Use NLTK for Synonyms and Hypernyms
  • Wordnet fine distinction between senses of word
  • Words represented as one hot vectors
  • Building word similarities tables to map to similar words
  • Dense Vector - Word Embeddings Representation
Word2vec

  • Framework for learning word vectors
  • Every word is represented by a vector
  • c - center word, o - outside (context) word
  • Calculate the probability of outside words given the center word
  • Similarity between words is captured by the dot product (the orange part in the lecture slides)
  • Exp turns any positive or negative score into a positive number
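
A minimal numpy sketch of the word2vec probability, P(o|c) = exp(u_o . v_c) / sum_w exp(u_w . v_c), with random placeholder vectors:

import numpy as np

np.random.seed(0)
U = np.random.normal(size=(5, 8))  # outside (context) vectors u_w for a toy 5-word vocabulary
v_c = np.random.normal(size=8)     # center word vector v_c

# Softmax over the vocabulary: exp makes every score positive,
# and normalizing turns the scores into probabilities
scores = U.dot(v_c)
probs = np.exp(scores) / np.exp(scores).sum()
print(probs, probs.sum())  # the probabilities sum to 1
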
Maths
  • Calculus Chain Rule
  • Vector Dot product
  • Multivariate calculus


Happy Mastering DL!!!

April 02, 2019

Day #233 - Tensorflow 2.0 notes

Summary of Notes
  • Adopted Keras for high level API, tf.keras
  • Common pieces - layers, models, optimizers
  • Keras - Pythonic and easy to learn
  • For larger-scale data, Estimators are used - for fault tolerance
  • Estimators are powerful machinery; all Estimators moved to Keras
  • 1.0 - Sessions; 2.0 - no Session, eager mode
  • Graphs exist even in an eager context
  • Eager execution is a way to train a Keras model without building a graph
  • One set of optimizers, fully serializable
  • Losses consolidated into a single set
  • RNN layers updated in Tensorflow - unified RNN layers
  • Tensorboard for performance profiling, model performance
  • tf.distribute.Strategy API - designed to handle many distribution architectures (multi-GPU)
To Update
pip install -q tensorflow==2.0.0-alpha0
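
A minimal tf.keras sketch in the eager style described above (toy data, purely illustrative):

import numpy as np
import tensorflow as tf

# Toy regression data: the target is the sum of the 4 inputs
x = np.random.rand(100, 4).astype("float32")
y = x.sum(axis=1, keepdims=True)

# Keras as the high-level API: layers, model, optimizer in one place
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(0.01), loss="mse")
model.fit(x, y, epochs=5, verbose=0)
print(model.predict(x[:2]))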



Happy Mastering DL!!!

Day #233 - Pytorch Examples

#Credits - https://github.com/hunkim/PyTorchZeroToAll/blob/master/05_linear_regression.py
import torch
from torch.autograd import Variable
x_data = Variable(torch.Tensor([[1.0],[2.0],[3.0]]))
y_data = Variable(torch.Tensor([[2.0],[4.0],[6.0]]))
class Model(torch.nn.Module):
    def __init__(self):
        # Constructor
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(1, 1)  # One in and one out

    def forward(self, x):
        # Variable of input data in, Variable of predictions out
        y_pred = self.linear(x)
        return y_pred
#Initialize model
model = Model()
#Loss Function and Optimizer
criterion = torch.nn.MSELoss(size_average=False)
optimizer = torch.optim.SGD(model.parameters(),lr=0.01)
total_loss = 0
#Training Loop
for epoch in range(500):
    y_pred = model(x_data)
    # Compute and print loss
    loss = criterion(y_pred, y_data)
    print(epoch, loss.item())
    # Zero gradients, backpropagate, update weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
#After training
hour_var = Variable(torch.Tensor([[4.0]]))
y_pred = model(hour_var)
print('Predict after training',4,model(hour_var).data[0][0])
#Credits - https://www.youtube.com/watch?v=wbJJudn-Xn0&list=PLX5lD3sNR32CELTjbVRNMUCUakEO1Lu0H&index=2
#https://github.com/hunkim/PyTorchZeroToAll/blob/master/10_1_cnn_mnist.py
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from torch.autograd import Variable
import torch.optim as optim
#Load our dataset
train_dataset = datasets.MNIST(root=r'C:\Intel\Data',train=True,transform=transforms.ToTensor(),download=True)
test_dataset = datasets.MNIST(root=r'C:\Intel\Data',train=False,transform=transforms.ToTensor(),download=True)
batch_size=100
epochs = 10
#Dataset Iterable
train_load = torch.utils.data.DataLoader(dataset = train_dataset, batch_size = batch_size, shuffle = True)
test_load = torch.utils.data.DataLoader(dataset = test_dataset, batch_size = batch_size, shuffle = False)
print(len(train_dataset))
print(len(test_dataset))
print(len(train_load))
print(len(test_load))
#model class
class NET(nn.Module):
    def __init__(self):
        super(NET, self).__init__()
        # First layer
        # Grey - one channel
        # Same padding: input size = output size
        # Same padding = (filter - 1) / 2
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, stride=1, padding=1)
        self.batchnorm1 = nn.BatchNorm2d(8)
        # Relu
        self.relu = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(kernel_size=2)
        # After max pool, feature map 28/2 = 14
        self.cnn2 = nn.Conv2d(in_channels=8, out_channels=32, kernel_size=5, stride=1, padding=2)
        # Output remains 14
        self.batchnorm2 = nn.BatchNorm2d(32)
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)
        # Feature map = 14/2 = 7
        # 32*7*7 = 1568
        # (Input + output) / 2
        # Arbitrary to choose
        self.fc1 = nn.Linear(in_features=1568, out_features=600)
        # Randomly disables some neurons
        # Probability of dropout 0.5
        self.dropout = nn.Dropout(p=0.5)
        self.fc2 = nn.Linear(in_features=600, out_features=10)

    def forward(self, x):
        out = self.cnn1(x)
        out = self.batchnorm1(out)
        out = self.relu(out)
        out = self.maxpool1(out)
        out = self.cnn2(out)
        out = self.batchnorm2(out)
        out = self.relu(out)
        out = self.maxpool2(out)
        # (batch size, 1568)
        # 100 x 1568
        out = out.view(-1, 1568)
        out = self.fc1(out)
        out = self.relu(out)
        out = self.dropout(out)
        out = self.fc2(out)
        return out
model = NET()
print(model)
loss_fn = nn.CrossEntropyLoss()
#Optimizers
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
iteration = 0
correct_nodata = 0
correct_data = 0
#Run for One iteration and check
for i, (inputs, labels) in enumerate(train_load):
    if iteration == 1:
        break
    inputs = Variable(inputs)
    labels = Variable(labels)
    print("for one iteration, this is what happens:")
    print('Input Shape:', inputs.shape)
    print('Labels Shape:', labels.shape)
    output = model(inputs)
    print('Outputs Shape:', output.shape)
    _, predicted_nodata = torch.max(output, 1)
    print('predicted shape', predicted_nodata.shape)
    print('predicted tensor', predicted_nodata)
    correct_nodata += (predicted_nodata == labels).sum()
    print('correct predictions', correct_nodata)
    _, predicted_data = torch.max(output, 1)
    print('predicted shape', predicted_data.shape)
    correct_data += (predicted_data == labels).sum()
    print('predicted tensor', predicted_data)
    print('correct predictions', correct_data)
    iteration += 1
iter = 0
for epoch in range(epochs):
    for i, (images, labels) in enumerate(train_load):
        iter += 1
        images = Variable(images)
        labels = Variable(labels)
        optimizer.zero_grad()
        outputs = model(images)
        loss = loss_fn(outputs, labels)
        loss.backward()
        optimizer.step()
        # Test the model every 100 iterations
        if (i + 1) % 100 == 0:
            correct = 0
            total = 0
            for images, labels in test_load:
                images = Variable(images)
                output = model(images)
                # Use the test output here, not the stale training output
                _, predicted = torch.max(output.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum()
            print('total', total)
            print('correct', correct)
            accuracy = 100. * float(correct) / float(total)
            # print('iteration:{}, train loss: {}, test accuracy: {}%'.format(iter, loss.item(), accuracy))
            print('Iteration')
            print(iter)
            print('Loss')
            print(loss.item())
            print('Accuracy')
            print(accuracy)
print('Done!')
Happy Mastering DL!!!

Day #232 - Kafka + Spark Integration - Big Data Setup - Part I

Experimenting with Kafka and Spark using Pyspark

Example 1 - Kafka Publish - Consume
#KAFKA Producer
#Function Definition
from kafka import KafkaConsumer, KafkaProducer
def connect_kafka_producer():
    _producer = None
    try:
        _producer = KafkaProducer(bootstrap_servers=['ip-XX-XX-XX-XX:9092'], api_version=(0, 10))
    except Exception as ex:
        print('Exception while connecting Kafka')
        print(str(ex))
    finally:
        return _producer
#Kafka Publish message function
#Function Definition
def publish_message(producer_instance, topic_name, key, value):
    try:
        key_bytes = bytes(key, encoding='utf-8')
        value_bytes = bytes(value, encoding='utf-8')
        producer_instance.send(topic_name, key=key_bytes, value=value_bytes)
        producer_instance.flush()
        print('Cab Booking Request published successfully.')
    except Exception as ex:
        print('Exception in publishing message')
        print(str(ex))
#Producer and send messages for a topic
#Function Invocation
kafka_producer = connect_kafka_producer()
for i in range(1, 10):
    for j in range(1, 10):
        print(i)
        print(j)
        message = str(i) + ',' + str(j)
        print(message)
        publish_message(kafka_producer, 'cab_request', 'UberGo', message)
#Read messages from topic
from kafka import KafkaConsumer
consumer = KafkaConsumer(bootstrap_servers='ip-XX-XX-XX-XX:9092', auto_offset_reset='earliest')
consumer.subscribe(['cab_request'])
print(consumer.partitions_for_topic('cab_request'))
for message in consumer:
    print(message)
Example 2 - Kafka Publish - Spark Consume
#KAFKA Producer
#Function Definition
from kafka import KafkaConsumer, KafkaProducer
def connect_kafka_producer():
    _producer = None
    try:
        _producer = KafkaProducer(bootstrap_servers=['ip-XX-XX-XX-XX:9092'], api_version=(0, 10))
    except Exception as ex:
        print('Exception while connecting Kafka')
        print(str(ex))
    finally:
        return _producer
#Kafka Publish message function
#Function Definition
def publish_message(producer_instance, topic_name, key, value):
    try:
        key_bytes = bytes(key, encoding='utf-8')
        value_bytes = bytes(value, encoding='utf-8')
        producer_instance.send(topic_name, key=key_bytes, value=value_bytes)
        producer_instance.flush()
        print('Cab Booking Request published successfully.')
    except Exception as ex:
        print('Exception in publishing message')
        print(str(ex))
#Producer and send messages for a topic
#Function Invocation
kafka_producer = connect_kafka_producer()
for i in range(1, 10):
    for j in range(1, 10):
        print(i)
        print(j)
        message = 'Publish to spark from kafka' + str(i) + ',' + str(j)
        print(message)
        publish_message(kafka_producer, 'cab_request', 'UberGo', message)
import os
import time
import sys
os.environ['PYSPARK_DRIVER_PYTHON'] = '/usr/bin/python3.6'
os.environ['PYSPARK_PYTHON'] = '/usr/bin/python3.6'
from pyspark import SparkContext, SparkConf
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
from pyspark.sql import SparkSession
from pyspark.streaming.kafka import TopicAndPartition
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.3.1 pyspark-shell'
conf = SparkConf().setAppName("Kafka-Spark")
sc = SparkContext.getOrCreate(conf)
#2 batches
stream=StreamingContext(sc,2)
print(sc.version)
kafkaBrokers = {"metadata.broker.list": "ip-XX-XX-XX-XX:9092"}
topic = "cab_request"
kafkastream = KafkaUtils.createDirectStream(stream, [topic],kafkaBrokers)
lines = kafkastream.map(lambda x: x[0])
messagedata = kafkastream.map(lambda x: x[1])
lines.pprint()
messagedata.pprint()
stream.start()
stream.awaitTermination()

Happy Learning!!!

April 01, 2019

Day #231 - Evaluating Existing Pytorch - ReId - Models

On Ubuntu System
  • Download Market 1501 Dataset - Link
  • Download Code from Link
  • Comment out CUDA references
  • Run the code

Happy Mastering DL!!!