Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): May 2019

May 30, 2019

Day #255 - Analytics Use Cases

DataRobot has a good list of use cases. On top of it, I wanted to add comments on the ML models and the data.

Ref - Link

Happy Mastering DL!!!

May 27, 2019

Day #254 - Upgrade OpenVino on Linux, Install System Studio

1. Follow steps in link

A few additional custom steps
Rename the existing /opt/intel to /opt/intel_bkp (run from /opt):
mv intel intel_bkp
General forms for renaming and removing a directory:
sudo mv old_dir new_dir_name
sudo rm -r directory_to_remove

File to set path of OpenVino
vi /home/ubuntu/.bashrc

pip install protobuf==3.6.1
pip install test-generator==0.1.1

Download from Link

System Studio Download Link

cd ~/Downloads/
mkdir intel_tools
unzip intel-sq-tools-installation-bundle-linux-linux.zip -d intel_tools

tar -xvzf system_studio.tar.gz

cd system_studio
sudo ./install.sh

5.1GB download

Post Installation - Run this script:
/opt/intel/system_studio_2019/iss_ide_eclipse-launcher.sh

First Project - Link

GC++ Hello World App

Run a Custom Model in Python

Happy Mastering DL!!!

May 22, 2019

Day #253 - My Backprop Notes

I had been looking for this post for some time. Backprop for me in one slide :)

Question #4 - The Bias (5 marks): We generally initialize the bias to random numbers larger than 0. Why? What happens if we initialize it to a value below zero? Does this affect our ability to train?

Answer -

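A small positive bias is the safe choice, and a bias of zero also works in practice as long as the weights are initialized randomly. A negative bias is the risky case: with an activation such as ReLU, the input to the unit can stay below zero, so its output is zero and, by the chain rule, every derivative that flows through it is zero as well. The unit then stops learning. With non-zero derivatives available, gradient descent can slowly move toward a local minimum of the loss.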
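To make the chain-rule point concrete, here is a minimal sketch for a single unit. The ReLU activation, the squared-error loss, and all the numbers are my assumptions for illustration; the question itself does not name them.

```python
def relu(z):
    return z if z > 0 else 0.0

def relu_grad(z):
    # Derivative of ReLU with respect to its input
    return 1.0 if z > 0 else 0.0

def bias_gradient(x, w, b, target):
    """Gradient of 0.5 * (relu(w*x + b) - target)**2 with respect to b."""
    z = w * x + b
    a = relu(z)
    # Chain rule: dL/db = dL/da * da/dz * dz/db, where dz/db = 1
    return (a - target) * relu_grad(z) * 1.0

# Same input and weight, only the bias differs (illustrative values)
grad_positive_bias = bias_gradient(x=1.0, w=0.5, b=0.1, target=1.0)
grad_negative_bias = bias_gradient(x=1.0, w=0.5, b=-2.0, target=1.0)

print("gradient with b = 0.1:", grad_positive_bias)
print("gradient with b = -2.0:", grad_negative_bias)
```

With the negative bias the pre-activation is below zero, so the ReLU derivative is zero and no update reaches the bias.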

Happy Mastering DL!!!

May 21, 2019

Day #252 - Data Science Skills

Data Science = Database + Insight + Business Acumen + Feature Engineering + Build your model + Deploy at Scale

Database = Load, Query, Aggregate, Find Max and Min, Group by different Dimensions
Insight = Turn the highs and lows in the numbers into "why" questions
Business Acumen = Finding the right use case that balances data availability and business expectations
Feature Engineering = Convert all your Insight Skills into feature variables
Build your model = Build the model and evaluate it
Deploy at Scale = Recommend Spark or whatever scalable option fits to productionize it
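As a small illustration of the database skill (load, aggregate, max, min, group by), here is a sketch using Python's built-in sqlite3 module. The sales table, its columns, and the rows are made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical sales table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "beer", 10.0),
        ("North", "whiskey", 40.0),
        ("South", "beer", 12.0),
        ("South", "beer", 8.0),
    ],
)

# Aggregate by one dimension: count, min, max and total per region
summary = conn.execute(
    "SELECT region, COUNT(*), MIN(amount), MAX(amount), SUM(amount) "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

for row in summary:
    print(row)
```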

These are my personal lessons from working on projects. Look at the big picture of data science. Learn and master every skill continuously. Learning never ends.

Happy Mastering Data Science!!!

May 17, 2019

Day #251 - Customer Churn Modelling in Simple terms

Scenario

Customer A buys 2 beers and 1 bottle of whiskey every Wednesday
If he does not buy on Wednesday, he comes on Friday to pick up 2 beers and 1 bottle of whiskey
If the customer is out of town, he does not buy

With ML, let's see

The customer usually returns within a minimum of 3 days and a maximum of 5 days
If the customer does not turn up after 5 days, we need to find the reason why
Is there another shop nearby where the customer prefers to buy?
Does the customer buy the same thing online at a discount?

We infer and quickly observe whether the pattern has changed to see if we are losing the customer. This is churn modelling in simple terms.
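The "no visit after 5 days" rule can be sketched as a simple gap check. This is only a rule-based illustration, not a trained model; the function names, the purchase dates, and the 5-day default are assumptions taken from the scenario.

```python
from datetime import date

def days_since_last_purchase(purchase_dates, today):
    return (today - max(purchase_dates)).days

def is_churn_risk(purchase_dates, today, max_gap_days=5):
    """Flag a customer whose current gap exceeds the usual maximum gap."""
    return days_since_last_purchase(purchase_dates, today) > max_gap_days

# Hypothetical purchase history for Customer A
purchases = [date(2019, 5, 1), date(2019, 5, 8), date(2019, 5, 15)]

risk_after_3_days = is_churn_risk(purchases, today=date(2019, 5, 18))
risk_after_10_days = is_churn_risk(purchases, today=date(2019, 5, 25))
print(risk_after_3_days, risk_after_10_days)
```

A flagged customer is the trigger to ask the "why" questions listed in the scenario.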

Happy Mastering Data Science!!!

Day #250 - Context and Preferences aware recommendations

It is very important to have a feedback loop to provide better recommendations

Basic Recommendation

Buy Product A
The usual ML recommendation is based on what is bought together (A, B, C) and sold together (A, C)

This is great, and everyone will do it. But if I dislike product A, how do we know?

Feedback based Recommendation

Buy Product A
Read the customer's low rating for A
Look at what people who don't like A bought
Recommend those instead of A

Returns based Recommendation

Customer buys Product A
Customer returns Product A
Look at what people who don't like A bought
Recommend those instead of A
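The feedback and returns flows share one step: find what people who disliked A bought, and recommend that instead. A minimal sketch of that step follows. The data structures and customer IDs are hypothetical, and a low rating and a return are both treated as a "dislike".

```python
from collections import Counter

def recommend_alternatives(disliked_product, purchases, dislikes, top_n=2):
    """Recommend what other customers who disliked the product bought instead."""
    counts = Counter()
    for customer, disliked in dislikes.items():
        if disliked_product in disliked:
            for product in purchases.get(customer, []):
                if product != disliked_product:
                    counts[product] += 1
    return [product for product, _ in counts.most_common(top_n)]

# Hypothetical purchase history and dislike signals (low ratings or returns)
purchases = {"c1": ["A", "D", "E"], "c2": ["A", "D"], "c3": ["A", "B", "C"]}
dislikes = {"c1": {"A"}, "c2": {"A"}, "c3": set()}

alternatives = recommend_alternatives("A", purchases, dislikes)
print(alternatives)
```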

Season based Recommendation
Take customer preferences and the brands picked up in each season into account in recommendations

Brand based Recommendations
Customer affinity for a certain brand because of size, variety, etc.

Happy Building Your Data Story!!!

May 14, 2019

Day #249 - Porting Base faster_rcnn_inception_v2_coco_2018_01_28 Model to OpenVino

This model was downloaded from link - http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz
OpenVINO lists this model as supported. This link contains the supported models: https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_TensorFlow.html#inpage-nav-2-1
Execute command in Python Console -

C:\Intel\openvino_2019.1.133\deployment_tools\model_optimizer> python mo_tf.py --data_type=FP32 --tensorflow_object_detection_api_pipeline_config "C:\faster_rcnn_inception_v2_coco_2018_01_28\pipeline.config" --tensorflow_use_custom_operations_config "C:\Intel\openvino_2019.1.133\deployment_tools\model_optimizer\extensions\front\tf\faster_rcnn_support.json" --input_model "C:\frozen_inference_graph.pb"

Porting to the OpenVINO format was successful. However, I did encounter other issues in custom training. I will share the learnings after finding a solution.

Happy Mastering DL!!!

Redirect output to log file Windows

Use > File_to_capture_logs to capture the output. Add 2>&1 to redirect both console and error logs to the same file.

Example
Test.bat > E:\Runout_log.txt 2>&1
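The same redirection can be done from Python when a script launches the job. A minimal sketch with the standard subprocess module; run_and_log is a helper name I made up, and the demo command just prints one line to each stream.

```python
import os
import subprocess
import sys
import tempfile

def run_and_log(command, log_path):
    """Run a command, sending stdout and stderr to the same log file."""
    with open(log_path, "w") as log_file:
        # stderr=subprocess.STDOUT plays the role of 2>&1
        completed = subprocess.run(command, stdout=log_file, stderr=subprocess.STDOUT)
    return completed.returncode

# Demo: a child Python process that writes to both stdout and stderr
demo_command = [
    sys.executable,
    "-c",
    "import sys; print('out'); print('err', file=sys.stderr)",
]
demo_log = os.path.join(tempfile.gettempdir(), "runout_log_demo.txt")
exit_code = run_and_log(demo_command, demo_log)

with open(demo_log) as handle:
    log_text = handle.read()
print(log_text)
```

On Windows, the batch file from the example could be passed as the command, for instance run_and_log(["cmd", "/c", "Test.bat"], r"E:\Runout_log.txt").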

Happy Learning!!!

May 11, 2019

Use AI with Caution

AI can predict, but it is no replacement for good customer service
Feature engineering means identifying insights from historical data and translating them into meaningful features
Data imbalance can be handled by upsampling, downsampling, or data augmentation, but remember that good data plus a basic model can beat data imbalance
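As an illustration of the upsampling option, here is a minimal sketch of random oversampling with the standard library only. The row format, the "churned" label, and the fixed seed are assumptions for the example. In practice, apply it to the training split only.

```python
import random

def upsample_minority(rows, label_key, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap to the largest class
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

# Hypothetical imbalanced data: 8 negatives, 2 positives
rows = [{"x": i, "churned": 0} for i in range(8)]
rows += [{"x": 100, "churned": 1}, {"x": 101, "churned": 1}]

balanced = upsample_minority(rows, "churned")
positives = sum(1 for row in balanced if row["churned"] == 1)
print(len(balanced), positives)
```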

Happy Learning!!!

May 10, 2019

Day #248 - Custom Yolo Detection Lessons on Windows

Two versions of Yolo: darkflow-master vs Darknet. Darknet is written in C and CUDA; Darkflow is YOLO on TensorFlow
Darknet annotations use .txt format
Darkflow annotations use xml format
Error in the model after custom training - #IOError: [Errno 2] No such file or directory: 'labels.txt'
Fix: edit the file D:\darkflow-master\darkflow-master\darkflow\defaults.py and provide the full paths of the labels and checkpoint files

How to train YOLOv3 on Google COLAB to detect custom objects (e.g: Gun detection)

Happy Mastering DL!!!

Day #247 - Yolo Compile Error - LINK : fatal error LNK1158: cannot run 'rc.exe'

While trying to build Yolo, I encountered the error - LINK : fatal error LNK1158: cannot run 'rc.exe'

Copying the two missing files from C:\Program Files (x86)\Windows Kits\8.1\bin\x86 fixed the issue.

Useful StackOverflow answer

Happy Mastering DL!!!

May 09, 2019

Day #246 - Updating OpenVino Latest Version Install - Windows

Build date - 24 Apr 2019

This was quick, as all the pre-requisites were done earlier

1. Extract and install to C:\Intel
2. Open the Anaconda shell and activate the environment
3. Run C:\Intel\openvino_2019.1.133\bin\setupvars.bat
4. Run the prerequisite install scripts in the folder - C:\Intel\openvino_2019.1.133\deployment_tools\model_optimizer\install_prerequisites
I tried install_prerequisites.bat and install_prerequisites_tf.bat

5. Go to the demo folder C:\Intel\openvino_2019.1.133\deployment_tools\demo and run demo_squeezenet_download_convert_run.bat

Happy Mastering DL!!!

May 06, 2019

AI - Key Papers - Timelines

Object detection (Redmon et al, 2015)
Image captioning - Neural Image Caption Generation with Visual Attention, 2015
Random walks in latent space - (Alex Radford, 2015)
Semantic segmentation (Long et al, 2015)
Auto-captioning (2015)
Autonomous cars (NVIDIA, 2016)
Future simulation - (Finn et al, 2016)
Neural machine translation - (Google’s Neural Machine Translation System, 2016)
Drug design and response prediction (Gomez-Bombarelli et al, 2016)
Impersonation by encoding-decoding an unknown face. - (Kamil Czarnogórski, 2016)
Image super-resolution - (Ledig et al, 2016)
Reinforcement learning (Mnih et al, 2014)
Segmentation (Hengshuang et al, 2017)
Pose estimation (Cao et al, 2017)
Music composition (NVIDIA, 2017)
Geometric matching (Rocco et al, 2017)
Instance segmentation (He et al, 2017)
Scene understanding - (Wu et al, 2017)
Transfer learning from synthetic to real images - (Inoue et al, 2017)
Strategy games (Deepmind, 2016-2018)
Speech synthesis and question answering (Google, 2018)
Image generation (Karras et al, 2018)
Real-time object detection (Redmon and Farhadi, 2018)

Sequence Problems

Sequence classification - Sentiment Analysis, Activity Recognition, DNA Sequence classification, action selection
Sequence Synthesis - Text Synthesis, Music Synthesis, Motion Synthesis
Sequence to Sequence Translation - Speech Recognition, Text Translation, POS tagging
Generative models - Image and content generation - DRAW: A Recurrent Neural Network For Image Generation, Pixel RNN, ALI

Happy Mastering DL!!!

Data story of Taxi Booking apps

This data story is based on my personal experience using taxi booking apps. I use both Ola and Uber. Below are some common observations. I have tried to outline the data flow, reporting use cases, and ML use cases involved, based on my understanding and usage.

Observations using App

On booking cab request we can see
Vehicle type and expected time
SLA to reach the destination
Real-time message processing, notifying, accepting and notifying rider and driver partner (stream processing, segmentation, notification, acceptances)
Display stats of driver during the trip

Pain points Observed

You book a trip at price x. The driver cancels the trip. When you book again, peak pricing is applied
In my personal experience, drivers are more comfortable with cash payments
Mileage-conscious drivers are reluctant to switch on the AC
Target driven. I have spoken to driver partners who drive non-stop for 24 hours to meet targets
I never had a great experience using cab pooling. It took me 2x the time in most cases, unless it was an odd hour

Data collected

Trip details
Passenger details
Fare details
Ratings of driver and passenger
Cab bookings at each location point
Find maximum long routes, maximum booking points location
Find Maximum booking time across airports, bus stop, Railway stations
Driver partner ride and earning details
Data available at the city level, Area level (Slide / Dice)
Review / Rating / Feedback on Cancellation

Data at customer Level

Trip Details by Each Customer. Expenditure at the customer level
Since location is shared they can identify Office, Home, Restaurants, Malls, Airports, Railway stations

ML use cases for customers / Booking

Segment people using services based on trip distance, number of trips, trip expense
Classify people as potential weekend travelers, shoppers, or stay-at-home persons
Recommend areas for peak pricing
Recommend timing for peak pricing
Recommend peak pricing with the highest probability of conversion (A/B testing)
Predict top 10 cab pickup points and order numbers considering historical data seasonality
Predict customer churn
Promotions based on segmenting customers (High Value, Medium, Low Spending Customers)
A lot of scope for vision apps and audio-based analytics: classic drowsiness detection, distraction detection, use of the mobile phone (custom object detection models)
NLP on Customer feedback / Sentiment Analysis
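The High / Medium / Low spending segments mentioned above can be sketched with a simple rank-based split. The customer IDs and amounts are made up, and splitting into equal thirds is my assumption. A real segmentation would likely also use trip distance and number of trips.

```python
def segment_by_spend(spend_by_customer):
    """Split customers into Low / Medium / High by rank of total spend (thirds)."""
    ranked = sorted(spend_by_customer, key=spend_by_customer.get)
    n = len(ranked)
    segments = {}
    for position, customer in enumerate(ranked):
        if position < n / 3:
            segments[customer] = "Low"
        elif position < 2 * n / 3:
            segments[customer] = "Medium"
        else:
            segments[customer] = "High"
    return segments

# Hypothetical total trip expense per customer
spend = {"c1": 120.0, "c2": 900.0, "c3": 450.0, "c4": 60.0, "c5": 2000.0, "c6": 300.0}
segments = segment_by_spend(spend)
print(segments)
```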

ML use cases for driver partner

Predict driver churn
Predict the number of trips for next week and set target accordingly
Predict the nearest area where the probability of a booking is higher for the driver partner
Predict Acceptance Rate for a Route based on Driver preferences derived from historical data

Promotions

Promotions and recommendations for eateries
Promotion for a pass for customers

Data collected from the vehicle (If it is fitted with sensors to collect data) - Car Manufacturers and Ride Sharing App Partnerships - 'Data Access' to understand

Access to data which can be used to build predictive models, deep learning models for training Autonomous driving decisions
Real-time data pipeline for sensors, devices, software, vision data for building models customized for Indian Conditions
Access to Components Utilization patterns for different vehicles running in different Regions / State
All this data will help in building Connected Cars, Training better models for better Data-Driven Decisions
Driving conditions vs vehicle performance in those road conditions

Other Factors / Emerging Competitors

Quick Ride has come up. It shares the same space as ride-sharing apps but targets a different segment. Quick Ride is more economical and predictable, with recurring rides.

Customers and driver partners would have an Android-based smartphone. Google has all the information available to offer a cab-sharing app as a social platform. If Google monetizes the sharing of traffic and congestion details, it could also generate significant revenue for the provider

Autonomous vehicles: robo-taxis are a distant dream for our country. If such a thing happens, I worry about alternate careers for driver partners. Change is the only permanent thing

Updated May 28, 2020

"In an end-to-end IoT-enabled transportation ecosystem, the information would flow seamlessly throughout the network creating an information value loop." Source @DeloitteInsight, Link https://t.co/vPV62egjU0
- Tech to Specialists (@Tech2Specialist), May 28, 2020

I have tried to outline certain data stories I observed while using taxi booking apps. Your comments and feedback are welcome!
