"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

February 21, 2019

SVD Summary

Recommendations

Happy Learning!!!

Analysis of MIT Deep Learning Projects

I spent some time analyzing the MIT Deep Learning projects. Very inspiring, especially the healthcare projects. Broad categories across different domains. A good read to learn the use cases and architectures.


Updated link

Happy Mastering DL!!!

Segmentation of Data Scientists

Data Scientists from the stats world - This cluster has PhDs who have been working in vision and analytics since the early 2000s. Conversations with them were useful for handcrafting features for image processing problems. They know the algorithms, the basic math involved, the intuition, and the limitations of each technique.

Data Scientists with domain expertise - Lateral hires who upskilled with data science skills; the data science practitioner world. They have the ability to bridge the domain and data science use cases. Their strength lies in identifying data, building the pipeline, and envisioning the end-to-end flow.

Rookies - With MOOCs (Coursera, Udemy) and online sessions, data science has a lot of visibility and attention as an entry-level career choice. A lot of entry-level folks are getting deeper into building models, and getting good at model building and feature engineering.

Kaggle Experts - The go-to people for feature engineering, parameter tuning, experimenting with models, applying ensemble techniques, and building the most accurate models from anonymized data.

My journey has been through databases, BI, and analytics. I use databases primarily for data analysis; the BI perspective helps me understand data in its business context, and domain knowledge helps me quickly extract key data and build models. All this experience helps in finding use cases, building features for data models, building the model, and selling it to the business. I am still getting better at the *selling part*. I keep learning from my interactions with all the segments of Data Scientists.

Updated - 2022 - Feb 21


Ref - Link


Happy Mastering DL!!!


February 14, 2019

Day #212 - OpenCV based Object Tracking Learnings
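
The original snippet isn't reproduced above, so here is a minimal sketch of single-object tracking with OpenCV's tracker API. The KCF tracker choice, the input file video.mp4, and the window names are my own assumptions; the tracker factory function requires opencv-contrib-python and its location varies across OpenCV versions.

```python
import cv2

# Hypothetical input video; a webcam or any file source works the same way
cap = cv2.VideoCapture("video.mp4")
ok, frame = cap.read()

# Draw the initial bounding box around the object to track
bbox = cv2.selectROI("Select Object", frame, fromCenter=False)
tracker = cv2.TrackerKCF_create()  # KCF: a fast correlation-filter tracker
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, bbox = tracker.update(frame)  # success flag + updated box
    if ok:
        x, y, w, h = [int(v) for v in bbox]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("Tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```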


Happy Mastering DL!!!

Voice Powered SQL Assistant

SQLBot - I am your query assistant. What do you want me to do?
User - I want a query to join a few tables

SQLBot - Tell me the tables
User - The Employee, Payment, and JobDetails tables

SQLBot - Based on my analysis, these are the join columns: EmployeeId for Employee-JobDetails, and JobId for JobDetails-Payment
User - Give me the query

SQLBot - There are four indexes available. Which indexes do you want me to use? Any inputs?
User - Give me the best possible query

SQLBot - I tried this query on 10K records and it took 2.3 seconds. Is that fine? Do you want me to populate 100K records and try again?
User - I will do it in the next sprint. Until then this is fine

SQLBot - Thank you. A small stat - other users who worked on a similar query spent 40% more time analyzing it than you did
User - Time to go, bye. Check in the code

Use Technology to add value on top of human intelligence :)

Happy Learning!!!

February 12, 2019

Day #210 - NLP Coding Snippets

Samples on entity extraction, keyword extraction, and sentiment analysis for evaluating sentences.
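
The original samples aren't reproduced above; a minimal sketch of the three tasks, assuming spaCy (with the en_core_web_sm model installed) for entities and keyword-ish extraction, and TextBlob for sentiment:

```python
import spacy
from textblob import TextBlob

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Entity extraction: each entity carries a text span and a label
for ent in doc.ents:
    print(ent.text, ent.label_)

# Keyword extraction (rough proxy): noun chunks
print([chunk.text for chunk in doc.noun_chunks])

# Sentiment analysis: polarity in [-1, 1], subjectivity in [0, 1]
print(TextBlob("The movie was surprisingly good.").sentiment)
```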

Happy Mastering DL!!!

February 11, 2019

Day #209 - Pandas DateTime Coding Snippets

Lessons from working with Pandas DateTime columns.
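
The original snippet isn't reproduced above, so here is a minimal sketch of common DateTime operations (the column names are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"order_date": ["2019-02-01", "2019-02-11", "2019-03-05"],
                   "amount": [120, 80, 200]})

# Parse strings into datetime64 so date arithmetic works
df["order_date"] = pd.to_datetime(df["order_date"])

# The .dt accessor exposes the parts of a datetime column
df["year"] = df["order_date"].dt.year
df["month"] = df["order_date"].dt.month
df["weekday"] = df["order_date"].dt.day_name()

# Date arithmetic: days elapsed relative to a reference date
df["days_since"] = (pd.Timestamp("2019-02-11") - df["order_date"]).dt.days

# Filter rows by a date range
feb = df[(df["order_date"] >= "2019-02-01") & (df["order_date"] < "2019-03-01")]
print(feb)
```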



Happy Mastering DL!!!

Deep Life

Life is a form of reinforcement learning. I believe the growth-oriented mindset reflects reinforcement learning. Learn the lesson when you fail; re-apply the lesson when you succeed. Add a bit of randomness to explore newer, unexplored territories. Keep Learning!!! #ArtificialIntelligence #DeepLearning #rl

February 10, 2019

Day #208 - OpenAI - Spinning Up in Deep RL Workshop - Part 1

Key Lessons
  • AGI - Artificial General Intelligence
  • Systems that can do the most economically valuable work
  • Deep Reinforcement Learning trains deep networks by trial and error
  • Deep networks act as function approximators


Reinforcement Learning
  • Good for Sequential Learning
  • Good when we do not know optimal behavior
  • RL is useful when evaluating behaviors is easier than generating them
Deep Learning
  • Good for High Dimensional Data
  • Approximate a function
Deep RL
  • Video Games
  • For Decision Rules 


Recap of DL Patterns
  • Finding a model that gives the right output for given inputs
  • The output of each layer is a re-arrangement of its input with a non-linearity applied
  • The loss function is differentiable with respect to the model parameters
  • Compute how the loss changes with respect to changes in the parameters
  • Function composition is the core of the model
  • Function topology with multiple architectures
  • The non-linearity does a lot of the work
  • Successive layers represent more complex features
  • LSTM (RNN) - Accepts a time series of inputs and emits a time series of outputs
  • Transformer - Allows the network to attend over several inputs
  • Attention networks - Select the most meaningful details from the data; make decisions based on a lot of data
  • Regularizers - Trade off the loss against something that is not dependent on the task; they do a better job at generalization
  • Adaptive optimizers

Formulate RL Problem
  • An agent interacts with an environment
  • The agent picks and executes actions
  • With the new environment state, the agent proceeds further
  • Which decisions maximize reward?
  • The goal is attained through trial and error

Observations And Actions
  • Observations are continuous
  • Actions may be discrete or continuous

Policy
  • Stochastic (sample actions randomly from a distribution)
  • Deterministic (map state directly to action, no randomness)
  • Randomness is helpful for exploration
  • Logits (unnormalized scores for each action)
  • Action probabilities come from a softmax over the logits
Trajectory - Sequence of states and actions in an environment

Reward function - A measure of how good or bad a state/action is; more positive is better


Value functions - How much reward we expect to get from a state
Value functions satisfy the Bellman equation (see below)
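
For reference, the standard form of the Bellman equation for the on-policy value function (my notation, not copied from the workshop slides):

$$ V^{\pi}(s) = \mathbb{E}_{a \sim \pi,\ s' \sim P}\left[ r(s, a) + \gamma\, V^{\pi}(s') \right] $$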


Types of RL Algorithms
Model-free vs. model-based, depending on whether the agent learns a model of the environment


Try - Evaluate - Improve the policy
Policy Optimization


  • Run the policy over complete trajectories
  • Represent the policy with a neural network
Derive the Policy Gradient
  • The parameters are in the distribution
  • Bring the gradient inside the expectation
  • The expectation is over trajectories

  • The starting state is drawn from some distribution
  • Markov property - the next state depends only on the current state, not on previous states

  • Every action gets some update
  • Reward-to-go policy gradient (a sketch follows this list)

  • Advantage functions
  • How much better an action is than average

  • N-step advantage estimates

  • Initial weight assignments do matter while setting up the system
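
A minimal PyTorch sketch of the reward-to-go policy gradient referenced above; this is my own illustration rather than the workshop's code, and logits_net, obs, actions, and rewards are assumed inputs. The estimator is $\nabla_\theta J = \mathbb{E}_{\tau}\left[\sum_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, \hat{R}_t\right]$, with $\hat{R}_t$ the reward-to-go.

```python
import torch
from torch.distributions import Categorical

def reward_to_go(rewards, gamma=0.99):
    # Discounted sum of *future* rewards for each timestep
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        rtg.append(running)
    return list(reversed(rtg))

def pg_loss(logits_net, obs, actions, rewards):
    # Surrogate loss whose gradient is the policy gradient estimate:
    # -mean( log pi(a_t | s_t) * reward-to-go_t )
    logits = logits_net(obs)                             # (T, n_actions)
    logp = Categorical(logits=logits).log_prob(actions)  # (T,)
    weights = torch.as_tensor(reward_to_go(rewards), dtype=torch.float32)
    return -(logp * weights).mean()
```
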
Next is Part II


Happy Mastering DL!!!

February 07, 2019

Startup Idea - Incentivised Learning for Kids - Paid Online Courses

Taking a cue from cabs and food delivery, we can apply the same approach to online classrooms / teaching kids / MOOC courses.

When students buy a course, a certain portion of the fee can be used to surprise them with desserts / a good meal / toys or something that would motivate them on a daily basis. "If I complete this I might get an ice cream, so I will do it." (A little bribe to read.)

Setting up a customer base (cabs) - OLA and Uber rides initially had heavy discounts for users and incentives for cab drivers. I remember applying coupons heavily and the satisfaction of saving X% on every ride.
Stabilization phase - Since June 2018, I haven't seen any more offers after they achieved their user base.
Setting up a customer base (food delivery) - Since October / November 2018 I have observed the same trend of offers in food delivery. I have mostly observed/used UberEats.

It's human psychology to go for offers and feel happy with the X% savings from the offer.

Learning (Incentivised Approach)
Taking the same idea and applying it to learning: every learning course can be evaluated based on
  • Consistency in attending classes, consistency of learning
  • Solving problems, understanding concepts
  • Explaining the concept in their own words
  • Using AI to evaluate theoretical/experimental aspects of knowledge
  • Grading them against themselves - instead of comparative grades, provide individual progress over time based on collected data
  • Providing incentive points based on progress
  • These incentive points can be claimed for a special meal / toy / something for that day
Call and follow up when they are not participating regularly. These personalized reminders will also give them the motivation to continue the course. Since offers and promotions have a psychological impact at the individual level, the same concept can be used to motivate kids, with more personalized incentives that make learning more committed and encouraging for them.

Everything in life is connected through multiple aspects of actions, perspectives, and thoughts. I hope this idea gets implemented in learning apps to make them more encouraging for kids. Present marks as a measure of how good they currently are in that particular subject; do not give the impression that good scores mean they know everything.

All these courses should aid in creating long-term learning interest, consistent learning, and a creative, experimenting mindset.

Happy Mastering DL!!!

February 06, 2019

Day #207 - Dimensionality Reduction Notes

SVD - The sum of the squares of the singular values should be equal to the total variance in A

Matrix A can be expressed as
A = U S Vᵀ

U, V - Orthogonal
U - Left singular vectors
V - Right singular vectors

A is an m × n matrix
U is an m × n matrix with orthonormal columns (thin SVD)
S is an n × n diagonal matrix
V is an n × n orthogonal matrix

An m × n matrix with m > n has only n singular values, so the thin SVD represents A using just those n singular values instead of a full m × m factor for U.
  • Dimensionality reduction is done by neglecting the small singular values in the diagonal matrix S
  • The reduction is exploited only in the decomposed form
Output - Store the truncated forms of U, S, and V in place of A (a numpy sketch follows)
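
A minimal numpy sketch of the above (my illustration; the matrix size is arbitrary, and the 90% energy cutoff follows the rule of thumb quoted further below):

```python
import numpy as np

A = np.random.rand(100, 20)  # m x n with m > n

# Thin SVD: U is 100x20 with orthonormal columns, s holds the 20
# singular values, Vt is 20x20 orthogonal
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Sum of squared singular values == squared Frobenius norm of A
assert np.isclose((s ** 2).sum(), (A ** 2).sum())

# Keep enough singular values to retain 90% of the energy
energy = np.cumsum(s ** 2) / (s ** 2).sum()
k = int(np.searchsorted(energy, 0.90)) + 1

# Store the truncated U, s, Vt in place of A; A_k is the best rank-k
# approximation of A
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```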

Reference - Link

Eigenvectors
  • Satisfy A v = λ v (v is the eigenvector, λ the eigenvalue)
  • Certain directions only stretch; they don't change direction
Linear Dimensionality Reduction (PCA, SVD)
  • High-dimensional data (images, text, vectors of stock data)
  • Describe the data with only a few values


How Many Singular Values Should We Retain? - A useful rule of thumb is to retain enough singular values to make up 90% of the energy in Σ, Link

SVD - (Application in NLP) - Latent Semantic Analysis Notes
  • LSA applies singular value decomposition (SVD) to the matrix
  • In SVD, a rectangular matrix is decomposed into the product of three other matrices
  • One component matrix describes the original row entities as vectors of derived orthogonal factor values
  • Another describes the original column entities in the same way
  • Third is a diagonal matrix containing scaling values such that when the three components are matrix-multiplied, the original matrix is reconstructed
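
A minimal LSA sketch with scikit-learn (my illustration; the documents and the choice of 2 components are made up). TruncatedSVD plays the role of the SVD described above, applied to a TF-IDF term-document matrix:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = ["cats chase mice", "dogs chase cats",
        "stocks rose sharply", "markets and stocks fell"]

X = TfidfVectorizer().fit_transform(docs)  # documents x terms matrix
lsa = TruncatedSVD(n_components=2)         # keep 2 latent factors
doc_vectors = lsa.fit_transform(X)         # documents in latent space
print(doc_vectors.shape)                   # (4, 2)
```
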
LDA
  • The Dirichlet distribution takes a number (called alpha in most places) for each topic (or category)
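
A tiny illustration of how alpha shapes the per-document topic mixtures (the alpha values here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
# Small alpha -> sparse mixtures (each document dominated by few topics)
print(rng.dirichlet(alpha=[0.1, 0.1, 0.1], size=3))
# Large alpha -> even mixtures across topics
print(rng.dirichlet(alpha=[10.0, 10.0, 10.0], size=3))
```
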
More Read - Link

Happy Mastering DL!!!

February 05, 2019

Day #206 - AI for Social Cause

A few more ideas based on recent reads / patterns
  • Segment / predict crime networks based on telephone signals, vehicle movements, face recognition, card transactions, and crime activity
  • Auto-detect vehicle details from video for traffic violations
  • Predict children's learning issues with AI - attention, focus, writing, reading, interpretation, and micro skills
  • Spot early depression signs based on activity patterns
  • Predict drought based on patterns
  • Match missing children against a global face-similarity search database; predict child trafficking
  • AI for post-medication follow-up and drop-out prediction
  • AI for course study, drop-out prediction, and follow-up
  • Early intervention to detect / prevent obesity / diabetes
AI for Social Cause, AI for better humanity - I found this talk interesting.
Key Lessons
  • Direct advances for society's benefit
  • Health, safety, and wildlife conservation
  • Optimize resources
  • Wildlife - Use past poaching incidents to predict
  • Health - Homeless shelters, influence maximization, awareness of HIV, TB, obesity, health challenges
Safety and Security
Case #1 - Schedule Checkpoints and Patrols in Airports
  • Game theory for security resource optimization
  • Stackelberg security games model
  • Randomized vs. deterministic strategies
  • The defender commits to a randomized strategy
  • Coverage probabilities at different points in time
  • Samples from this strategy are used to generate schedules
  • Randomized checkpoints and detections (a toy sketch follows)
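
As the last bullet references, a toy sketch of turning a committed mixed strategy into concrete randomized schedules (the checkpoints and coverage probabilities are made up):

```python
import numpy as np

checkpoints = ["Terminal A", "Terminal B", "Cargo", "Perimeter"]
coverage = [0.4, 0.3, 0.2, 0.1]  # defender's committed mixed strategy

rng = np.random.default_rng(0)
# One checkpoint per patrol shift: over many shifts the empirical
# frequencies match the strategy, yet any single schedule stays
# unpredictable to an observing attacker
schedule = rng.choice(checkpoints, size=7, p=coverage)
print(schedule)
```
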
Case #2 - Assign Air Marshals to Flights
  • Assigning marshals to flights
  • The support set size is small
  • Solve the game matrix
  • Incremental strategy generation
  • Randomized scheduling


Case #3 - Patrols using Graphs
  • Different ways of patrol boat movements
  • Optimize and schedule the patrols
Conservation / Wildlife
  • Snares and traps to kill animals
  • Divide the area into grid squares
  • Mixed integer programming to generate patrols
  • Multiple boundedly rational poachers
  • Learn responses based on past poaching data
  • Finding a missing item in a grid cell
  • Ensemble of classifiers to predict
  • Classify areas into high risk / low risk based on historical data
  • Strategic signalling
  • Optimal deceptive signalling

Health
  • Awareness to reduce rates of HIV
  • Peer Leader Campaign
  • Peer Leaders
  • Homeless Shelters
Prevent TB in India
  • Low-resource communities
  • Non-adherence to TB treatment
  • Digital adherence tracking technology
  • Calling and reminding
  • Predict adherence from calling patterns
  • Everwell
  • Predict high-risk patients using SVM / RF algorithms (a toy sketch follows)
  • Mixed strategy (randomization across multiple predictors)
  • Decision-focused learning
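
As referenced above, a toy sketch of the high-risk prediction step with scikit-learn; the call-pattern features and labels here are synthetic placeholders, not Everwell data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((200, 5))                   # e.g., call-pattern features
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # 1 = high risk (synthetic)

# Train and evaluate both predictors; a mixed strategy would randomize
# across them, as the bullets above describe
for model in (SVC(), RandomForestClassifier(random_state=0)):
    model.fit(X[:150], y[:150])
    print(type(model).__name__, model.score(X[150:], y[150:]))
```
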
Prevent Suicides
  • Choose K gatekeepers
  • Solve this as a game
More Reads - http://teamcore.usc.edu/lecture.htm


IAAI Robert S. Engelmore Award Lecture: Milind Tambe (USC) | AI and Multiagent Systems for Social Good from AAAI Livestreaming on Vimeo.

Happy Mastering DL!!!