February 23, 2019
February 21, 2019
Analysis of MIT Deep Learning Projects
I spent some time analyzing the MIT Deep Learning projects. The healthcare projects are very inspiring. The projects span broad categories and different domains, and are a good read for learning about use cases and architectures.
Updated link
Happy Mastering DL!!!
Segmentation of Data Scientists
Data Scientists from the stats world - This cluster has PhDs from the 2000s who have been working in vision and analytics since then. Conversations with them were useful for handcrafting features for image processing problems. They know the algorithms, the basic math involved, the intuition behind them, and the limitations of the techniques.
Data Scientists with domain expertise - Laterals who upskilled with data science skills; the data science practitioner world. They can bridge the domain and data science use cases. Their strength lies in identifying data, building the pipeline, and envisioning the end-to-end flow.
Rookies - With MOOCs, Coursera, Udemy, and online sessions, data science has a lot of visibility and attention as an entry-level career choice. A lot of entry-level folks are getting deeper into, and good at, model building and feature engineering.
Kaggle Experts - The go-to people for feature engineering, parameter tuning, experimenting with models, applying ensemble techniques, and building models from anonymized data with the best accuracy.
My journey has been through databases, BI, and analytics. I use databases primarily for data analysis; the BI perspective helps me understand the data in its business context; and domain knowledge helps me quickly extract key data and build models. All this experience helps me find use cases, build features, build the model, and sell it to the business. I am still getting better at the *selling part*. I keep learning from my interactions with all the segments of Data Scientists.
Updated - 2022 - Feb 21
Labels:
Data Science,
Data Science Tips
February 19, 2019
February 18, 2019
Day #214 - Python Working with Arrays / Data Collection
Happy Mastering DL!!!
Labels:
Data Science,
Data Science Tips
February 17, 2019
Day #213 - Working with Sound and Python
Happy Mastering DL!!!
Labels:
Data Science,
Data Science Tips
February 14, 2019
Day #212 - OpenCV based Object Tracking Learnings
Happy Mastering DL!!!
Labels:
Data Science,
Data Science Tips,
OpenCV
Voice Powered SQL Assistant
SQLBot - I am your Query Assistant. What do you want me to do?
User - I want a query to join a few tables
SQLBot - Tell me the tables
User - Employee, Payment, JobDetails tables
SQLBot - Based on my analysis, these are the join columns: EmployeeId for Employee-JobDetails, JobId for the JobDetails and Payment tables
User - Give me the query
SQLBot - There are four indexes available. Which indexes do you want me to use? Any inputs?
User - Give the best possible query
SQLBot - I tried this query on 10K records and it took 2.3 seconds. Is that fine? Do you want me to populate 100K records and try again?
User - I will do it in the next sprint. Until then this is fine
SQLBot - Thank you. Small stats - Other users who worked on a similar query spent 40% more time analyzing than you did
User - Time to go, bye. Check in the code
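A minimal sketch of the kind of join the bot could hand back, using Python's built-in sqlite3. Only the table names and the EmployeeId / JobId join columns come from the dialogue; every other column, the sample rows, and the assumption that Payment carries a JobId are made up for illustration.

```python
import sqlite3

# Hypothetical schema: only the table names and the EmployeeId / JobId join
# columns come from the dialogue; other columns and all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee   (EmployeeId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE JobDetails (JobId INTEGER PRIMARY KEY, EmployeeId INTEGER, Title TEXT);
CREATE TABLE Payment    (PaymentId INTEGER PRIMARY KEY, JobId INTEGER, Amount REAL);
INSERT INTO Employee   VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO JobDetails VALUES (10, 1, 'Analyst'), (11, 2, 'Engineer');
INSERT INTO Payment    VALUES (100, 10, 1200.0), (101, 11, 1500.0), (102, 11, 300.0);
""")

# Employee -> JobDetails on EmployeeId, JobDetails -> Payment on JobId.
JOIN_QUERY = """
SELECT e.Name, j.Title, SUM(p.Amount) AS TotalPaid
FROM Employee e
JOIN JobDetails j ON j.EmployeeId = e.EmployeeId
JOIN Payment p    ON p.JobId = j.JobId
GROUP BY e.EmployeeId, j.JobId
ORDER BY e.EmployeeId
"""

rows = conn.execute(JOIN_QUERY).fetchall()
for row in rows:
    print(row)
```

Timing the same query on 10K and 100K generated rows, as the bot offers to do, would be a natural next step.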
Use Technology to add value on top of human intelligence :)
Happy Learning!!!
Labels:
chatbot,
Data Science
February 13, 2019
Day #211 - OpenCV based Optical Flow Example
Happy Mastering DL!!!
Labels:
Data Science,
Data Science Tips,
OpenCV
February 12, 2019
Day #210 - NLP Coding Snippets
Samples on entity extraction, keyword extraction, and sentiment analysis for evaluating sentences.
Happy Mastering DL!!!!
Labels:
Data Science,
Data Science Tips
February 11, 2019
Day #209 - Pandas DateTime Coding Snippets
Lessons from working on Pandas DateTime columns
Happy Mastering DL!!!
Labels:
Data Science,
Data Science Tips
Deep Life
Life is a form of reinforcement learning. I believe the growth-oriented mindset reflects reinforcement learning. Learn the lesson when you fail, re-apply the lesson when you succeed. Add a bit of randomness to explore newer, unexplored territories. Keep Learning!!! #ArtificialIntelligence #DeepLearning #rl
Labels:
Data Science,
Data Science Tips
February 10, 2019
Day #208 - OpenAI - Spinning Up in Deep RL Workshop - Part 1
Key Lessons
Reinforcement Learning
Formulate RL Problem
- AGI - Artificial General Intelligence
- Do Most Economically Valuable work
- Deep Reinforcement Learning trains Deep Networks with Trial and Error
- Function approximators - Deep Networks
- Good for Sequential Learning
- Good when we do not know optimal behavior
- RL is useful when evaluating behavior is easier than generating it
Deep Learning
- Good for High Dimensional Data
- Approximate a function
- Video Games
- For Decision Rules
Recap of DL Patterns
- Finding a model that gives the right output for given inputs
- Output of each layer is a re-arrangement of its input with a non-linearity applied
- Loss function is differentiable with respect to the parameters in the model
- Compute how the loss changes with respect to changes in the parameters
- Function composition is the core of the model
- Function topology with multiple architectures
- Non-linearity does a lot of work
- Successive layers represent more complex features
- LSTM (RNN) - Accepts a time series as input and produces a time series as output
- Transformer - Allows the network to attend (attention) over several inputs
- Attention Neural Networks - Select the most meaningful details from data, make decisions based on a lot of data
- Regularizers - Trade off loss against something that is not dependent on the task; they do a better job at generalization
- Adaptive Optimizer
- Agent that interacts with environment
- Agent picks and executes action
- With the new environment state, the agent proceeds further
- What decision maximizes rewards
- Attains the goal with Trial and Error
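The agent-environment loop described above can be sketched in a few lines. This is a toy illustration, not the workshop's code: the `CoinFlipEnv` class, its `reset` / `step` methods, and the two fixed policies are hypothetical names chosen for the example.

```python
import random

class CoinFlipEnv:
    """Toy environment (hypothetical): guess a biased coin, reward 1 per correct guess."""

    def __init__(self, p_heads=0.8, horizon=100, seed=0):
        self.p_heads = p_heads
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # a single dummy observation

    def step(self, action):
        outcome = 1 if self.rng.random() < self.p_heads else 0
        reward = 1.0 if action == outcome else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return 0, reward, done

def run_episode(env, policy):
    """Agent picks an action, the environment responds, repeat until done."""
    obs, done, total = env.reset(), False, 0.0
    while not done:
        action = policy(obs)
        obs, reward, done = env.step(action)
        total += reward
    return total

def always_heads(obs):
    return 1

def always_tails(obs):
    return 0

# Same seed, so both policies face the same coin flips.
ret_heads = run_episode(CoinFlipEnv(seed=0), always_heads)
ret_tails = run_episode(CoinFlipEnv(seed=0), always_tails)
print(ret_heads, ret_tails)
```

Comparing the two returns is the "evaluate" step; a learning agent would use that signal to shift toward the better action.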
Observations And Actions
- Observations are continuous
- Actions may be discrete or continuous
Policy
- Stochastic (actions are sampled randomly)
- Deterministic (maps directly to an action with no randomness)
- Randomness is helpful
- Logits (unnormalized scores for each action)
- Action probabilities are the softmax of the logits
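A small sketch of a categorical policy head, assuming three discrete actions and made-up logits. In practice the logits would come from a neural network; the function names here are illustrative.

```python
import math
import random

def softmax(logits):
    """Turn unnormalized logits into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(logits, rng):
    """Stochastic policy: sample an action index from softmax(logits)."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

def greedy_action(logits):
    """Deterministic policy: always pick the highest-scoring action."""
    return max(range(len(logits)), key=lambda i: logits[i])

logits = [2.0, 1.0, 0.1]  # made-up scores for three actions
probs = softmax(logits)
rng = random.Random(0)
samples = [sample_action(logits, rng) for _ in range(1000)]
print(probs, greedy_action(logits))
```

The sampled version keeps some randomness, which is what lets the agent try actions it currently rates lower.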
Reward function - Measures how good / bad an outcome is. More positive is better
Value functions - How much reward is expected
Value functions satisfy the Bellman equation
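To make the Bellman equation concrete, here is a sketch of iterative policy evaluation on a tiny two-state chain. The transition table, rewards, and discount factor are invented for the example. The value of a state is the expected immediate reward plus the discounted value of the next state.

```python
GAMMA = 0.9
TERMINAL = None

# Hypothetical 2-state chain under a fixed policy:
# transitions[s] = list of (probability, next_state, reward)
transitions = {
    0: [(1.0, 1, 1.0)],
    1: [(0.5, 0, 0.0), (0.5, TERMINAL, 2.0)],
}

def bellman_backup(values, s):
    """Right-hand side of the Bellman equation for state s."""
    return sum(
        p * (r + GAMMA * (0.0 if nxt is TERMINAL else values[nxt]))
        for p, nxt, r in transitions[s]
    )

def evaluate_policy(tol=1e-10):
    """Apply Bellman backups repeatedly until the values stop changing."""
    values = {s: 0.0 for s in transitions}
    while True:
        new_values = {s: bellman_backup(values, s) for s in transitions}
        if max(abs(new_values[s] - values[s]) for s in transitions) < tol:
            return new_values
        values = new_values

V = evaluate_policy()
print(V)
```

At convergence each value equals its own backup, which is exactly what "satisfies the Bellman equation" means.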
Types of RL Algos
Model, Environment based models
Try - Evaluate - Improve the policy
Policy Optimization
- Run the policy for a complete trajectory
- Represent the policy with a neural network
- The parameters define the action distribution
- Bring the gradient inside the expectation
- Expectation is over trajectories
- Starting state is drawn from some distribution
- Markov property - the next state depends only on the current state, not on previous states
- Every action will get some update
- Reward-to-go policy gradient
- Advantage functions
- How much better an action is than average
- N-step advantage estimates
- Initial weight assignments do matter while setting up the system
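A short sketch of the reward-to-go and advantage ideas from the list above. The rewards and discount are made up, and the mean return stands in for the baseline purely for illustration; a real setup would typically use a learned value function.

```python
def reward_to_go(rewards, gamma=1.0):
    """Discounted sum of rewards from each timestep to the end of the trajectory."""
    out = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out

def advantages(returns, baselines):
    """How much better each action's return is than the baseline estimate."""
    return [g - b for g, b in zip(returns, baselines)]

rewards = [1.0, 0.0, 2.0]                 # made-up rewards for one trajectory
rtg = reward_to_go(rewards, gamma=0.5)
baseline = sum(rtg) / len(rtg)            # stand-in baseline (assumption)
adv = advantages(rtg, [baseline] * len(rtg))
print(rtg, adv)
```

With reward-to-go, each action is credited only with rewards that came after it, which reduces the variance of the gradient estimate.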
Next is Part II
Happy Mastering DL!!!
Labels:
Data Science,
Data Science Tips
February 07, 2019
Startup Idea - Incentivised Learning for Kids - Paid Online Courses
Taking a cue from cabs and food delivery, we can apply the same approach to online classrooms / teaching kids / MOOC courses.
Students buy a course for X%, and a certain portion of it is used to surprise them with desserts / a good meal / toys or something that would motivate them on a daily basis. "If I complete this I might get an ice cream, so I will do this." (A little bribe to read.)
Setting up Customer Base (Cabs) - OLA and Uber rides initially had heavy discounts for users and incentives for cab drivers. I remember applying coupons heavily and the satisfaction of saving X% on every ride.
Stabilization Phase - Since June 2018, I have not seen any more offers, now that they have achieved their user base.
Setting up Customer Base (Food Delivery) - Since October / November 2018 I have again observed this trend of offers in food delivery. I have observed/used UberEats heavily.
It's human psychology to go for offers / feel happy with the X% savings from the offer.
Learning (Incentivised Approach)
Taking the same idea and applying it to learning methodology, every learning course can be evaluated based on
- Consistency in attending classes, Consistency of Learning
- Solving problems, Concept Understanding
- Explaining the concept in own terminology
- Using AI to evaluate theoretical/experimental aspects of knowledge
- Grade them against themselves; instead of a comparative grade, provide individual progress from time to time based on collected data
- Based on progress provide incentive points
- These incentive points can be claimed for a special meal / toy / something for that day
Call and follow up when they are not attending regularly. These personalized reminders will also give them motivation to continue the course. Offers and promotions have a psychological impact at an individual level. The same concept can be used for kids' learning, to motivate them and provide more personalized incentives that keep them committed and encouraged.
Everything in life is connected with multiple aspects of actions, perspectives, and thoughts. I hope this idea is implemented in learning apps to make them more encouraging for kids. Present marks as a sign that they are currently good at this particular subject. Do not give the impression that good scores mean you know everything.
All these courses should aid in creating long-term learning interests and a consistent learning / creative thinking / experimenting mindset.
Happy Mastering DL!!!
Labels:
Data Science,
Data Science Tips
February 06, 2019
Day #207 - Dimensionality Reduction Notes
SVD - The sum of the squares of the singular values equals the squared Frobenius norm of A, which is the total variance in A (up to a constant factor) when its columns are mean-centered
Matrix A can be expressed as
A = U S V^T
U, V - Orthogonal
U - Left singular vectors
V - Right singular vectors
A is an m × n matrix
U is an m × n matrix with orthonormal columns
S is an n × n diagonal matrix
V is an n × n orthogonal matrix
Since an m × n matrix with m > n has only n singular values, this is equivalent to the full SVD (with an m × m matrix U) using only the n singular values.
- Dimensionality reduction is done by neglecting small singular values in the diagonal matrix S
- Feature of dimensionality reduction is only exploited in the decomposed version
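Dropping the small singular values can be written out as follows. This is the standard rank-k truncation, with the singular values assumed sorted in decreasing order; the symbols k, u_i, v_i and sigma_i are notation added here, not taken from the notes.

```latex
A \approx A_k = U_k S_k V_k^{T} = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0

\lVert A - A_k \rVert_F^{2} = \sum_{i=k+1}^{n} \sigma_i^{2}
```

The second line shows why neglecting small singular values is cheap: the squared reconstruction error is exactly the sum of the squares of the discarded values.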
Reference - Link
Eigen Vectors
- Satisfy A v = λ v, where v is the eigenvector and λ is the eigenvalue
- Certain lines are only stretched; they don't change direction
- High Dimensional Data (Images, Text, Vector of Stock Data)
- Describe the data with only few values
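A quick check of the eigenvector condition A v = λ v on a small symmetric matrix, in plain Python. The matrix and helper names are chosen for illustration.

```python
def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenpair(A, v, lam, tol=1e-9):
    """True if A v equals lam * v, i.e. v is only stretched, not rotated."""
    Av = matvec(A, v)
    return all(abs(av - lam * x) < tol for av, x in zip(Av, v))

A = [[2.0, 1.0],
     [1.0, 2.0]]

print(is_eigenpair(A, [1.0, 1.0], 3.0))    # direction (1, 1) is stretched by 3
print(is_eigenpair(A, [1.0, -1.0], 1.0))   # direction (1, -1) is left unchanged
print(is_eigenpair(A, [1.0, 0.0], 2.0))    # (1, 0) changes direction, so not an eigenvector
```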
How Many Singular Values Should We Retain? - A useful rule of thumb is to retain enough singular values to make up 90% of the energy in Σ, Link
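The 90% rule of thumb can be sketched as below, taking "energy" to mean the sum of the squared singular values. The singular values are made up, the function name is illustrative, and the input is assumed to contain at least one nonzero value.

```python
def components_for_energy(singular_values, threshold=0.90):
    """Smallest k such that the top-k singular values hold `threshold` of the energy."""
    energies = [s * s for s in sorted(singular_values, reverse=True)]
    total = sum(energies)  # assumed nonzero
    running = 0.0
    for k, e in enumerate(energies, start=1):
        running += e
        if running / total >= threshold:
            return k
    return len(energies)

sigma = [10.0, 5.0, 2.0, 1.0]  # made-up singular values
k = components_for_energy(sigma)
print(k)
```

Here the energies are 100, 25, 4 and 1 (total 130), so the first two values already cover 125 / 130 of the energy.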
SVD - (Application in NLP) - Latent Semantic Analysis Notes
- LSA applies singular value decomposition (SVD) to the matrix
- In SVD, a rectangular matrix is decomposed into the product of three other matrices
- One component matrix describes the original row entities as vectors of derived orthogonal factor values
- Another describes the original column entities in the same way
- Third is a diagonal matrix containing scaling values such that when the three components are matrix-multiplied, the original matrix is reconstructed
- The Dirichlet distribution takes a number (called alpha in most places) for each topic (or category)
Happy Mastering DL!!!
Labels:
Data Science,
Data Science Tips
February 05, 2019
Day #206 - AI for Social Cause
More Reads - http://teamcore.usc.edu/lecture.htm
Few more ideas based on Recent Reads / Patterns
- Segment / Predict crime networks based on telephone signals, vehicle movements, face recognition, card transactions, crime activity
- Auto-detect vehicle details from video for traffic violations
- Predict child learning issues with AI: attention, focus, writing, reading, interpretation and micro skills
- Spot early depression signs based on activity patterns
- Predict drought based on patterns
- Map missing children with face similarity search against a global database, Predict child trafficking
- AI for post medication follow-up and drop out prediction
- AI for course study, drop out prediction and follow up
- Early Intervention to detect / prevent obesity / Diabetes
Key Lessons
- Direct Advances for Society Benefit
- Health, Safety and Wildlife Conservation
- Optimize resources
- Wildlife - Use past poaching incidents to predict
- Health - Homeless shelters, Influence Maximization, Awareness of HIV, TB, Obesity, Health Challenges
Case #1 - Schedule Checkpoints and Patrols in Airport
- Game Theory for Security Resource Optimization
- Stackelberg Security Games Model
- Randomness / Deterministic
- Defender commits to randomized strategy
- Probability at different points in time
- These samples are used to generate the schedule
- Randomized checkpoints, detections
Case #2 - Assign Air Marshals to Flights
- Assigning marshals to flights
- Support set size is small
- Solve the Game Matrix
- Incremental Strategy generation
- Randomization of Scheduling
Case #3 - Patrols using Graphs
- Different ways of patrol boat movements
- Optimize and Schedule for patrol
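The Stackelberg security game idea from the cases above (the defender commits to a randomized strategy, the attacker observes it and best-responds, and schedules are sampled from the committed probabilities) can be sketched on a toy problem. Everything here is an assumption for illustration: two targets, one defender resource, made-up payoffs, and a simple grid search in place of the solvers used in the real deployments.

```python
import random

# Hypothetical payoffs per target:
# (defender if covered, defender if uncovered, attacker if covered, attacker if uncovered)
TARGETS = [
    (1.0, -5.0, -1.0, 5.0),   # high-value target
    (1.0, -2.0, -1.0, 2.0),   # lower-value target
]

def expected_utils(coverage):
    """Expected (defender, attacker) utility if each target is attacked."""
    out = []
    for c, (d_cov, d_unc, a_cov, a_unc) in zip(coverage, TARGETS):
        out.append((c * d_cov + (1 - c) * d_unc, c * a_cov + (1 - c) * a_unc))
    return out

def defender_value(coverage):
    """Defender utility when the attacker best-responds to the observed coverage."""
    utils = expected_utils(coverage)
    best_attack = max(a for _, a in utils)
    # Ties are broken in the defender's favour (strong Stackelberg assumption).
    return max(d for d, a in utils if abs(a - best_attack) < 1e-9)

def best_commitment(steps=1000):
    """Grid search over how to split one resource between the two targets."""
    candidates = [(i / steps, 1 - i / steps) for i in range(steps + 1)]
    return max(candidates, key=defender_value)

coverage = best_commitment()
rng = random.Random(0)
schedule = rng.choices([0, 1], weights=coverage, k=7)  # target guarded on each of 7 days
print(coverage, schedule)
```

The search settles near two thirds coverage on the high-value target, the point where the attacker is roughly indifferent between the two; guarding either target all the time does worse for the defender.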
Conservation / Wildlife
- Snares and traps used to kill animals
- Divide into Grid Squares
- Mixed integer programming to generate patrols
- Multiple bounded rational poachers
- Learn response based on past poaching data
- Finding a missing item in a grid cell
- Ensemble of classifiers to predict
- Classify into high risk / low risk areas based on historical data
- Strategic Signalling
- Optimal Deceptive Signalling
Health
- Awareness to reduce rates of HIV
- Peer Leader Campaign
- Peer Leaders
- Homeless Shelters
Prevent TB in India
- Low resource community
- Non-adherence of TB treatment
- Digital Adherence tracking technology
- Calling and reminding
- From Calling pattern predict adherence
- Everwell
- Predict High-Risk using SVM / RF Algo
- Mixed Strategy (Randomization with multiple predictors)
- Decision Focused Learning
Prevent Suicides
IAAI Robert S. Engelmore Award Lecture: Milind Tambe (USC) | AI and Multiagent Systems for Social Good from AAAI Livestreaming on Vimeo.
Happy Mastering DL!!!