"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

February 23, 2019

XGBoost on Windows, Python 3

#Install XGBoost on Windows, Python 3
#Step 1
anaconda search -t conda xgboost
#Step 2
conda install -c mikesilva xgboost
#Step 3
pip install xgboost
from xgboost import XGBClassifier
classifier = XGBClassifier()
classifier.fit(x1,y1)
#https://towardsdatascience.com/fine-tuning-xgboost-in-python-like-a-boss-b4543ed8b1e
Happy Mastering DL!!!

February 21, 2019

SVD Summary

Recommendations

Happy Learning!!!

Analysis of MIT Deep Learning Projects

I spent some time analyzing the MIT Deep Learning projects. Very inspiring, especially the healthcare projects. They span broad categories and different domains - a good read to understand use cases and architectures.


Updated link

Happy Mastering DL!!!

Segmentation of Data Scientists

Data Scientists from the stats world - This cluster has PhDs from the 2000s and has been working in vision and analytics since around 2000. Conversations with them were useful for handcrafting features for image processing problems. They know the algorithms, the underlying math, the intuition, and the limitations of techniques.

Data Scientists with domain expertise - Laterals who upskilled with data science skills; the data science practitioner world. They can bridge the domain and data science use cases. Their strength lies in identifying data, building the pipeline, and envisioning the end-to-end flow.

Rookies - With MOOCs (Coursera, Udemy, online sessions), data science has a lot of visibility and attention as an entry-level career choice. A lot of entry-level folks are getting deeper into building models, getting good at model building and feature engineering.

Kaggle Experts - The go-to people for feature engineering, parameter tuning, experimenting with models, applying ensemble techniques, and building the most accurate models from anonymized data.

My journey has been through databases, BI, and analytics. I use databases primarily for data analysis, the BI perspective helps me understand data in its business context, and domain knowledge helps me quickly extract key data and build models. All this experience helps to find use cases, engineer features for data models, build the model, and sell it to the business. I am still getting better at the *selling part*. I keep learning from my interactions with all these segments of Data Scientists.

Updated - 2022 - Feb 21


Ref - Link


Happy Mastering DL!!!


February 19, 2019

Day #215 - Deep Dive OpenCV


#cv2.getStructuringElement
#===============================
#Generate Different Kernel Combinations
kernel1 = cv2.getStructuringElement(cv2.MORPH_RECT,(5,5))
kernel2 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
kernel3 = cv2.getStructuringElement(cv2.MORPH_CROSS,(5,5))
#cv2.filter2D
#==============
#OpenCV provides a function cv2.filter2D() to convolve a kernel with an image
#Example1
import cv2
import numpy as np
original_image = cv2.imread(r'E:\Opencv_Examples\goalkeeper.jpg')
kernel1 = cv2.getStructuringElement(cv2.MORPH_RECT,(5,5))
kernel2 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
kernel3 = cv2.getStructuringElement(cv2.MORPH_CROSS,(5,5))
conv1 = cv2.filter2D(original_image ,-1,kernel1)
conv2 = cv2.filter2D(original_image ,-1,kernel2)
conv3 = cv2.filter2D(original_image ,-1,kernel3)
cv2.imshow('conv1',conv1)
cv2.imshow('conv2',conv2)
cv2.imshow('conv3',conv3)
cv2.waitKey(0)
cv2.destroyAllWindows()
#cv2.calcHist
#=============
#cv2.calcHist() function to find the histogram
#cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])
#channels - [0] for a grayscale image; for a color image, [0], [1] or [2] to calculate the histogram of the blue, green or red channel
#mask : mask image. To find histogram of full image, it is given as "None"
#histSize : this represents our BIN count. Need to be given in square brackets. For full scale, we pass [256].
#ranges : this is our RANGE. Normally, it is [0,256].
##Example2
import cv2
original_image = cv2.imread(r'E:\Opencv_Examples\goalkeeper.jpg',cv2.IMREAD_GRAYSCALE)
hist = cv2.calcHist([original_image],[0],None,[256],[0,256])
original_image = cv2.imread(r'E:\Opencv_Examples\goalkeeper.jpg',cv2.IMREAD_COLOR)
hist_b = cv2.calcHist([original_image],[0],None,[256],[0,256])
hist_g = cv2.calcHist([original_image],[1],None,[256],[0,256])
hist_r = cv2.calcHist([original_image],[2],None,[256],[0,256])
from matplotlib import pyplot as plt
plt.plot(hist_b)
plt.ylabel('Values')
plt.show()
#cv2.threshold
#===============
#If a pixel value is greater than the threshold value, it is assigned one value (for example white),
#otherwise it is assigned another value (for example black)
#Minor changes from https://docs.opencv.org/3.4/d7/d4d/tutorial_py_thresholding.html
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread(r'E:\Opencv_Examples\goalkeeper.jpg',cv2.IMREAD_GRAYSCALE)
ret,thresh1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
ret,thresh2 = cv2.threshold(img,127,255,cv2.THRESH_BINARY_INV)
ret,thresh3 = cv2.threshold(img,127,255,cv2.THRESH_TRUNC)
ret,thresh4 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO)
ret,thresh5 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO_INV)
titles = ['Original Image','BINARY','BINARY_INV','TRUNC','TOZERO','TOZERO_INV']
images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]
for i in range(6):
    plt.subplot(2,3,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()
#cv2.calcBackProject
#===================
#cv2.calcBackProject() - its parameters are almost the same as cv2.calcHist()
#the object histogram should be normalized before passing it to the backproject function
#one of the inputs is the histogram of the object we want to find
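#A minimal illustrative sketch (not from the original post); the crop below is an arbitrary region of interest
original_image = cv2.imread(r'E:\Opencv_Examples\goalkeeper.jpg',cv2.IMREAD_COLOR)
roi = original_image[100:200, 100:200]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
hsv_img = cv2.cvtColor(original_image, cv2.COLOR_BGR2HSV)
#hue histogram of the region of interest, normalized before back projection
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
#back-project the histogram onto the full image; bright pixels are likely to belong to the object
back_proj = cv2.calcBackProject([hsv_img], [0], roi_hist, [0, 180], 1)
cv2.imshow('back_proj', back_proj)
cv2.waitKey(0)
cv2.destroyAllWindows()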
#cv2.merge
#=========
import cv2
img = cv2.imread(r'E:\Opencv_Examples\goalkeeper.jpg',cv2.IMREAD_COLOR)
b,g,r = cv2.split(img)
cv2.imshow('b',b)
cv2.imshow('g',g)
cv2.imshow('r',r)
img2 = cv2.merge((b,g,g))
cv2.imshow('bgg',img2)
img3 = cv2.merge((b,g,r))
cv2.imshow('bgr',img3)
cv2.waitKey(0)
cv2.destroyAllWindows()
#cv2.bitwise_and
#===============
#This includes bitwise AND, OR, NOT and XOR operations.
#They will be highly useful while extracting any part of the image
#cv2.bitwise_and
#cv2.bitwise_not
#https://docs.opencv.org/3.2.0/d0/d86/tutorial_py_image_arithmetics.html
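#A minimal masking sketch (not in the original post); the rectangle below is an arbitrary region to keep
import cv2
import numpy as np
img = cv2.imread(r'E:\Opencv_Examples\goalkeeper.jpg',cv2.IMREAD_COLOR)
mask = np.zeros(img.shape[:2], dtype=np.uint8)
mask[50:200, 50:200] = 255
#pixels outside the mask become black; bitwise_not inverts every pixel value
kept = cv2.bitwise_and(img, img, mask=mask)
inverted = cv2.bitwise_not(img)
cv2.imshow('kept', kept)
cv2.imshow('inverted', inverted)
cv2.waitKey(0)
cv2.destroyAllWindows()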
Happy Mastering DL!!!

February 18, 2019

Day #214 - Python Working with Arrays / Data Collection


#Working with arrays
import numpy as np
a = np.array([[1,2,3],[3,4,5]])
print('a')
print(a)
#Reshape as three rows 2 columns
b = a.reshape(3,2)
print('b')
print(b)
#Reshape as two rows and three columns
c = a.reshape(2,3)
print('c')
print(c)
#Transpose
d = a.T
print('d')
print(d)
#Can only specify one unknown dimension
#unknown dimension for rows but two columns
e = a.reshape(-1,2)
print('e')
print(e)
#2 rows one unknown dimension for columns
f = a.reshape(2,-1)
print('f')
print(f)
#parse the array
i = 0
print('print values of a')
for row in a:
    for value in row:
        print('position',str(i))
        i = i+1
        print(value)
#list = []
#tuple = ()
#sets = {}
#dictionary = {}
sportslist = ['cricket','hockey','tennis','badminton']
vegetablestuple = ('brinjal','tomato','beetroot','drumstick')
foodmenuset = {'biryani','roti','rice','curd'}
dicthotels = {'india':'Delhi','china':'beijing','srilanka':'colombo','pakistan':'islamabad'}
dicthotelscities = {'india':['Delhi','chennai','mumbai'],'china':['beijing','shangai']}
print('sportslist')
print('_______________')
for name in sportslist:
    print(name)
print('vegetablestuple')
print('_______________')
for name in vegetablestuple:
    print(name)
print('dicthotels')
print('_______________')
for key,value in dicthotels.items():
    print(key)
    print(value)
print('dicthotelscities')
print('_______________')
for key,value in dicthotelscities.items():
    print(key)
    for city in value:
        print(city)
Happy Mastering DL!!!

February 17, 2019

Day #213 - Working with Sound and Python


#pip install librosa
import librosa
import matplotlib.pyplot as plt
import librosa.display
import numpy as np
path = r'D:\PetProject\Audio_Analytics\UrbanSound.tar\UrbanSound\data\air_conditioner\204240.wav'
#waveform `y`
#Store the sampling rate as `sr`
#By default, this uses resampy’s high-quality mode (‘kaiser_best’).
y,sr = librosa.load(path,res_type='kaiser_fast')
plt.figure(figsize=(10,5))
librosa.display.waveplot(y,sr=sr)
#Mel frequency cepstral coefficients (MFCCs)
#small set of features
mfcc = librosa.feature.mfcc(y=y, sr=sr)
print(mfcc.shape)
#numpy.ndarray of size (n_mfcc, T)
mfccs=np.mean(librosa.feature.mfcc(y=y,sr=sr,n_mfcc=40).T,axis=0)
print(mfccs)
print(mfccs.shape)
#Load 5 seconds of a wav file, starting 15 seconds in
y,sr = librosa.load(path,res_type='kaiser_fast', offset=15.0, duration=5.0)
plt.figure(figsize=(10,5))
librosa.display.waveplot(y,sr=sr)
#Load a wav file and resample to 11 KHz
y,sr = librosa.load(path,res_type='kaiser_fast',sr=11025)
plt.figure(figsize=(10,5))
librosa.display.waveplot(y,sr=sr)
Happy Mastering DL!!!

February 14, 2019

Day #212 - OpenCV based Object Tracking Learning's


#https://docs.opencv.org/3.1.0/db/df8/tutorial_py_meanshift.html
#Yolo + Meanshift - OpenCV
#Write comment for each line that you don't understand :)
import numpy as np
import cv2
cap = cv2.VideoCapture(r'E:\Optical_Flow\slow.flv')
ret, frame = cap.read()
#frame = cv2.resize(old_frame, (500, 400))
#set up initial location window
r,h,c,w = 250,90,400,125 #assign it based on Yolo
track_window = (c,r,w,h)
#set up roi for tracking
roi = frame[r:r+h,c:c+w]
#Converts an image from one color space to another.
hsv_roi = cv2.cvtColor(roi,cv2.COLOR_BGR2HSV)
#The cv2.inRange - three arguments
#first is the image to perform color detection
#second - lower limit of the color you want to detect
#third argument - upper limit of the color you want to detect.
mask = cv2.inRange(hsv_roi,np.array((0.,60.,32.)),np.array((180.,255.,255.)))
#cv2.calcHist to calculate the histogram of an image
#cv2.calcHist(images, channels, mask, bins, ranges)
#For grayscale images there is only one channel, [0]; for color images use [0], [1] or [2]
#bins - is a list containing the number of bins to use for each channel
#ranges - is the range of the possible pixel values, which is [0, 256] in case of RGB color space (where 256 is not inclusive).
roi_hist = cv2.calcHist([hsv_roi],[0],mask,[180],[0,180])
#its normalized values will be R/S, G/S and B/S (where, S=R+G+B).
#when normType=NORM_MINMAX (for dense arrays only).
#The optional mask specifies a sub-array to be normalized.
#This means that the norm or min-n-max are calculated over the sub-array, and then this sub-array is modified to be normalized
cv2.normalize(roi_hist,roi_hist,0,255,cv2.NORM_MINMAX)
#Set up termination criteria: either 10 iterations or move by at least 1 pt
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT,10,1)
while(1):
    ret, frame = cap.read()
    if ret == True:
        hsv = cv2.cvtColor(frame,cv2.COLOR_BGR2HSV)
        #Calculates the back projection of a histogram.
        #images, channels, hist, ranges, scale[, dst]
        dst = cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)
        #apply mean shift to get the new location
        #move that window to the area of maximum pixel density (or maximum number of points)
        #the movement is reflected in the histogram backprojected image.
        #As a result, the meanshift algorithm moves our window to the new location with maximum density
        ret, track_window = cv2.meanShift(dst,track_window,term_crit)
        #draw it on the image
        x,y,w,h = track_window
        img2 = cv2.rectangle(frame,(x,y),(x+w,y+h),255,2)
        cv2.imshow('img2',img2)
        k = cv2.waitKey(0) & 0xff
        if k==27:
            break
    else:
        break
cv2.destroyAllWindows()
cap.release()
Happy Mastering DL!!!

Voice Powered SQL Assistant

SQLBot - I am your query assistant. What do you want me to do?
User - I want a query to join a few tables

SQLBot - Tell me the tables
User - Employee, Payment, JobDetails tables

SQLBot - Based on my analysis, these are the join columns: EmployeeId for Employee-JobDetails, JobId for JobDetails-Payment
User - Give me the query

SQLBot - There are four indexes available. Which indexes do you want me to use? Any inputs?
User - Give me the best possible query

SQLBot - I tried this query on 10K records and it took 2.3 seconds. Is that fine? Do you want me to populate 100K records and try again?
User - I will do it in the next sprint; until then this is fine

SQLBot - Thank you. A small stat - other users who worked on a similar query spent 40% more time analyzing than you did
User - Time to go, bye. Check in the code

Use Technology to add value on top of human intelligence :)

Happy Learning!!!

February 13, 2019

Day #211- OpenCV based Optical Flow Example


#Modified and updated opencv example based on my requirements
#https://docs.opencv.org/3.1.0/d7/d8b/tutorial_py_lucas_kanade.html
import numpy as np
import cv2
cap = cv2.VideoCapture(r'E:\Optical_Flow\Demo.mp4')
#params for Shi-Tomasi corner detection
feature_params = dict(maxCorners=100,qualityLevel=0.3,minDistance=7,blockSize=7)
#parameters for lucas kanade optical flow
lk_params = dict(winSize=(15,15),maxLevel=2,criteria=(cv2.TERM_CRITERIA_EPS|cv2.TERM_CRITERIA_COUNT,10,0.03))
#Create some random colors
color = np.random.randint(0,255,(100,3))
#Take first frame and find corners in it
ret, old_frame = cap.read()
image = cv2.resize(old_frame, (500, 400))
old_gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(old_gray,mask=None,**feature_params)
#create mask for drawing purpose
mask = np.zeros_like(image)
final_frame = np.zeros_like(image)
while(1):
    flag, frame1 = cap.read()
    if flag:
        frame = cv2.resize(frame1, (500, 400))
        frame_gray = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
        #calculate optical flow
        p1,st,err = cv2.calcOpticalFlowPyrLK(old_gray,frame_gray,p0,None,**lk_params)
        #select good points
        good_new = p1[st==1]
        good_old = p0[st==1]
        print(good_new)
        print(good_old)
        #draw the tracks
        for i,(new,old) in enumerate(zip(good_new,good_old)):
            a,b = new.ravel()
            c,d = old.ravel()
            mask = cv2.line(mask,(a,b),(c,d),color[i].tolist(),2)
            frame = cv2.circle(frame,(a,b),5,color[i].tolist())
        img = cv2.add(frame,mask)
        cv2.imshow('Intermediate frame',img)
        final_frame = img.copy()
        k = cv2.waitKey(30) & 0xff
        if k==27:
            break
        #now update previous frame and previous points
        old_gray = frame_gray.copy()
        p0 = good_new.reshape(-1,1,2)
    else:
        break
cv2.destroyAllWindows()
#Hough Transformation
#Line Count
gray = cv2.cvtColor(final_frame,cv2.COLOR_BGR2GRAY)
img_gaussian = cv2.GaussianBlur(gray,(3,3),0)
img_sobelx = cv2.Sobel(img_gaussian,cv2.CV_8U,1,0,ksize=5)
img_sobely = cv2.Sobel(img_gaussian,cv2.CV_8U,0,1,ksize=5)
#Compute lines using Hough Transformation
xlines = cv2.HoughLines(img_sobelx,1,np.pi/180,200)
ylines = cv2.HoughLines(img_sobely,1,np.pi/180,200)
if(ylines is not None):
    if(xlines is not None):
        print('sobel - HoughLines')
        c = ylines.size/xlines.size
        print(c)
        print('X Line Counts')
        print(xlines)
        print('Y Line Counts')
        print(ylines)
cv2.imshow("Final Frame",final_frame)
cv2.waitKey(0)
cv2.destroyAllWindows()
cap.release()
Happy Mastering DL!!!

February 12, 2019

Day #210 - NLP Coding Snippets

Samples of entity extraction, keyword extraction, and sentiment analysis for evaluating sentences.

#pip install spacy
#python -m spacy download en
#pip install multi-rake
from multi_rake import Rake
import spacy
nlp = spacy.load('en')
from textblob import TextBlob
def computekeywords(sentence):
    print(sentence)
    doc = nlp(sentence)
    rake = Rake()
    print('using spacy')
    for ent in doc.ents:
        print(ent.text, ent.label_)
    keywords = rake.apply(sentence)
    print('Keywords using Rake')
    print(keywords)
    print('Sentiment of the sentence')
    analysis = TextBlob(sentence)
    if analysis.sentiment[0] > 0:
        intent = 'Positive'
    elif analysis.sentiment[0] < 0:
        intent = 'Negative'
    else:
        intent = 'Neutral'
    print(intent)
sentence = "I need car insurance"
computekeywords(sentence)
sentence2 = "I lost my credit card"
computekeywords(sentence2)
#I need car insurance
#using spacy
#Keywords using Rake
#[('car insurance', 4.0)]
#Sentiment of the sentence
#Neutral
#I lost my credit card
#using spacy
#Keywords using Rake
#[('credit card', 4.0), ('lost', 1.0)]
#Sentiment of the sentence
#Neutral
Happy Mastering DL!!!!

February 11, 2019

Day #209 - Pandas DateTime Coding Snippets

Lessons working on Pandas DateTime columns

import numpy as np
from dateutil.parser import parse
a = parse('2012-11-01 02:48:30')
print(a)
import time
print(int(time.mktime(a.timetuple())))
import pandas as pd
data = [['Alex','2012-11-01 02:48:30'],['Bob','2011-11-01 02:48:30'],['Clarke','2013-11-01 02:48:30']]
#Create a dataframe
df = pd.DataFrame(data,columns=['Name','EmploymentDate'])
#convert str to date time
df['Date_time1'] = pd.to_datetime(df['EmploymentDate'])
#convert date time to int value
df['date_time_int'] = df.Date_time1.astype(np.int64)
print(df)


Happy Mastering DL!!!

Deep Life

Life is a form of reinforcement learning. I believe the growth-oriented mindset reflects reinforcement learning. Learn the lesson when you fail, re-apply the lesson when you succeed. Add a bit of randomness to evaluate newer unexplored territories. Keep Learning!!! #ArtificialIntelligence #DeepLearning #rl

February 10, 2019

Day #208 - OpenAI - Spinning Up in Deep RL Workshop - Part 1

Key Lessons
  • AGI - Artificial General Intelligence
  • Do Most Economically Valuable work
  • Deep Reinforcement Learning trains Deep Networks with Trial and Error
  • Function approximators - Deep Networks 


Reinforcement Learning
  • Good for Sequential Learning
  • Good when we do not know optimal behavior
  • RL is useful when evaluating behaviors is easier than generating them
Deep Learning
  • Good for High Dimensional Data
  • Approximate a function
Deep RL
  • Video Games
  • For Decision Rules 


Recap of DL Patterns
  • Finding a model that gives the right output for given inputs
  • The output of each layer is a re-arrangement of its input with a non-linearity applied
  • The loss function is differentiable with respect to the model parameters
  • Compute how the loss changes with respect to changes in parameters
  • Function composition is the core of the model
  • Function topology varies across architectures
  • The non-linearity does a lot of the work
  • Successive layers represent more complex features
  • LSTM (RNN) - Accepts a time series of inputs and produces a time series of outputs
  • Transformer - Allows the network to apply attention over several inputs
  • Attention Neural Networks - Select the most meaningful details from data, make decisions based on a lot of data
  • Regularizers - Trade off the loss against something that is not dependent on the task; they do a better job at generalization
  • Adaptive Optimizers




Formulate RL Problem
  • Agent that interacts with environment
  • Agent picks and executes action
  • With the new environment state, the agent proceeds further
  • What decision maximizes rewards
  • Attains the goal with Trial and Error

Observations And Actions
  • Observations are continuous
  • Actions may be discrete or continuous

Policy
  • Randomly (Stochastic)
  • Deterministic (Map directly with no randomness)
  • Randomness is helpful
  • Logits (scores that define the probability of each action)
  • Action probabilities come from a softmax over the logits
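A minimal sketch of this idea (my own illustration, with made-up logits and plain numpy): a softmax turns the logits into action probabilities and the stochastic policy samples an action from them.

import numpy as np
logits = np.array([2.0, 0.5, -1.0])               #hypothetical network outputs for 3 actions
probs = np.exp(logits - logits.max())             #softmax, shifted for numerical stability
probs /= probs.sum()
action = np.random.choice(len(probs), p=probs)    #stochastic policy: sample an action
print(probs, action)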
Trajectory - Sequence of states and actions in an environment

Reward function - Measures how good or bad things are; more positive is better


Value functions - How much reward we expect to get
Value functions satisfy the Bellman equation
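A minimal sketch of the Bellman equation for a fixed policy, V = R + γ P V, on a made-up two-state problem (the transition probabilities and rewards below are arbitrary):

import numpy as np
gamma = 0.9
P = np.array([[0.8, 0.2],                         #hypothetical transition matrix P[s, s'] under the policy
              [0.3, 0.7]])
R = np.array([1.0, 0.0])                          #hypothetical expected reward per state
V = np.zeros(2)
for _ in range(100):                              #repeated Bellman backups converge to the value function
    V = R + gamma * P @ V
print(V)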


Types of RL Algos
Model-based (learn a model of the environment) and model-free methods


Try - Evaluate - Improve the policy
Policy Optimization


  • Run the policy for complete trajectories
  • Represent policy with Neural Network
Derive Policy Gradient
  • The policy parameters appear inside the trajectory distribution
  • Bring the gradient inside the expectation
  • The expectation is over trajectories

  • Starting state drawn from some distribution
  • Markov property: the next state depends only on the current state, not on previous states

  • Every action will get some update
  • Reward to go Policy Gradient


  • Advantage-function form
  • How much better action is than average


  • N-Step Advantage Estimates
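A minimal sketch of reward-to-go returns and a simple advantage estimate (made-up rewards, plain numpy; in practice a learned value function serves as the baseline):

import numpy as np
rewards = np.array([1.0, 0.0, 2.0, 1.0])          #hypothetical rewards along one trajectory
gamma = 0.99
rtg = np.zeros_like(rewards)                      #reward-to-go: discounted sum of future rewards from each step
running = 0.0
for t in reversed(range(len(rewards))):
    running = rewards[t] + gamma * running
    rtg[t] = running
baseline = rtg.mean()                             #crude baseline
advantages = rtg - baseline                       #how much better each step was than average
print(rtg, advantages)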



  • Initial assignments (weights) do matter while setting up the system
Next is Part II


Happy Mastering DL!!!

February 07, 2019

Startup Idea - Incentivised Learning for Kids - Paid Online Courses

Taking a cue from cabs and food delivery, we can apply the same approach to online classrooms / teaching kids / MOOC courses.

Students buy a course; a certain portion (X%) of the fee is used to surprise them with desserts / a good meal / toys, or something that motivates them on a daily basis. "If I complete this I might get an ice cream, so I will do this." (A little bribe to read.)

Setting up a Customer Base (Cabs) - OLA and Uber rides initially had heavy discounts for users and incentives for cab drivers. I remember applying coupons heavily and the satisfaction of saving X% on every ride.
Stabilization Phase - Since June 2018, I haven't seen any more offers after they achieved their user base.
Setting up a Customer Base (Food Delivery) - Since October / November 2018 I have observed the same trend of offers in food delivery. I have mostly observed/used UberEats.

It's human psychology to go for offers / feel happy with the x% savings from the offer.

Learning (Incentivised Approach)
Taking the same idea and applying it to the learning methodology: every learning course can be evaluated based on
  • Consistency in attending classes, Consistency of Learning
  • Solving problems, Concept Understanding
  • Explaining the concept in own terminology 
  • Using AI to evaluate theoretical/experimental aspects of knowledge
  • Grade them against themselves; instead of a comparative grade, provide individual progress from time to time based on collected data
  • Based on progress provide incentive points
  • These incentive points can be claimed with a special meal/toy something for that day
Call up and follow up when they are not participating regularly. These personalized reminders will also give them the motivation to continue the course. The scheme of offers/promotions has a psychological impact at the individual level; the same concept can be used to motivate kids in their learning and provide more personalized incentives so they stay committed and encouraged.

Everything in life is connected through multiple aspects of actions, perspectives, and thoughts. I hope this idea is implemented in learning apps to make them more encouraging for kids. Present marks as something they are currently good at in a particular subject; do not give the impression that good scores mean they know everything.

All these courses should aid in creating long-term learning interest and a consistent learning / creative thinking / experimenting mindset.

Happy Mastering DL!!!

February 06, 2019

Day #207 - Dimensionality Reduction Notes

SVD - The sum of the squares of the singular values should be equal to the total variance in A

Matrix A can be expressed as
A = U S V^T

U,V - Orthogonal
U - Left Singular Vector
V - Right Singular Vector

A is an m × n matrix
U is an m × n matrix with orthonormal columns
S is an n × n diagonal matrix
V is an n × n orthogonal matrix

Since an m × n matrix with m > n has only n singular values, SVD lets us work with those n singular values instead of the full m × m problem.
  • Dimensionality reduction is done by neglecting small singular values in the diagonal matrix S
  • Feature of dimensionality reduction is only exploited in the decomposed version
Output -  Storing the truncated forms of U, S, and V in place of A

Reference - Link

Eigen Vectors
  • Satisfy A v = λ v (the matrix times an eigenvector equals the eigenvalue times that eigenvector)
  • Certain directions only get stretched; they don't change direction
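A quick numpy check of that relation on an illustrative matrix:

import numpy as np
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(A)
v = eigvecs[:, 0]                                 #eigenvector corresponding to eigvals[0]
print(np.allclose(A @ v, eigvals[0] * v))         #A v equals lambda v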
Linear Dimensionality Reduction (PCA, SVD)
  • High Dimensional Data (Images, Text, Vector of Stock Data)
  • Describe the data with only few values
#https://gist.github.com/addisonhuddy/8a9e682259c9dca1f61672b4027863dc
import numpy as np
a = np.array([[1,1,1,0,2],[2,1,3,5,0],[1,3,5,6,2],[1,3,5,6,9],[2,3,4,5,6]])
#set printing options
np.set_printoptions(suppress=True)
np.set_printoptions(precision=3)
print('FULL')
U,S,Vt = np.linalg.svd(a,full_matrices=True)
print('U')
print(U)
print('S')
print(S)
print('Vt')
print(Vt)
print('Reduced - Ignore small values')
U,S,Vt = np.linalg.svd(a,full_matrices=False)
print('U')
print(U)
print('S')
print(S)
print('Vt')
print(Vt)
from sklearn.decomposition import PCA
from sklearn.decomposition import TruncatedSVD
pca = PCA(n_components=2)
pca.fit(a)
a_transformed = pca.transform(a)
print('pca')
print(a_transformed)
print(pca.explained_variance_)


How Many Singular Values Should We Retain? - A useful rule of thumb is to retain enough singular values to make up 90% of the energy in Σ, Link
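A small sketch of that rule of thumb on the same matrix `a` from the snippet above: keep the smallest k whose squared singular values cover 90% of the energy, then store the truncated U, S, Vt.

import numpy as np
a = np.array([[1,1,1,0,2],[2,1,3,5,0],[1,3,5,6,2],[1,3,5,6,9],[2,3,4,5,6]])
U,S,Vt = np.linalg.svd(a, full_matrices=False)
energy = np.cumsum(S**2) / np.sum(S**2)           #fraction of total energy covered by the first k values
k = int(np.searchsorted(energy, 0.90)) + 1        #smallest k reaching 90%
a_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]       #rank-k approximation stored via truncated U, S, Vt
print(k)
print(a_k)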

SVD - (Application in NLP) - Latent Semantic Analysis Notes
  • LSA applies singular value decomposition (SVD) to the matrix
  • In SVD, a rectangular matrix is decomposed into the product of three other matrices
  • One component matrix describes the original row entities as vectors of derived orthogonal factor values
  • Another describes the original column entities in the same way
  • Third is a diagonal matrix containing scaling values such that when the three components are matrix-multiplied, the original matrix is reconstructed
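A minimal LSA-style sketch (illustrative only; the tiny document-term count matrix below is made up) using scikit-learn's TruncatedSVD, which the snippet above already imports:

import numpy as np
from sklearn.decomposition import TruncatedSVD
#rows = documents, columns = term counts (hypothetical 4 documents x 6 terms)
X = np.array([[2, 1, 0, 0, 1, 0],
              [1, 2, 1, 0, 0, 0],
              [0, 0, 1, 2, 1, 1],
              [0, 0, 0, 1, 2, 2]])
svd = TruncatedSVD(n_components=2)                #keep 2 latent factors ("topics")
doc_vectors = svd.fit_transform(X)                #documents as vectors of derived factor values
print(doc_vectors)
print(svd.components_)                            #terms described in the same latent space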
LDA
  • The Dirichlet distribution takes a number (called alpha in most places) for each topic (or category)
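A tiny numpy illustration of alpha (the values are arbitrary): small alpha gives topic mixtures concentrated on a few topics, large alpha gives near-uniform mixtures.

import numpy as np
print(np.random.dirichlet(alpha=[0.1]*5, size=3))   #sparse-looking topic mixtures
print(np.random.dirichlet(alpha=[10.0]*5, size=3))  #near-uniform topic mixtures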
More Read - Link

Happy Mastering DL!!!

February 05, 2019

Day#206 - AI for Social Cause

A few more ideas based on recent reads / patterns
  • Segment / Predict crime network based on telephone signals, vehicle movements, face recognition, card transaction, crime activity
  • Auto-detect vehicle details from Video for violating Traffic Patterns
  • Predict child learning issues with AI - attention, focus, writing, reading, interpretation and micro skills
  • Spot early depression signs based on activity patterns
  • Predict drought based on patterns
  • Map missing children with Face similar Search global database, Predict child trafficking
  • AI for post medication follow-up and drop out prediction
  • AI for course study, drop out prediction and follow up
  • Early intervention to detect / prevent obesity / diabetes
AI for Social Cause, AI for better humanity - Found this talk interesting 
Key Lessons
  • Direct Advances for Society Benefit
  • Health, Safety and Wildlife Conservation
  • Optimize resources
  • Wildlife - Use past poaching incidents to predict future ones
  • Health - Homeless shelters, influence maximization, awareness of HIV, TB, obesity, health challenges
Safety and Security
Case #1 - Schedule Checkpoints and Patrols in Airport
  • Game Theory for Security Resource Optimization
  • Stackelberg Security Games Model
  • Randomness / Deterministic
  • Defender commits to randomized strategy
  • Probability at different points at time
  • These samples are used to generate schedule
  • Randomized checkpoints, detections
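A toy sketch (not from the talk) of committing to a mixed strategy and sampling a randomized schedule from it; the checkpoint names and coverage probabilities are made up:

import numpy as np
checkpoints = ['gate_A', 'gate_B', 'cargo', 'terminal']
coverage_probs = [0.4, 0.3, 0.2, 0.1]               #hypothetical mixed strategy from the game solver
schedule = np.random.choice(checkpoints, size=7, p=coverage_probs)   #one sampled week of checkpoints
print(schedule)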





Case #2 - Assign Air Marshals to Flights
  • Assigning marshals to flights
  • Support set size is small
  • Solve the Game Matrix
  • Incremental Strategy generation
  • Randomization of Scheduling


Case #3 - Patrols using Graphs
  • Different ways of patrol boat movements
  • Optimize and Schedule for patrol
Conservation / Wildlife
  • Snares and traps to kill animals
  • Divide into grid squares
  • Mixed-integer programming to generate patrols
  • Multiple boundedly rational poachers
  • Learn responses based on past poaching data
  • Finding a missing item in a grid cell
  • Ensemble of classifiers to predict
  • Classify into high risk / low risk areas based on historical data
  • Strategic signalling
  • Optimal deceptive signalling
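A hedged sketch of the high-risk / low-risk prediction idea above, with made-up grid-cell features and labels (the notes mention ensembles such as random forests):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
#hypothetical grid-cell features: distance to road, animal density, past patrol count
X = np.random.rand(200, 3)
y = (X[:, 1] > 0.6).astype(int)                     #made-up labels: high animal density ~ high poaching risk
clf = RandomForestClassifier(n_estimators=50).fit(X, y)
print(clf.predict_proba(X[:5]))                     #predicted risk for the first few grid cells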




Health
  • Awareness to reduce rates of HIV
  • Peer Leader Campaign
  • Peer Leaders
  • Homeless Shelters
Prevent TB in India
  • Low resource community
  • Non-adherence of TB treatment
  • Digital Adherence tracking technology
  • Calling and reminding
  • Predict adherence from calling patterns
  • Everwell 
  • Predict High-Risk using SVM / RF Algo
  • Mixed Strategy (Randomization with multiple predictors)
  • Decision Focused Learning
Prevent Suicides
  • Choose K gatekeepers
  • Solving this game
More Reads - http://teamcore.usc.edu/lecture.htm


IAAI Robert S. Engelmore Award Lecture: Milind Tambe (USC) - AI and Multiagent Systems for Social Good (AAAI Livestreaming on Vimeo).

Happy Mastering DL!!!