"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

August 31, 2019

Day #273 - Learnings from Program Errors - face_recognition

Lesson #1 - axis 1 is out of bounds for array of dimension 1
While working on face comparisons, I hit this error when calling the face_recognition.compare_faces method.

This link was useful https://github.com/ageitgey/face_recognition/issues/328

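bad one: face_recognition.compare_faces(encoding,person_encoding)
good one: face_recognition.compare_faces([encoding],person_encoding)
The first argument must be a list of known encodings, so a single encoding has to be wrapped in a list.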

Lesson #2 - "RuntimeError: dictionary changed size during iteration"
Reason - The dictionary was modified (keys added or removed) while iterating over its elements
Fix - Collect the changes during the loop and apply them to the dictionary outside the loop
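
A minimal sketch of Lesson #2, assuming a made-up dictionary of names and match scores (the data and variable names are hypothetical, not from the original program). It first reproduces the error, then applies the fix of modifying the dictionary outside the loop.

```python
# Hypothetical data: names mapped to match scores
scores = {"alice": 0.91, "bob": 0.42, "carol": 0.37}

# Deleting a key inside the loop triggers the RuntimeError
error_message = None
try:
    for name in scores:
        if scores[name] < 0.5:
            del scores[name]
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# Fix: note the keys to remove while looping, delete them after the loop
scores = {"alice": 0.91, "bob": 0.42, "carol": 0.37}
to_remove = [name for name, score in scores.items() if score < 0.5]
for name in to_remove:
    del scores[name]
print(scores)
```

Iterating over a copy of the keys, for example list(scores), is another common way to get the same result.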

Many simple design choices and the end-to-end thinking keep changing while building the solution.

Datasets - Link

Happy Learning!!!

August 30, 2019

Day #272 - Clustering to find Data Insights

For different kinds of data, we need to pick the right columns to find the right insights. Most of them can be picked with domain knowledge and by looking at the data from a clustering perspective.
  • For Sales Data, the clustering intent was to find sales insights. The columns used were CustomerId, NumberOfOrders, TotalOrderValue. Clustered them to find order value buckets: High Value, Medium, Low Value.
  • For Loss in Retail Store, the clustering intent was to find loss patterns. The columns used were SkuId, LossCount, LossValue. Clustered them into High Loss, Medium, Low Loss buckets.
This helps to identify the key loss items and focus on proactive measures to prevent further loss. I picked up R again after a long time and had forgotten the execution command: Ctrl+A, Ctrl+Enter. Every editor, tool, and language has its own patterns and coding formats.
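
The original work was done in R; below is only a rough Python sketch of the bucketing idea, assuming a plain 1-D k-means on a single column. The customer ids, the order values, and the helper names kmeans_1d and label_bucket are all made up for illustration.

```python
def kmeans_1d(values, k=3, iterations=20):
    """Plain 1-D k-means; returns the cluster centres sorted from low to high."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres across the sorted values
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            groups[nearest].append(v)
        # Keep the old centre when a group ends up empty
        centres = [sum(g) / len(g) if g else centres[i] for i, g in enumerate(groups)]
    return sorted(centres)

def label_bucket(value, centres, labels=("Low Value", "Medium", "High Value")):
    """Name the bucket whose centre is closest to the value."""
    nearest = min(range(len(centres)), key=lambda c: abs(value - centres[c]))
    return labels[nearest]

# Hypothetical TotalOrderValue per CustomerId
total_order_value = {"C1": 120, "C2": 150, "C3": 900, "C4": 950, "C5": 5000, "C6": 5200}
centres = kmeans_1d(list(total_order_value.values()))
buckets = {cid: label_bucket(v, centres) for cid, v in total_order_value.items()}
print(centres)
print(buckets)
```

The same sketch applies to the loss data by swapping TotalOrderValue for LossValue and renaming the labels. Clustering on several columns at once would need a multi-dimensional distance instead of abs.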

This is the same as 'Recency, Frequency and Monetary value'. I came to know about this today (7/6/2020). Sometimes what you already implemented might be already done in some pattern :)

Recency, Frequency, Monetary Model with Python — and how Sephora uses it to optimize their Google and Facebook Ads
Find Your Best Customers with Customer Segmentation in Python
Introduction to Customer Segmentation in Python

Happy Learning!!!

August 29, 2019

Day #271 - FAISS Experiments

Examples of basic indexes, binary indexes, and composite indexes in FAISS
  • Exact Results - Flat index - an index that guarantees exact results. Types - IndexFlatL2 or IndexFlatIP
  • Approximate Match - Cluster the vectors, then store them in "Flat" buckets (for example IndexIVFFlat)
  • conda install on Ubuntu. The version that works is - conda install faiss-cpu=1.5.1 -c pytorch -y
  • A similar project to store and fetch data - https://github.com/waltyou/faiss-web-service 
#Example #1
#database_size has to be at least nlist (100). With only 10 points training fails with: Number of training points (10) should be at least as large as number of clusters (100)
#Sample code to check on dataset, query and find neighbours
import numpy as np
dimensions = 128
database_size = 1000
queries = 1
np.random.seed(1234)
generateddata = np.random.random((database_size,dimensions)).astype('float32')
queriesdata = np.random.random((queries,dimensions)).astype('float32')
#n,d = generateddata.shape
#print(n)
#print(d)
import faiss
#nlist is number of division units
nlist = 100
#Number of neighbours to check
k = 4
quantizer = faiss.IndexFlatL2(dimensions)
index = faiss.IndexIVFFlat(quantizer,dimensions,nlist,faiss.METRIC_L2)
#Need both train & add method, without it errors
index.train(generateddata)
index.add(generateddata)
scores, neighbours = index.search(queriesdata,k)
print(scores)
print(neighbours)
for numbers in neighbours:
    for number in numbers:
        print(number)
        print(generateddata[number])
#Example 2
#Ref - https://gist.github.com/mdouze/8e47d8a5f28280df7de7841f8d77048d
import numpy as np
import faiss
numberofqueries = 1
numberofdataentries = 1000
dimensions = 32
querydata = np.random.rand(numberofqueries,dimensions).astype('float32')
dataentries = np.random.rand(numberofdataentries,dimensions).astype('float32')
index = faiss.IndexFlatL2(dimensions)
index.add(dataentries)
Dref, Iref = index.search(querydata,10)
print(Dref)
print(Iref)
#search knn
More Reads Link1, Link2
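
To see what an exact flat index computes, here is a small pure-Python sketch of brute-force nearest-neighbour search. It is not FAISS code. The function name exact_knn and the toy data are assumptions for illustration. It returns squared L2 distances, which is also what IndexFlatL2 reports.

```python
import random

def exact_knn(database, query, k):
    """Return (squared L2 distances, indices) of the k nearest database vectors."""
    scored = []
    for idx, vec in enumerate(database):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    # Sorting every distance is what makes the result exact (and slow at scale)
    scored.sort()
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]

random.seed(1234)
dimensions = 8
database = [[random.random() for _ in range(dimensions)] for _ in range(100)]
# Query with a vector that is already stored, so the nearest hit is itself
query = database[42]

distances, neighbours = exact_knn(database, query, k=4)
print(distances)
print(neighbours)
```

An IVF index avoids scanning every vector by searching only the clusters closest to the query, which is why its results are approximate.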

Happy Learning!!!

August 28, 2019

Day #270 - Installing FAISS on Ubuntu


#1. Download project https://github.com/facebookresearch/faiss
#2. Install faiss - conda install -c pytorch faiss-cpu (Details - https://github.com/facebookresearch/faiss/blob/master/INSTALL.md )
#. Crash happens in current version. Filed issue in https://github.com/facebookresearch/faiss/issues/928
#3. To Fix - conda install faiss-cpu=1.5.1 -c pytorch -y
#Demo Code
import sys
import faiss
import numpy as np
index = faiss.IndexFlatL2(64)
print(index.is_trained)
Happy Learning!!!

August 26, 2019

Day #269 - Age and Gender Estimation from Face


#pip install py-agender
#https://github.com/aristofun/py-agender
#datasets - http://www.cvpapers.com/datasets.html
import cv2
print(cv2.__version__)
from pyagender import PyAgender
agender = PyAgender()
faces = agender.detect_genders_ages(cv2.imread(r'E:\\face_144.jpg'))
#gender > 0.5 == female
print(faces)
for data in faces:
    for key in data:
        if(key=='gender' or key == 'age'):
            if key=='gender':
                if data[key] <0.5:
                    print(key)
                    print('Male')
                else:
                    print(key)
                    print('Female')
            if key=='age':
                print(key)
                print(data[key])
Happy Learning!!!

August 18, 2019

Completing one Training session of Deep Learning

Today I completed my 10th class of Deep Learning. It was a fantastic learning experience, with good discussions with the participants. I am very thankful to the students for the healthy discussions and sessions.

Learning never ends. The consolidated feedback looks like this:
  • Total feedback responses - 25
  • Excellent - 16
  • Good - 4
  • Average - 2
  • Bad - 3
Keep Learning. Keep Going!!!!

Day #268 - Techniques for unique vehicle identification

  • Count cars as they pass the mid-point (Approach #1 – Experimented with SQL)
  • Take the bounding boxes from two frames and check how closely the content of the matching regions agrees. This can be used to remove duplicate cars
  • Generate a feature vector for the area of intersection and validate against the frame to check whether it is similar
Advanced methods
  • KNN approach – Experiment with KNN on the data points to classify an object as new or existing
  • LSTM based tracking – Model the frames as sequences to classify whether an object is new or existing. It looks possible but needs a bit more analysis
  • Siamese network for similarity of objects
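
A rough sketch of the bounding-box comparison between two frames. It uses intersection over union (IoU), which measures geometric overlap only and is a simpler stand-in for the content similarity check described above. The box format (left, top, right, bottom), the 0.5 threshold, and the function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    intersection = max(0, right - left) * max(0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union else 0.0

def count_new_vehicles(previous_boxes, current_boxes, threshold=0.5):
    """Count current boxes that do not overlap enough with any previous box."""
    return sum(
        1 for box in current_boxes
        if all(iou(box, old) < threshold for old in previous_boxes)
    )

# Hypothetical detections: one car moved slightly, one car is new
frame1 = [(100, 100, 200, 180)]
frame2 = [(110, 100, 210, 180), (400, 120, 480, 190)]
overlap = iou(frame1[0], frame2[0])
new_count = count_new_vehicles(frame1, frame2)
print(overlap)
print(new_count)
```

Overlap alone fails when cars move fast or occlude each other, which is where the feature-vector and Siamese approaches listed above would come in.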
Happy Learning!!!

August 14, 2019

Day #267 - Get Historical Weather Data

Fetch Historical weather data information to correlate with findings


#pip install wwo-hist
#https://www.worldweatheronline.com/developer/
from wwo_hist import retrieve_hist_data
#fetch for every day (24 hours Window)
frequency=24
start_date = '01-JANUARY-2019'
end_date = '10-AUGUST-2019'
api_key = 'XYZABC'
location_list = ['new-york']
hist_weather_data = retrieve_hist_data(api_key,location_list,start_date,end_date,frequency,location_label = False,export_csv = True, store_df = True)
Happy Learning!!!

August 07, 2019

Day #266 - Face Recognition Examples - Package - face_recognition

Use Cases
1. Extract Face Region
2. Search for Face Match



Deep face recognition with Keras, Dlib and OpenCV

Key Research Papers - Face Recognition

#pip install face_recognition
#https://github.com/zeynepCankara/Face-Recognition-Tensorflow/blob/master/Face_Rec_System.ipynb
import os
import face_recognition
import cv2
#Get All Images in directory
images = os.listdir(r'E:\\FaceFeatures\Test')
print(images)
#Extract Area of Face
def ExtractFaceRegion(imagefilepath):
    image = cv2.imread(imagefilepath)
    # Convert it from BGR to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # detect faces in the image and get their locations as (top, right, bottom, left)
    boxes = face_recognition.face_locations(image, model='hog')
    print(type(boxes))
    top = boxes[0][0]
    right = boxes[0][1]
    bottom = boxes[0][2]
    left = boxes[0][3]
    # draw a box and a label around the first detected face
    cv2.rectangle(image, (left, top), (right, bottom), (0, 255, 0), 2)
    y= top - 15 if top - 15 > 15 else top + 15
    cv2.putText(image, "Result", (left, y), cv2.FONT_ITALIC,0.75, (0, 255, 0), 2)
    # cv2.imshow expects BGR, so convert back before displaying
    cv2.imshow("Highlighted Image", cv2.cvtColor(image, cv2.COLOR_RGB2BGR))
    cv2.waitKey(0)
    crop = image[top:bottom, left:right]
    cv2.imshow("Cropped Image", cv2.cvtColor(crop, cv2.COLOR_RGB2BGR))
    cv2.waitKey(0)
    encoding = face_recognition.face_encodings(image, boxes)
    print(encoding)
    return
def SearchforMatch(imagefilepath):
    imagesearch = face_recognition.load_image_file(imagefilepath)
    imagesearchencoded = face_recognition.face_encodings(imagesearch)[0]
    # Compare the search face against the first face found in every image
    for imagename in images:
        current_image = face_recognition.load_image_file(r'E:\\FaceFeatures\Test\\'+imagename)
        current_image_encoded = face_recognition.face_encodings(current_image)[0]
        result = face_recognition.compare_faces([imagesearchencoded],current_image_encoded)
        if result[0]:
            print("Matched:"+imagename)
        else:
            print("Not Matched:"+imagename)
    return
imagefilepath = r'E:\\FaceFeatures\Test\1.jpg'
ExtractFaceRegion(imagefilepath)
SearchforMatch(imagefilepath)
cv2.destroyAllWindows()
#Example 2
#pip install mtcnn
#https://github.com/ipazc/mtcnn
from mtcnn import MTCNN
import cv2
img = cv2.cvtColor(cv2.imread(r"E:\Siva_Learning_Blog_Posts\Face1.jpg"),cv2.COLOR_BGR2RGB)
detector = MTCNN()
#Run detection once and reuse the result
result = detector.detect_faces(img)
bounding_box = result[0]['box']
keypoints = result[0]['keypoints']
cv2.rectangle(img,
(bounding_box[0], bounding_box[1]),
(bounding_box[0]+bounding_box[2], bounding_box[1] + bounding_box[3]),
(0,155,255),
2)
cv2.circle(img,(keypoints['left_eye']), 2, (0,155,255), 2)
cv2.circle(img,(keypoints['right_eye']), 2, (0,155,255), 2)
cv2.circle(img,(keypoints['nose']), 2, (0,155,255), 2)
cv2.circle(img,(keypoints['mouth_left']), 2, (0,155,255), 2)
cv2.circle(img,(keypoints['mouth_right']), 2, (0,155,255), 2)
cv2.imwrite(r'E:\Siva_Learning_Blog_Posts\Result.jpg', cv2.cvtColor(img, cv2.COLOR_RGB2BGR))
print(result)

Happy Learning!!!

August 06, 2019

Thinking Efficiency

Metrics to measure and improve your thinking

Number of creative ideas
Number of newer perspectives 
Number of simple solutions
Number of similar ideas analyzed
Communicate and get feedback on ideas
The complete end to end clarity of implementation 
The visualization of implementation in the end-user perspective
Explore and build incrementally

Happy Learning and Teaching!!!

Day #265 - Python Arrays & Version Check of TF,OpenCV


import tensorflow as tf
# 25 data points
# Each point with 32 random elements in it
z = tf.random.normal([5,5,1,32])
#5 x 5 matrix
#Each 1 x 32
sess = tf.Session()
result = sess.run(z)
print(result.shape)
z = tf.random.normal([2,2,1,3])
sess = tf.Session()
result = sess.run(z)
print(result.shape)
print(result)
import numpy as np
#Reshape 100 elements into 10 batches of 2 x 5
a = np.arange(100).reshape((-1,10,2,5))
print(a)
#Reshape 100 elements into 5 batches of 1 x 20
a = np.arange(100).reshape((-1,5,1,20))
print(a)
#Reshape 100 elements into 20 batches of 1 x 5
a = np.arange(100).reshape((-1,20,1,5))
print(a)
#Reshape into 20 batches of 1 x 5, then flatten back into a single row of 100 elements
a = np.arange(100).reshape((-1,20,1,5))
b = a.reshape(-1,100)
print(b)
#Versions check - Python
import sys
print("python environment " + str(sys.version_info))
#OpenCV version
import cv2
print("OpenCV version " + str(cv2.__version__))
#Tensorflow version
import tensorflow as tf
print("Tensorflow version " + str(tf.__version__))
#pip freeze
Happy Learning!!!

August 02, 2019

Probability Notes and Concepts


  • Prior - a distribution that incorporates your subjective beliefs about a parameter before seeing the data
  • Posterior - the updated distribution of the parameter after combining the prior with the observed data
  • Likelihood - how 'likely' the observed data is for a given value of the parameter
  • Bayesian Linear Regression - The response, y, is not estimated as a single value, but is assumed to be drawn from a probability distribution. The goal is to determine the posterior distribution for the model parameters
  • Latent variables (as opposed to observable variables) are variables that are not directly observed but are rather inferred (through a mathematical model)
  • A Bernoulli random variable has two possible outcomes: 0 or 1. A binomial random variable is the sum of n independent and identically distributed Bernoulli random variables
  • Poisson distribution - the number of events occurring in a fixed interval of time or space, if these events occur with a known constant rate and independently of the time since the last event
  • Negative binomial distribution - describes the number of successes k observed before the r-th failure
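
A small worked sketch tying prior, likelihood, and posterior together, assuming a Beta prior on a Bernoulli success probability (the Beta distribution is the conjugate prior here, so the update is just adding counts). The prior values and the 7 successes in 10 trials are made-up numbers.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Likelihood of k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def beta_posterior(alpha, beta, successes, failures):
    """Beta(alpha, beta) prior plus observed counts gives a Beta posterior."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Prior belief: the success probability is around 0.5, held weakly
prior = (2, 2)
# Observed data: 7 successes and 3 failures
posterior = beta_posterior(*prior, successes=7, failures=3)
likelihood_at_half = binomial_pmf(7, 10, 0.5)

print(beta_mean(*prior))
print(posterior, beta_mean(*posterior))
print(likelihood_at_half)
```

The posterior mean moves from the prior's 0.5 towards the observed rate of 0.7, and a stronger prior (larger alpha and beta) would move less.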
Happy Learning!!!

August 01, 2019

Day #263 - Reviewing Interesting Deep Learning Projects

For days I haven't been able to catch up on my technical debts. Too many things are running through my mind. I feel my time is shrinking while the list of things to accomplish keeps growing. Hoping to prioritize and be productive in the coming days.

A ton of projects are listed in the cs231 reports. They are a great way to understand applications. I am reviewing a few of them every day.

Project #1 - ConvNets for Intelligent Baby Monitoring
Poster - Link 

This was my dream product idea. A year back we were discussing the same. They have done a fantastic job in analyzing the idea end to end.

Summary 
Verified on different architectures - ResNet18[1], AlexNet[10] and SqueezeNet[6]. Different camera positions were evaluated.

Challenges
  • Highly unbalanced dataset
  • Different camera viewpoints, different rooms/cribs, different babies, different lighting conditions and different toys inside the crib
Motion detection techniques
  • Image subtraction
  • Optical flow
  • Temporal difference method
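
A toy sketch of the image subtraction idea from the list above, using plain nested lists as grayscale frames. This is not the project's implementation. The function name motion_score, the threshold of 25, and the 4 x 4 frames are assumptions for illustration.

```python
def motion_score(previous_frame, current_frame, pixel_threshold=25):
    """Fraction of pixels whose grayscale value changed by more than pixel_threshold."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(previous_frame, current_frame):
        for prev_pixel, curr_pixel in zip(prev_row, curr_row):
            total += 1
            if abs(curr_pixel - prev_pixel) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0

# Hypothetical 4 x 4 grayscale frames: a bright 2 x 2 patch appears in the second
frame_a = [[10] * 4 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
for r in range(2):
    for c in range(2):
        frame_b[r][c] = 200

print(motion_score(frame_a, frame_a))
print(motion_score(frame_a, frame_b))
```

With real camera frames the same subtraction is usually done on whole arrays at once, and lighting changes make the threshold harder to choose.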
Manually classify into 5 classes
  • Caretaker
  • Empty Crib
  • Sitting Up
  • Lying Down 
  • Standing Up
Great Idea and well-implemented!!!

Happy Learning!!!