Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): May 2018

May 31, 2018

Day #111 - OpenCV3 Feature Matching

On Windows, the following packages were installed:

pip install opencv-python
pip install opencv-contrib-python

	#pip install opencv-python --upgrade
	#pip install opencv-contrib-python --upgrade
	#pip show opencv-python
	#pip show opencv-contrib-python
	#pip install opencv-contrib-python==3.3.0.9

	import numpy as np
	import cv2
	import matplotlib.pyplot as plt
	img1 = cv2.imread('B4.jpg',0) # queryImage
	img2 = cv2.imread('SKU.jpg',0) # trainImage

	sift = cv2.xfeatures2d.SIFT_create()

	# find the keypoints and descriptors with SIFT
	kp1, des1 = sift.detectAndCompute(img1,None)
	kp2, des2 = sift.detectAndCompute(img2,None)

	# BFMatcher with default params
	bf = cv2.BFMatcher()
	matches = bf.knnMatch(des1,des2, k=2)
	# Apply ratio test
	good = []
	for m,n in matches:
	    if m.distance < 0.7*n.distance:
	        good.append([m])

	img3 = cv2.drawMatchesKnn(img1,kp1,img2,kp2,good,None,flags=2)
	plt.imshow(img3),plt.show()

	print('Number of matches - SIFT')
	print(len(good))

	# FLANN parameters
	FLANN_INDEX_KDTREE = 1
	index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
	search_params = dict(checks=100) # or pass empty dictionary
	flann = cv2.FlannBasedMatcher(index_params,search_params)
	matches = flann.knnMatch(des1,des2,k=2)

	# Need to draw only good matches, so create a mask
	matchesMask = [[0,0] for i in range(len(matches))]

	# ratio test as per Lowe's paper
	for i,(m,n) in enumerate(matches):
	    if m.distance < 0.7*n.distance:
	        matchesMask[i]=[1,0]

	draw_params = dict(matchColor = (0,255,0),
	singlePointColor = (255,0,0),
	matchesMask = matchesMask,
	flags = 0)

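	print('Number of matches - FLANN')
	# Count the matches that passed the ratio test (len(draw_params) would only count the dict keys)
	print(sum(mask[0] for mask in matchesMask))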

	img3 = cv2.drawMatchesKnn(img1,kp1,img2,kp2,matches,None,**draw_params)
	#plt.imshow(img3,),plt.show()

	# Initiate ORB detector
	orb = cv2.ORB_create()

	# find the keypoints and descriptors with ORB
	kp1, des1 = orb.detectAndCompute(img1,None)
	kp2, des2 = orb.detectAndCompute(img2,None)

	# create BFMatcher object
	bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
	# Match descriptors.
	matches = bf.match(des1,des2)

	# Sort them in the order of their distance.
	matches = sorted(matches, key = lambda x:x.distance)

	print('Number of matches - ORB')
	print(len(matches))

	#Modified Original code with Minor Changes
	#https://docs.opencv.org/3.3.0/dc/dc3/tutorial_py_matcher.html
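
The ratio test used with BFMatcher and FLANN above can be illustrated without OpenCV. The following is a minimal NumPy sketch, not the OpenCV implementation: `ratio_test_matches` is a hypothetical helper, the 2-D descriptors are made up for illustration (SIFT descriptors are 128-D), and it assumes at least two train descriptors and a Euclidean distance.

```python
import numpy as np

def ratio_test_matches(des1, des2, ratio=0.7):
    """Return (query_idx, train_idx) pairs that pass Lowe's ratio test."""
    des1 = np.asarray(des1, dtype=np.float64)
    des2 = np.asarray(des2, dtype=np.float64)
    # Pairwise Euclidean distances, shape (len(des1), len(des2)).
    dists = np.linalg.norm(des1[:, None, :] - des2[None, :, :], axis=2)
    # Indices of the two nearest train descriptors for each query descriptor.
    nearest = np.argsort(dists, axis=1)[:, :2]
    good = []
    for i, (best, second) in enumerate(nearest):
        # Keep a match only if the best distance is clearly smaller than the second best.
        if dists[i, best] < ratio * dists[i, second]:
            good.append((i, int(best)))
    return good

# Toy descriptors (assumed values, for illustration only).
query = np.array([[0.0, 0.0], [5.0, 5.0]])
train = np.array([[0.1, 0.0], [4.0, 4.0], [6.0, 6.0]])
print(ratio_test_matches(query, train))  # [(0, 0)]
```

The second query point is equally close to two train points, so it is ambiguous and the ratio test drops it. That is the kind of match the 0.7 threshold is meant to filter out.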


Happy Learning!!!

May 30, 2018

Day #110 - Image Processing - Line Counting from Images

Learnings from recent work on images and texture: identifying the line count along the vertical and horizontal axes. OpenCV was useful for trying out different approaches.

	from __future__ import division
	import cv2
	import numpy as np
	import os

	#Initialize directory of images
	path_to_folder = "E:\\New folder"

	#Initialize different filesets
	path_to_file_approach1 = "E:\\Approach1.csv"
	path_to_file_approach2 = "E:\\Approach2.csv"
	path_to_file_approach3 = "E:\\Approach3.csv"
	path_to_file_approach4 = "E:\\Approach4.csv"


	#Initialize output file for each approach
	def InitializeOutputfile(ApproachCode):
	    if(ApproachCode == '01'):
	        filehandle = open(path_to_file_approach1,'w')
	        return filehandle
	    if(ApproachCode == '02'):
	        filehandle = open(path_to_file_approach2,'w')
	        return filehandle
	    if(ApproachCode == '03'):
	        filehandle = open(path_to_file_approach3,'w')
	        return filehandle
	    if(ApproachCode == '04'):
	        filehandle = open(path_to_file_approach4,'w')
	        return filehandle

	#Compute EPI DPI
	def ComputeDPI(ApproachCode, filehandle):
	    files = []
	    for i in os.listdir(path_to_folder):
	        files.append(i)
	        print(i)
	        if(ApproachCode == '01'):
	            EP_DPI = Approach1(i)
	            print(EP_DPI)
	            filehandle.write(i + "," + str(EP_DPI) + "\n")
	        if(ApproachCode == '02'):
	            EP_DPI = Approach2(i)
	            print(EP_DPI)
	            filehandle.write(i + "," + str(EP_DPI) + "\n")
	        if(ApproachCode == '03'):
	            EP_DPI = Approach3(i)
	            print(EP_DPI)
	            filehandle.write(i + "," + str(EP_DPI) + "\n")
	        if(ApproachCode == '04'):
	            EP_DPI = Approach4(i)
	            print(EP_DPI)
	            filehandle.write(i + "," + str(EP_DPI) + "\n")
	    #Close the output file once every image has been processed
	    filehandle.close()

	def Approach1(img):
	    c = 0.00
	    img = cv2.imread(path_to_folder + "\\" + img)
	    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
	    img_gaussian1 = cv2.GaussianBlur(gray,(3,3),0)
	    img_gaussian = img_gaussian1[100:400, 100:400]
	    img_sobelx = cv2.Sobel(img_gaussian,cv2.CV_8U,1,0,ksize=5)
	    img_sobely = cv2.Sobel(img_gaussian,cv2.CV_8U,0,1,ksize=5)
	    #Compute lines using Hough Transformation
	    xlines = cv2.HoughLines(img_sobelx,1,np.pi/180,200)
	    ylines = cv2.HoughLines(img_sobely,1,np.pi/180,200)
	    if(ylines is not None):
	        if(xlines is not None):
	            print('sobel - HoughLines')
	            c = ylines.size/xlines.size
	    return c


	def Approach2(img):
	    c = 0.00
	    img = cv2.imread(path_to_folder + "\\" + img)
	    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
	    img_gaussian1 = cv2.GaussianBlur(gray,(3,3),0)
	    img_gaussian = img_gaussian1[100:400, 100:400]
	    img_sobelx = cv2.Sobel(img_gaussian,cv2.CV_8U,1,0,ksize=5)
	    img_sobely = cv2.Sobel(img_gaussian,cv2.CV_8U,0,1,ksize=5)

	    #Erosion followed by dilation (morphological opening)
	    kernel = np.ones((5,5),np.uint8)
	    sobelErosionDialotionx = cv2.morphologyEx(img_sobelx, cv2.MORPH_OPEN, kernel)
	    sobelErosionDialotiony = cv2.morphologyEx(img_sobely, cv2.MORPH_OPEN, kernel)

	    #Compute lines using Hough Transformation
	    xlines = cv2.HoughLines(sobelErosionDialotionx,1,np.pi/180,100)
	    ylines = cv2.HoughLines(sobelErosionDialotiony,1,np.pi/180,100)
	    if(ylines is not None):
	        if(xlines is not None):
	            print('sobel - Erosion followed by Dilation')
	            c = ylines.size/xlines.size
	    return c

	def Approach3(img):
	    c = 0.00
	    img = cv2.imread(path_to_folder + "\\" + img)
	    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
	    img_gaussian1 = cv2.GaussianBlur(gray,(3,3),0)
	    img_gaussian = img_gaussian1[100:400, 100:400]
	    img_sobelx = cv2.Sobel(img_gaussian,cv2.CV_8U,1,0,ksize=5)
	    img_sobely = cv2.Sobel(img_gaussian,cv2.CV_8U,0,1,ksize=5)

	    #Dilation followed by Erosion (morphological closing)
	    kernel = np.ones((5,5),np.uint8)
	    sobelDialotionErosionx = cv2.morphologyEx(img_sobelx, cv2.MORPH_CLOSE, kernel)
	    sobelDialotionErosiony = cv2.morphologyEx(img_sobely, cv2.MORPH_CLOSE, kernel)

	    xlines = cv2.HoughLines(sobelDialotionErosionx,1,np.pi/180,100)
	    ylines = cv2.HoughLines(sobelDialotionErosiony,1,np.pi/180,100)

	    if(ylines is not None):
	        if(xlines is not None):
	            print('sobel - Dilation followed by Erosion')
	            c = ylines.size/xlines.size
	    return c

	def Approach4(img):
	    c = 0.00
	    minLineLength = 200
	    maxLineGap = 10
	    img = cv2.imread(path_to_folder + "\\" + img)
	    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
	    img_gaussian1 = cv2.GaussianBlur(gray,(3,3),0)
	    img_gaussian = img_gaussian1[100:400, 100:400]
	    img_sobelx = cv2.Sobel(img_gaussian,cv2.CV_8U,1,0,ksize=5)
	    img_sobely = cv2.Sobel(img_gaussian,cv2.CV_8U,0,1,ksize=5)
	    #Compute lines using Probabilistic Hough Transformation
	    #Pass minLineLength and maxLineGap by keyword; passed positionally they land in the 'lines' and 'minLineLength' slots
	    xlines = cv2.HoughLinesP(img_sobelx,1,np.pi/180,15,minLineLength=minLineLength,maxLineGap=maxLineGap)
	    ylines = cv2.HoughLinesP(img_sobely,1,np.pi/180,15,minLineLength=minLineLength,maxLineGap=maxLineGap)
	    if(ylines is not None):
	        if(xlines is not None):
	            print('sobel - HoughLinesP')
	            c = ylines.size/xlines.size
	    return c

	filehandle = InitializeOutputfile("01")
	ComputeDPI("01", filehandle)

	filehandle = InitializeOutputfile("02")
	ComputeDPI("02", filehandle)

	filehandle = InitializeOutputfile("03")
	ComputeDPI("03", filehandle)

	filehandle = InitializeOutputfile("04")
	ComputeDPI("04", filehandle)
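
The approaches above compare the number of lines found in the x and y Sobel images. Another way to separate vertical from horizontal lines is to look at the theta value that `cv2.HoughLines` returns for each line. The following is a rough sketch under stated assumptions: `count_by_orientation` and the 10 degree tolerance are hypothetical, and the sample array only mimics the `(N, 1, 2)` shape of `(rho, theta)` pairs that `HoughLines` produces.

```python
import numpy as np

def count_by_orientation(lines, tol_deg=10.0):
    """Count near-vertical and near-horizontal lines in HoughLines-style output."""
    if lines is None:
        return 0, 0
    # Each entry is (rho, theta); theta is the angle of the line's normal in radians.
    theta = np.asarray(lines, dtype=np.float64).reshape(-1, 2)[:, 1]
    deg = np.degrees(theta) % 180.0
    # A normal at about 0 (or 180) degrees means a vertical line,
    # a normal at about 90 degrees means a horizontal line.
    vertical = int(np.sum((deg <= tol_deg) | (deg >= 180.0 - tol_deg)))
    horizontal = int(np.sum(np.abs(deg - 90.0) <= tol_deg))
    return vertical, horizontal

# Made-up lines: one vertical, two roughly horizontal, one diagonal.
sample = np.array([[[10, 0.0]], [[20, np.pi / 2]], [[30, np.pi / 2 + 0.05]], [[40, np.pi / 4]]])
print(count_by_orientation(sample))  # (1, 2)
```

Diagonal lines fall outside both tolerance bands and are ignored, which may or may not be what a texture measurement needs.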


Happy Coding!!!

May 25, 2018

Perspective - Microservices

A picture is worth more than an explanation.

Independent Services
Deployed Independently
No bottleneck in DB layer
Can be Deployed in Different Servers
Each service can be upgraded without affecting the others

Reference - Link

All of these traits are what make them microservices!

May 23, 2018

Day #109 - PDF to JPG Conversion

ImageMagick-6.9.9-Q16 (https://legacy.imagemagick.org/script/binary-releases.php)
Python 3.5 environment on Anaconda, with OCR set up by following the steps listed in https://sqlandsiva.blogspot.in/2018/03/day-101-ocr-and-python.html

Steps: go to C:\Program Files\ImageMagick-6.9.9-Q16 in a command prompt running in Administrator mode.

Different command-line options for converting the PDF to JPEG, varying the sharpening and density values:

convert.exe -density 300 -trim D:\PetProject\OCR\pdfs\TestA.pdf -quality 100 D:\PetProject\OCR\pdfs\Pages\test.jpg
convert.exe -density 300 -trim D:\PetProject\OCR\pdfs\TestA.pdf -quality 100 -sharpen 0x1.0 D:\PetProject\OCR\pdfs\Pages\test.jpg
convert.exe -density 150 -trim D:\PetProject\OCR\pdfs\TestA.pdf -quality 100 -sharpen 0x1.0 D:\PetProject\OCR\pdfs\Pages\test.jpg
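
To try several density and sharpen combinations without retyping the command, the same calls can be built from Python. This is a minimal sketch, assuming `convert.exe` is reachable from the working directory or PATH; `build_convert_command` and `convert_pdf` are hypothetical helper names, and only the flags already shown above are used.

```python
import subprocess

def build_convert_command(pdf_path, jpg_path, density=300, quality=100,
                          sharpen=None, executable="convert.exe"):
    """Build the ImageMagick argument list used in the commands above."""
    cmd = [executable, "-density", str(density), "-trim", pdf_path,
           "-quality", str(quality)]
    if sharpen is not None:
        # Sharpen value is passed through as-is, e.g. "0x1.0".
        cmd += ["-sharpen", sharpen]
    cmd.append(jpg_path)
    return cmd

def convert_pdf(pdf_path, jpg_path, **options):
    """Run the conversion; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(build_convert_command(pdf_path, jpg_path, **options), check=True)

# Print the command instead of running it, so the sketch works without ImageMagick.
cmd = build_convert_command(r"D:\PetProject\OCR\pdfs\TestA.pdf",
                            r"D:\PetProject\OCR\pdfs\Pages\test.jpg",
                            density=150, sharpen="0x1.0")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.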

Happy Learning!!!

May 21, 2018

Day #108 - OCR for Hindi


1. Download data from https://github.com/tesseract-ocr/tessdata/blob/3.04.00/hin.traineddata

2. Copy it to C:\opencv\Tesseract-OCR\tessdata/hin.traineddata

3. Test Data

4. Output

Code

	#https://theailearner.com/2019/05/29/creating-a-crnn-model-to-recognize-text-in-an-image-part-1/
	import sys
	import cv2
	import numpy as np
	import pytesseract
	from PIL import Image

	pytesseract.pytesseract.tesseract_cmd = "C:\\opencv\\Tesseract-OCR\\tesseract"
	#Update the path if it doesn't work
	#pytesseract.pytesseract.tesseract_cmd = r"D:\Tesseract\Tesseract-OCR\tesseract.exe"

	img = Image.open("E:\\Hindi2.jpg")
	print(pytesseract.image_to_string(img,lang='hin'))
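
If the call fails because the language data was not copied (step 2), a quick check of the tessdata folder can help. The following is a small sketch using only the standard library; `missing_languages` is a hypothetical helper, and the demo uses a temporary folder in place of the real tessdata directory.

```python
import os
import tempfile

def missing_languages(tessdata_dir, langs):
    """Return the language codes whose <lang>.traineddata file is absent."""
    return [lang for lang in langs
            if not os.path.isfile(os.path.join(tessdata_dir, lang + ".traineddata"))]

# Demo with a temporary folder that contains only hin.traineddata.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "hin.traineddata"), "w").close()
    demo_missing = missing_languages(demo_dir, ["hin", "eng"])
print(demo_missing)  # ['eng']
```

For the setup above, the folder to check would be C:\opencv\Tesseract-OCR\tessdata.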


May 15, 2018

Dynamic SSIS Connection Strings

Two important learnings:

Map the parameters as shown in the screenshots below
Enable Delay Validation at both the connection and package level

Connection Level Setting

Package Level Parameters and Dynamic Strings Defined

Package Level Setting

Happy Learning!!!
