"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

June 30, 2018

Day #117 - Yolo Object Detection

This post uses YOLO for object detection.

Step 1 - Download Repo - https://github.com/thtrieu/darkflow
Step 2 - Install Commands - https://github.com/markjay4k/YOLO-series/blob/master/part1%20-%20setup%20YOLO.ipynb

Step 3 - Install Cython
Step 4 - Build Downloaded Code

Step 5 - Made Changes to following lines in code

Sample Example Code
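A minimal sketch of post-processing the detections, assuming the prediction format shown in the darkflow README (a list of dicts with label, confidence, topleft and bottomright keys, as returned by TFNet.return_predict). The helper name and the sample values are hypothetical and only illustrate the shape of the data.

```python
def filter_detections(predictions, min_confidence=0.5):
    """Keep detections at or above min_confidence as (label, confidence, box)."""
    kept = []
    for pred in predictions:
        if pred["confidence"] < min_confidence:
            continue
        # Box as (x1, y1, x2, y2) in pixel coordinates
        box = (
            pred["topleft"]["x"],
            pred["topleft"]["y"],
            pred["bottomright"]["x"],
            pred["bottomright"]["y"],
        )
        kept.append((pred["label"], pred["confidence"], box))
    # Highest-confidence detections first
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


# Made-up sample in the assumed darkflow output format
sample = [
    {"label": "person", "confidence": 0.91,
     "topleft": {"x": 10, "y": 20}, "bottomright": {"x": 110, "y": 220}},
    {"label": "dog", "confidence": 0.32,
     "topleft": {"x": 5, "y": 5}, "bottomright": {"x": 50, "y": 60}},
    {"label": "car", "confidence": 0.97,
     "topleft": {"x": 200, "y": 50}, "bottomright": {"x": 400, "y": 180}},
]

detections = filter_detections(sample, min_confidence=0.5)
```

Each returned box can be passed on to a drawing routine; the threshold of 0.5 is an arbitrary choice for illustration.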
Happy Learning!!!

June 29, 2018

Day #116 - Fake news detector

There are different kinds of Fake news

  • Manipulating the reported date, place and parties involved in an event, or fully fabricated news where everything is fake
  • Spreading Hatred against Specific groups
  • Hate Messages

Solution Approach

  • Step #1 - Use Naive Bayes to classify hate messages
  • Step #2 - Use multiple sources to validate news details (location, date, news type verification)
  • Step #3 - Validate the parties involved by comparing articles on the same news
  • Step #4 - Raise an alert on major variations from the reported news
  • Step #5 - Facebook or Google have large corpora to train on and can find more specific details about fake news: origin, place, time, ethnicity, age group and target audience. Depending on these, news from such an audience can be pre-screened
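Step #1 can be sketched with a small multinomial Naive Bayes classifier. This is a minimal illustration in plain Python with Laplace smoothing, not a production model; the class name and the four training sentences are made up for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # documents per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for word in self.tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only
texts = [
    "we hate that group they should leave",
    "those people are vermin and we hate them",
    "the council opened a new library today",
    "local team wins the football match",
]
labels = ["hate", "hate", "normal", "normal"]

classifier = NaiveBayesTextClassifier().fit(texts, labels)
```

A real classifier would need a much larger labelled corpus and better tokenization; the sketch only shows the mechanics of the approach.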
Happy AI Learning!!!

June 25, 2018

Day #115 - Image Template Comparison

Template Matching, OpenCV3, Python 3 Environment
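The idea behind template matching can be sketched in plain Python: slide the template over the image and keep the position with the smallest sum of squared differences. This is only an illustration on a tiny made-up grayscale grid; OpenCV's cv2.matchTemplate with cv2.TM_SQDIFF applies the same idea far more efficiently.

```python
def match_template(image, template):
    """Return ((row, col), score) of the best match by sum of squared differences."""
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    # Try every top-left position where the template fits inside the image
    for row in range(img_h - tpl_h + 1):
        for col in range(img_w - tpl_w + 1):
            score = sum(
                (image[row + r][col + c] - template[r][c]) ** 2
                for r in range(tpl_h)
                for c in range(tpl_w)
            )
            if score < best_score:
                best_pos, best_score = (row, col), score
    return best_pos, best_score


# Made-up 4x5 grayscale image and a 2x2 template cut from it
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 6],
]

position, score = match_template(image, template)
```

A score of zero means an exact match; on real images a threshold on the score decides whether the template is present at all.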

Happy Learning!!!

June 09, 2018

Day #114 - OCR with Tesseract

Architecture for Tesseract link

Step 1 - Download Latest version for windows link

Step 2 - PDF to Image Conversion using ImageMagick (Refer Previous Posts)

convert.exe -density 300 -trim D:\OCR\TestData\DEC17.pdf -quality 100 D:\OCR\TestData\DEC17.jpg

convert.exe -density 300 -trim D:\OCR\TestData\DEC17.pdf -quality 60 D:\OCR\TestData\DEC17.jpg

Step 3 - Tesseract Options using LSTM Networks

# --oem 2 combines the legacy and LSTM engines; use --oem 1 for LSTM only
tesseract.exe --oem 2 D:\OCR\TestData\DEC17-0.jpg D:\OCR\TestData\DEC171.csv

tesseract.exe --oem 2 D:\OCR\TestData\DEC17-0.jpg D:\OCR\TestData\ tsv
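Once Tesseract has written TSV output, the words can be filtered by confidence. The sketch below assumes the standard TSV column layout (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text); the function name, the threshold and the sample rows are made up for illustration.

```python
import csv
import io


def confident_words(tsv_text, min_conf=60.0):
    """Return (text, conf) for recognised words with confidence >= min_conf."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        conf = float(row["conf"])
        # Page, block and line rows carry conf -1 and no text
        if text and conf >= min_conf:
            words.append((text, conf))
    return words


# Made-up sample in the assumed TSV layout
header = "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\tleft\ttop\twidth\theight\tconf\ttext"
rows = [
    "1\t1\t0\t0\t0\t0\t0\t0\t800\t600\t-1\t",
    "5\t1\t1\t1\t1\t1\t10\t10\t80\t20\t96\tInvoice",
    "5\t1\t1\t1\t1\t2\t100\t10\t60\t20\t41\tD3C17",
    "5\t1\t1\t1\t1\t3\t170\t10\t50\t20\t88.5\tTotal",
]
sample_tsv = "\n".join([header] + rows)

words = confident_words(sample_tsv)
```

Low-confidence words such as the garbled third row are dropped, which helps when comparing the output of different -quality and -density settings.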

Happy Learning!!!

June 08, 2018

Reading Research papers

Very insightful, practical and detailed. It's all about focus, repetitive effort and passion to learn. Amazing lecture.

From Siraj Session - Link 

Goal Oriented Reading Strategy

Phase I

  • Read title, Abstract 
  • Use as Overview
  • Skim through sections / sub sections
  • No math in Pass I
  • Correlate to known learnings
  • Related Papers

Phase II

  • Understand Mathematics
  • Get Concept of Maths Formula
  • Evaluate reports, repeatable results
  • Download code repository
  • Replicate results
  • Additional resources on web to summarize texts
  • Output - Notes, Helper Images

Phase III

  • Maths
  • Every detail of Math
  • Break down equations
  • Wikipedia references
  • Replicate paper programmatically using equations / settings

Key is 'Never Give Up', 'Turn your frustrations into Fuel', 'Ask for Help'

How to Write a Research Paper

From Siraj Session - Link

  • Remember to Stay Positive and Believe
  • Start with Questions to arrive at Topic
  • Broad / Specific
  • Find answers for those questions
  • Well articulated and laser focussed on solution for problem
  • Collect data on topic
  • Remember to search through a variety of sources and be critical in assessment
  • Start Learning from Sources
  • Make Generalizations
  • Find common ideas across projects
  • Synthesis of Data
  • Use it as Thesis
  • Defend belief based on series of compelling experiments
  • Ask for validation
  • Get Super basic functional baseline
  • Write outline from common subsections between different papers
  • Generalized form of different papers
  • Sections for your projects
  • Document research process and results
  • 5 to 10 Pages ideal length
  • Never Plagiarize
  • Omnigraph, inscape

Happy Learning, Reading and Writing!!!

Day #113 - Text Summarization Notes

This post is a summary, for my own reference, of Siraj Raval's session on text summarization.

Happy Learning!!!

June 01, 2018

Day #112 - Web Scraping

For topic modelling, I had to scrape a few websites to obtain the data.
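A minimal sketch of the text-extraction step, using only the standard library's html.parser. The class name and the sample HTML are hypothetical; a real scraper would first fetch each page (for example with urllib.request.urlopen) and respect the site's terms of use.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up page standing in for a downloaded document
sample_html = (
    "<html><head><style>p {color: red}</style></head><body>"
    "<h1>Topic Models</h1><p>LDA finds <b>latent</b> topics.</p>"
    "<script>var x = 1;</script></body></html>"
)

page_text = extract_text(sample_html)
```

The resulting plain text can then be tokenized and fed to the topic model.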

Happy Learning!!!

May 31, 2018

Day #111 - OpenCV3 Feature Matching

On Windows, I performed the following installations:

pip install opencv-python
pip install opencv-contrib-python
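The core of feature matching can be sketched without OpenCV: compare binary descriptors by Hamming distance and keep a match only if it is clearly better than the runner-up (a ratio test). This is a plain-Python illustration with made-up 8-bit descriptors; in OpenCV the equivalent work is typically done by a detector such as cv2.ORB_create() together with cv2.BFMatcher.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(query, train, ratio=0.75):
    """Return (query_idx, train_idx, distance) for matches passing the ratio test."""
    matches = []
    for qi, q in enumerate(query):
        # Rank all candidate descriptors by distance to this query descriptor
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        best_dist, best_idx = ranked[0]
        # Accept only if the best match is clearly better than the second best
        if len(ranked) == 1 or best_dist < ratio * ranked[1][0]:
            matches.append((qi, best_idx, best_dist))
    return matches


# Made-up 8-bit descriptors for two images
query_descriptors = [0b10110010, 0b01001101]
train_descriptors = [0b01001100, 0b10110011, 0b11111111]

matches = match_descriptors(query_descriptors, train_descriptors)
```

Real descriptors are much longer (ORB uses 256 bits), but the distance and ratio logic stay the same.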

Happy Learning!!!

May 30, 2018

Day #110 - Image Processing - Line Counting from Images

Learnings from recent work on images and texture: identifying the line count along the vertical and horizontal axes. OpenCV was useful for trying different approaches.
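One possible approach, sketched here in plain Python, is a projection profile: after thresholding the image to 0/1 pixels, a row or column that is mostly filled counts as part of a line, and consecutive filled rows or columns count as one thick line. The function names, the fill threshold and the sample grid are assumptions for illustration; this is not necessarily the approach used with OpenCV.

```python
def count_runs(flags):
    """Count maximal runs of consecutive True values."""
    runs, inside = 0, False
    for flag in flags:
        if flag and not inside:
            runs += 1
        inside = flag
    return runs


def count_lines(binary, min_fill=0.8):
    """Return (horizontal, vertical) line counts for a 0/1 image given as rows."""
    height, width = len(binary), len(binary[0])
    # A row is part of a horizontal line if most of its pixels are set
    row_is_line = [sum(row) >= min_fill * width for row in binary]
    # A column is part of a vertical line if most of its pixels are set
    col_is_line = [
        sum(binary[r][c] for r in range(height)) >= min_fill * height
        for c in range(width)
    ]
    return count_runs(row_is_line), count_runs(col_is_line)


# Made-up 7x7 binary image: one thin and one thick horizontal line, one vertical line
grid = [
    [0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
]

horizontal, vertical = count_lines(grid)
```

The approach assumes roughly axis-aligned lines; skewed scans would need deskewing or a Hough-transform based method instead.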

Happy Coding!!!

May 25, 2018

Perspective - Microservices

Picture is worth more than explanations.

  • Independent Services
  • Deployed Independently
  • No bottleneck in DB layer
  • Can be Deployed in Different Servers
  • Each Service can be upgraded without affecting others

Reference - Link

All these traits indicate Microservices!!!

May 23, 2018

Day #109 - PDF to JPG Conversion

  • ImageMagick-6.9.9-Q16 (https://legacy.imagemagick.org/script/binary-releases.php)
  • Python 3.5 Environment on Anaconda and OCR followed as steps listed in https://sqlandsiva.blogspot.in/2018/03/day-101-ocr-and-python.html
Steps: go to C:\Program Files\ImageMagick-6.9.9-Q16 in a command prompt opened in Administrator mode

Different command line options to convert to JPEG with sharpening and density values
  • convert.exe -density 300 -trim D:\PetProject\OCR\pdfs\TestA.pdf -quality 100 D:\PetProject\OCR\pdfs\Pages\test.jpg
  • convert.exe -density 300 -trim D:\PetProject\OCR\pdfs\TestA.pdf -quality 100 -sharpen 0x1.0 D:\PetProject\OCR\pdfs\Pages\test.jpg
  • convert.exe -density 150 -trim D:\PetProject\OCR\pdfs\TestA.pdf -quality 100 -sharpen 0x1.0 D:\PetProject\OCR\pdfs\Pages\test.jpg
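When comparing several density and sharpening settings, the commands above can be generated from Python. This is a minimal sketch; the function name and defaults are hypothetical, and the returned list is meant to be passed to subprocess.run(cmd, check=True) on a machine where ImageMagick is installed.

```python
def convert_command(pdf_path, jpg_path, density=300, quality=100,
                    sharpen=None, exe="convert.exe"):
    """Build the ImageMagick argument list for a PDF to JPG conversion."""
    cmd = [exe, "-density", str(density), "-trim", pdf_path,
           "-quality", str(quality)]
    if sharpen:
        # Sharpen value in ImageMagick's radius x sigma form, e.g. "0x1.0"
        cmd += ["-sharpen", sharpen]
    cmd.append(jpg_path)
    return cmd


# Same options as the third command in the list above
command = convert_command(
    r"D:\PetProject\OCR\pdfs\TestA.pdf",
    r"D:\PetProject\OCR\pdfs\Pages\test.jpg",
    density=150,
    sharpen="0x1.0",
)
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces.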
Happy Learning!!!

May 21, 2018

Day #108 - OCR for Hindi

OCR for Hindi

1. Download data from https://github.com/tesseract-ocr/tessdata/blob/3.04.00/hin.traineddata

2. Copy it to C:\opencv\Tesseract-OCR\tessdata\hin.traineddata

3. Test Data

4. Output


More Reads - Link, Link1

Happy Learning!!!

May 15, 2018

Dynamic SSIS Connection Strings

Two Important Learnings

  • Configure and map the parameters as shown in the screenshots below
  • Set DelayValidation to True at both the connection and package level
Connection Level Setting

Package Level Parameters and Dynamic Strings Defined

Package Level Setting

Happy Learning!!!

April 29, 2018

Sunday Perspective - Data Scientist

Happy Learning!!!

April 28, 2018

Day #107 - Data Analysis and Exploration

Happy Learning!!!

April 26, 2018

Day #106 - OpenCV for Python3

Finally installed OpenCV for python3 following steps in link 

The Anaconda 3 distribution works perfectly!
pip install opencv-python

Happy Learning!!!

April 16, 2018

Day #105 - Ensemble Tips and Tricks

Diversity based on Algorithms
  • 2~3 gradient boosted trees (LightGBM, XGBoost, CatBoost)
  • Neural networks (Keras, Pytorch)
  • 1~2 Extra trees (Random Forest)
  • 1~2 KNN models
Diversity based on input data
  • Categorical features (one hot, label encoding, target encoding)
  • Numerical features (Outliers, binning, derivatives, percentiles)
  • Interactions (col1*/+-col2),groupby,unsupervised
Subsequent level tips
  • GBM with depth 3
  • Linear models with high regularization
  • Extra trees
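Why diversity helps can be shown with the simplest possible combiner, a weighted average of model predictions. The sketch below is not stacking with a trained second-level model; it uses made-up predictions from three hypothetical models whose errors partly cancel out.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length lists."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def blend(predictions, weights=None):
    """Weighted average of per-model prediction lists (equal weights by default)."""
    n_models = len(predictions)
    weights = weights or [1.0 / n_models] * n_models
    total = sum(weights)
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total
        for i in range(len(predictions[0]))
    ]


# Invented targets and predictions, for illustration only
y_true = [3.0, 5.0, 2.0, 8.0]
model_preds = {
    "gbm": [3.5, 4.5, 2.5, 7.5],
    "nn": [2.5, 5.5, 1.5, 8.5],
    "knn": [3.0, 6.0, 2.0, 7.0],
}

blended = blend(list(model_preds.values()))
individual_errors = {name: rmse(y_true, p) for name, p in model_preds.items()}
blended_error = rmse(y_true, blended)
```

In this toy case the blended error is lower than that of every individual model because the errors point in different directions. A second-level model, as in the tips above, learns the weights from out-of-fold predictions instead of fixing them by hand.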
Happy Learning!!!

April 12, 2018

Smart City Analytics

A picture is worth a thousand words. This picture, tweeted by the World Economic Forum, is worth a million words. It is a great example of a smart city with a list of use cases. Every use case has implied analytics on top of it. A summary is listed below.

Area | Concept | Explanation / Learning / Analytics Role
Buildings | Green Building | Rooftop garden, absorbs CO2 and produces oxygen; architecture to reduce CO2, better lighting to reduce energy consumption
Buildings | Building Management | Automation and optimization of heating, lighting, ventilation; analytics to predict energy consumption, cooler maintenance, predict failures, anomaly detection
Buildings | Fire Safety | Intelligent extinguishing customized based on design / plan / materials / goods in it; decide the type and quantity of extinguishers based on the items kept
Environment | Perimeter Access Control | Access, monitoring; CCTV, intruder detection; video analytics, face recognition, eye detection
Environment | Rooftop Wind Turbine | For high-rise buildings; renewable energy generation; monitor the turbine performance, identify any anomaly in device operations to predict failures, forecast performance or energy generation based on trends and seasons
Environment | Air Pollution Control | Control CO2 emissions of factories; predict growing CO2 emissions based on the increasing number of factories and vehicles
Environment | Building Integrated Photovoltaics | Solar panels integrated into the building fabric to replace conventional materials; renewable energy generation; device monitoring, predictive maintenance
Environment / Infrastructure | Smart Grid | Energy consumption and monitoring; predict and forecast energy needs
Environment / Buildings | Chemical Leakage Detection | Detecting leakages / wastes of factories in rivers; identify / penalize the faulty / corrupt ones; water quality analysis to detect and identify the waste induction points / points of failure
Environment / Buildings | Real-time Traffic Updates | Instant traffic updates sent to smartphones to help route planning; Google traffic already crowdsources data and predicts traffic updates / delays
Infrastructure | Vertical Axis Wind Turbines | Vertical twisted wind turbines; similar to solar-powered lights across the city with renewable energy sources
Infrastructure | Structural Health | Monitor building infrastructure and condition; analytics to classify, eliminate old / low-quality buildings
Life | Waste Management | Monitoring waste levels to optimize refuse collection routes; efficient garbage management; video analytics for garbage classification and segmentation
Life | Smart Parking | Parking monitoring and management; planning, video analytics to identify empty parking slots
Life | Earthquake Detection | Analytics to find anomalies in readings to predict earthquakes
Utilities | Potable Water Monitoring | Monitor groundwater levels, contamination, forecast the availability
Utilities | WiFi in Metro / Public Places | Classify, cluster usage based on gender / age group / economic levels
Utilities | Water Leakage Detection | Sensors to identify / monitor leaks in water supply
Utilities | Landslide Prevention | Soil monitoring for early detection and prevention
Utilities | Fast Lane / Smart Signals | Optimize / divert based on real-time traffic situations
Transport | Electrical Transport | Renewable energy based transport systems

Happy Learning!!!

April 11, 2018

Day #104 - Working with DropBox

A Python script to upload to and download from Dropbox. Install the dropbox package, sign up for Dropbox, create an app in the Dropbox App Console (https://www.dropbox.com/developers/apps) and obtain an access token.
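A minimal sketch of the upload and download helpers, assuming the official dropbox Python SDK, whose client exposes files_upload and files_download_to_file. The helper names, file names and the token placeholder are hypothetical. The client is passed in as a parameter, so the helpers themselves do not import the SDK.

```python
def upload_file(dbx, local_path, dropbox_path):
    """Upload a local file; dropbox_path must start with '/', e.g. '/report.csv'."""
    with open(local_path, "rb") as handle:
        return dbx.files_upload(handle.read(), dropbox_path)


def download_file(dbx, dropbox_path, local_path):
    """Download a Dropbox file to a local path."""
    return dbx.files_download_to_file(local_path, dropbox_path)


# Typical wiring (requires `pip install dropbox` and a real access token):
#   import dropbox
#   dbx = dropbox.Dropbox("YOUR_ACCESS_TOKEN")
#   upload_file(dbx, "report.csv", "/report.csv")
#   download_file(dbx, "/report.csv", "report_copy.csv")
```

Keep the access token out of source control, for example by reading it from an environment variable.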

Happy Learning!!!

April 10, 2018

Day #103 - Hosting a flask Rest API

This post is on hosting a Flask REST API. It requires the flask package. In Python 3.4 the implementation was straightforward.

To execute, run the code from the console: python example_gist.py
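A minimal sketch of such an API, assuming a recent Flask release. The two routes and their payloads are hypothetical and are not the contents of example_gist.py.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/health", methods=["GET"])
def health():
    """Simple liveness check."""
    return jsonify(status="ok")


@app.route("/add", methods=["POST"])
def add():
    """Add two numbers sent as JSON, e.g. {"a": 2, "b": 3}."""
    payload = request.get_json(force=True)
    return jsonify(result=payload["a"] + payload["b"])


# To serve locally, add `app.run(port=5000)` at the bottom of the file
# and start it with: python example_gist.py
```

Flask's built-in server is meant for development; a production deployment would normally sit behind a WSGI server.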

The following links were useful
link1, link2, chromeextension

Happy Learning!!!