"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 31, 2019

Day #308 - Display on Raspberry Pi

Finally got a 2.8-inch LCD display for the Raspberry Pi. The next step is to experiment with the models in real-time situations. This LCD will help visualize the output. Package installation commands and reference below.

Happy Learning!!!

December 30, 2019

Social Media Responsibilities

How do we measure social media impact? What are the pros and cons of social media?
Pros
  • Information Sharing
  • Connect with a larger set of population
Cons
  • Sharing information without Authenticity
  • Motive / Authenticity of people sharing information
  • Smiles / Selfies vs Reality of Life
  • Manipulative / Biased / Personalized targeted ads
  • Exploit human tendencies of feedback / Sensitized to bias 
Social Media Accountability
  • Freedom of speech vs Hurting Sentiments
  • Consequences of blackmail/threat/bullying through social media contacts
  • Consequences of any form of violence instigated through messages
  • Validating the facts/claims shared?
  • Endorsing political advertisement claims without a moral stand?
  • Biased targeting of users?
  • Depression / Suicide due to excess usage of social media?
  • Freedom of speech vs Authenticity of speech vs Intentions of information shared?

When we can’t even agree on what is real

Keep Thinking!!!

December 26, 2019

Analyzing top 25 AI companies in 2019

Analyzing the top 25 AI companies listed in Link

Happy Learning AI Landscape!!!

December 24, 2019

Data Analysis vs Forensic Science

Before AI / BI, it's about #exploring the data to uncover #DataInsights. #DataAnalysis is similar to #ForensicScience. A side-by-side comparison of both perspectives.


#Data and #Insights sets the #direction for successful #AI / #BI usecases #datascience #bigdata #analytics

Happy Learning!!!

December 23, 2019

Difference between SQL and NOSQL Systems

Reposting from my two-year-old Quora answer

The key difference between them lies in understanding the CAP theorem:
  • Consistency
  • Availability
  • Partition Tolerance
In layman's terms: SQL systems (e.g., RDBMS) adhere to the ACID properties (Atomicity, Consistency, Isolation, Durability).
  • The datatypes and schema are predefined; you cannot store non-matching datatypes
  • To avoid dirty reads, systems enforce isolation levels that ensure only committed data is read (Consistency)
  • Only the latest state of a record is available; values as of an earlier point in time are not
  • Banking and ordering systems, where data needs to be consistent, are mostly SQL-based systems
No-SQL systems (Not Only SQL)
  • The schema is not tightly governed; it is flexible, and you can store different datatypes in the same column
  • These systems may be geographically distributed; data is synced and becomes eventually consistent (end of day, not real time)
  • They also support point-in-time data; values at a given point in time can be looked up
  • Where there is no strict consistency requirement, we can achieve the other two properties: Availability and Partition tolerance
  • Since some of the ACID properties are relaxed, these systems offer high availability
Ultimately, it is the business need that decides between SQL and NoSQL storage.
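The schema difference can be illustrated in plain Python; this is a sketch of my own, not tied to any particular database engine: a SQL-style store validates every row against predefined datatypes, while a NoSQL-style store accepts documents of any shape.

```python
# Illustrative sketch only: a rigid, SQL-style schema
# vs a flexible, NoSQL-style document store.

SCHEMA = {"id": int, "name": str, "balance": float}  # predefined datatypes

def sql_insert(table, row):
    """SQL-style insert: every column must match the declared datatype."""
    for column, expected_type in SCHEMA.items():
        if not isinstance(row.get(column), expected_type):
            raise TypeError(f"column '{column}' expects {expected_type.__name__}")
    table.append(row)

def nosql_insert(collection, document):
    """NoSQL-style insert: no schema check, any document shape is accepted."""
    collection.append(document)

table, collection = [], []
sql_insert(table, {"id": 1, "name": "alice", "balance": 100.0})      # accepted
try:
    sql_insert(table, {"id": "two", "name": "bob", "balance": 5.0})  # wrong datatype
except TypeError as err:
    print("rejected:", err)

nosql_insert(collection, {"id": 1, "name": "alice"})
nosql_insert(collection, {"id": "two", "extra_field": [1, 2, 3]})   # accepted
print(len(table), len(collection))  # 1 row vs 2 documents
```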

Happy Learning!!!

December 17, 2019

Improving Women Safety

To reduce crime against women, more than strengthening laws, we need to get to the root cause of the issues. We need to analyze the crime data and fix the source of the problem.

We need to analyze the crime patterns based on different aspects to find the underlying patterns.

Pattern vs Solutions
  • Correlation with alcohol - How to reduce/limit alcohol consumption 
  • Correlation with education - How to reduce dropouts and improve education
  • Correlation with income category - Sustainable jobs
  • Correlation with marital status - Family aspects
  • Correlation to caste - Driven by caste / Unemployment / Dropouts
  • Correlation to age group - Social media, porn impact
  • Correlation to social behavior - Drugs / partying / Addiction
  • Correlation to job type - Government vs Private jobs vs Daily vs Organized Crimes
Education is not limited to a few years, and it is not about degrees. The real purpose of education is to unlearn and relearn things from a morality/humanity perspective.

It needs a complete societal change, not just laws. Let's prepare a safer tomorrow by making the required changes.

Keep Questioning!!!

December 16, 2019

Day #307 - Porting Keras Models to TensorFlow Lite

The next task is to run all the developed models on the Pi using TensorFlow Lite. I am using Google Colab to convert the models into the Lite version.

As a next step, we will attempt to run the ported models on the Raspberry Pi.
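The conversion step can be sketched as below, assuming TensorFlow 2.x; the tiny model here is a hypothetical stand-in for the actual trained models.

```python
import tensorflow as tf

# Hypothetical stand-in model; in practice load your trained Keras model,
# e.g. tf.keras.models.load_model("model.h5")
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Convert the Keras model to the TensorFlow Lite flat-buffer format
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the .tflite file for deployment on the Raspberry Pi
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```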

Happy Learning!!!

Day #306 - Expressing SQL / T-SQL in pandas

I wanted to mimic joins, aggregations, sums, whatever we do in a database, with pandas. A simple storyline of data analysis across Employee, Department and Salary using pandas DataFrames.


Everything can be done in SQL. This is a different approach to it using pandas.
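A small sketch of the idea with made-up Employee/Department/Salary frames (the column names and values are my own illustration):

```python
import pandas as pd

# Hypothetical tables, mirroring Employee, Department and Salary
employee = pd.DataFrame({"emp_id": [1, 2, 3],
                         "name": ["Asha", "Ravi", "Meena"],
                         "dept_id": [10, 10, 20]})
department = pd.DataFrame({"dept_id": [10, 20],
                           "dept_name": ["Engineering", "Sales"]})
salary = pd.DataFrame({"emp_id": [1, 2, 3],
                       "salary": [50000, 60000, 55000]})

# SQL: SELECT ... FROM employee JOIN department USING (dept_id)
#                               JOIN salary USING (emp_id)
merged = employee.merge(department, on="dept_id").merge(salary, on="emp_id")

# SQL: SELECT dept_name, SUM(salary), AVG(salary) ... GROUP BY dept_name
summary = merged.groupby("dept_name")["salary"].agg(["sum", "mean"])
print(summary)
```

`merge` plays the role of `JOIN` and `groupby(...).agg(...)` the role of `GROUP BY` with aggregate functions.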

Happy Learning!!!

Day #305 - Loading from a Weights File (HDF5) and Saved Model (H5) Files

We will look at
  • Vanilla Model
  • Load preexisting weights HDF5 and Continue
  • Load preexisting model H5 and Continue
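The three options can be sketched as follows, assuming TensorFlow 2.x; the model, data and file names are placeholders of my own:

```python
import numpy as np
import tensorflow as tf

def build_model():
    # The same architecture must be rebuilt when loading weights only
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

x, y = np.random.rand(32, 4), np.random.rand(32, 1)

# Option 1: vanilla model trained from scratch
model = build_model()
model.fit(x, y, epochs=1, verbose=0)

# Option 2: save weights only, rebuild the architecture, then continue
model.save_weights("vanilla.weights.h5")
restored = build_model()
restored.load_weights("vanilla.weights.h5")
restored.fit(x, y, epochs=1, verbose=0)

# Option 3: save the full model (architecture + weights + optimizer state)
model.save("model.h5")
full = tf.keras.models.load_model("model.h5")
full.fit(x, y, epochs=1, verbose=0)
```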

Results

Option #1 - Vanilla Model
Option #2 - Continue from Saved Weights
Option #3 - Continue from Saved Model H5 File


Happy Learning!!!

Day #304 - Analysis of the DeepFashion Dataset - Landmark Detection

Three different poses

Of the 8 localization points, only the first 4 are non-null in all rows.
Visibility takes three values (0, 1, 2): v=2 visible; v=1 occlusion; v=0 not labeled.

We need one model for each category.
The top-level has 3 generic categories:
  • 1: “top” (upper-body clothes such as jackets, sweaters, tees, etc.)
  • 2: “bottom” (lower-body clothes such as jeans, shorts, skirts, etc.)
  • 3: “long” (full-body clothes such as dresses, coats, robes, etc.)
The implementation is defined in paper - Link
Data - Link

Architecture Implementation

Data Analysis of the Dataset - Null Counts per Column

  • image_name                   0
  • landmark_visibility_1        0
  • landmark_location_x_1        0
  • landmark_location_y_1        0
  • landmark_visibility_2        0
  • landmark_location_x_2        0
  • landmark_location_y_2        0
  • landmark_visibility_3        0
  • landmark_location_x_3        0
  • landmark_location_y_3        0
  • landmark_visibility_4        0
  • landmark_location_x_4        0
  • landmark_location_y_4        0
  • landmark_visibility_5    30972
  • landmark_location_x_5    30972
  • landmark_location_y_5    30972
  • landmark_visibility_6    30972
  • landmark_location_x_6    30972
  • landmark_location_y_6    30972
  • landmark_visibility_7    73003
  • landmark_location_x_7    73003
  • landmark_location_y_7    73003
  • landmark_visibility_8    73003
  • landmark_location_x_8    73003
  • landmark_location_y_8    73003

Columns with no null values (populated in every row)

  • landmark_visibility_1        0
  • landmark_location_x_1        0
  • landmark_location_y_1        0
  • landmark_visibility_2        0
  • landmark_location_x_2        0
  • landmark_location_y_2        0
  • landmark_visibility_3        0
  • landmark_location_x_3        0
  • landmark_location_y_3        0
  • landmark_visibility_4        0
  • landmark_location_x_4        0
  • landmark_location_y_4        0
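The counts above look like the output of a per-column null check; here is a sketch of how they can be computed with pandas. The tiny dummy frame stands in for the actual annotation file, which in practice would be read with `pd.read_csv`.

```python
import numpy as np
import pandas as pd

# Dummy frame mimicking a few of the landmark annotation columns;
# in practice this would be loaded from the DeepFashion annotation file.
df = pd.DataFrame({
    "landmark_location_x_1": [10, 20, 30],
    "landmark_location_x_5": [15, np.nan, np.nan],
    "landmark_location_x_7": [np.nan, np.nan, np.nan],
})

# Null count per column, as in the listing above
null_counts = df.isnull().sum()
print(null_counts)

# Columns with no missing values, i.e. populated in every row
fully_populated = null_counts[null_counts == 0].index.tolist()
print(fully_populated)  # ['landmark_location_x_1']
```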

Happy Learning!!!

Day #303 - Model Training Guidelines - Part II

Here we will look at two more additions on top of the previous post:
  • Save the model H5 file after every run/epoch
  • Add data batching to run in smaller iterations, leveraging a Sequence


This is template code that can be customized for larger datasets.


Happy Learning!!!

December 15, 2019

Project Learning Notes

Tracking and counting have been quite interesting topics for some time. Explored this codebase: link


I liked the directionality-based tracking approach. It is very much needed for directionality-based counting. Hoping to reuse/implement it in people-counting scenarios.


My perspective is

  • Tracking by sampling frames (reduce load)
  • Use Euclidean distance and other attributes to track/match
  • Evaluate the existing trackers built into OpenCV (again, these need frame-by-frame tracking)
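The Euclidean-matching idea can be sketched in a few lines; this is my own bare-bones simplification, not the linked codebase: each new detection is matched to the nearest previously tracked centroid, and unmatched detections get new ids.

```python
import math

def match_detections(tracked, detections, max_distance=50.0):
    """Greedily match new detections to tracked centroids by Euclidean distance.

    tracked: dict of object_id -> (x, y); detections: list of (x, y).
    Returns the updated dict; unmatched detections receive new ids.
    """
    next_id = max(tracked, default=-1) + 1
    updated = {}
    unclaimed = dict(tracked)
    for point in detections:
        # Find the nearest unclaimed tracked centroid within max_distance
        best_id, best_dist = None, max_distance
        for obj_id, centroid in unclaimed.items():
            dist = math.dist(point, centroid)
            if dist < best_dist:
                best_id, best_dist = obj_id, dist
        if best_id is not None:
            updated[best_id] = point      # same object, moved slightly
            del unclaimed[best_id]
        else:
            updated[next_id] = point      # new object entered the frame
            next_id += 1
    return updated

tracked = {0: (100, 100), 1: (300, 120)}
tracked = match_detections(tracked, [(105, 103), (500, 400)])
print(tracked)  # object 0 moved; (500, 400) becomes a new object
```

Comparing an object's position across sampled frames also gives its direction of motion, which is what directionality-based counting needs.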

Happy Learning!!!

December 13, 2019

Day #302 - Keras Best Practices during Training

In this post, we take the raw version of the code and add the features below:
  • Adding Checkpoint
  • Adding Logging
  • Plot Results
  • Restart Training from Checkpoint
  • Early Stopping
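These features map onto standard Keras callbacks; a sketch of wiring them together, assuming TensorFlow 2.x (model, data and file names are my own placeholders):

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x, y = np.random.rand(64, 4), np.random.rand(64, 1)

callbacks = [
    # Checkpoint: keep only the best weights seen so far
    tf.keras.callbacks.ModelCheckpoint("best.weights.h5",
                                       save_weights_only=True,
                                       save_best_only=True),
    # Logging: append one CSV row per epoch (loss, val_loss, ...)
    tf.keras.callbacks.CSVLogger("training_log.csv"),
    # Early stopping: halt when validation loss stops improving
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                     restore_best_weights=True),
]
history = model.fit(x, y, validation_split=0.2, epochs=10,
                    verbose=0, callbacks=callbacks)

# Restart training later by reloading the checkpointed weights
model.load_weights("best.weights.h5")
```

`history.history` holds the per-epoch metrics, which is what gets plotted.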

Run #1 Output (10 Epochs)


Run #2 Continue with Existing Weights (5 Epochs)


Happy Learning!!!

December 11, 2019

Day #301 - Data Batching in Keras

This post is about custom data batching using Keras. Here we override the methods of the built-in Sequence class. The example below covers dummy data generation, data splitting and fetching batches of records.



Other strategies
  • Database -> CSV chunks of 50K records -> training, then save a checkpoint
  • Save a checkpoint for each run and reuse it for the next 50K chunk of data
This is a classic data-fetching solution. A database can store millions of records. We can fetch each batch, export it to a CSV, train on each chunk, save a checkpoint and continue with the next run.
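A minimal Sequence sketch of the batching idea, with dummy in-memory data as in the post; the same pattern could instead read each 50K chunk from a CSV exported from the database.

```python
import numpy as np
import tensorflow as tf

class BatchSequence(tf.keras.utils.Sequence):
    """Yields (x, y) batches from in-memory arrays for model.fit()."""

    def __init__(self, x, y, batch_size):
        super().__init__()
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        # Number of batches per epoch (last partial batch included)
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, idx):
        start = idx * self.batch_size
        return (self.x[start:start + self.batch_size],
                self.y[start:start + self.batch_size])

x, y = np.random.rand(105, 4), np.random.rand(105, 1)
seq = BatchSequence(x, y, batch_size=32)
print(len(seq))          # 4 batches: 32 + 32 + 32 + 9
print(seq[3][0].shape)   # (9, 4) - the final partial batch
```

The sequence can be passed directly to `model.fit(seq, ...)` in place of full arrays.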

Happy Learning!!!

December 01, 2019

Day #300 - Lessons Learnt from Multi-Label Classification

Today marks the 300th post on data science. It has been a long journey, and I still feel there is a lot more to catch up on. Keep learning, keep going.

There are different tasks involved

1. Data collection - the Fatkun Batch Download Image Chrome extension to download images
2. A script to reshape images and store them in a standard format
3. A simple DB script to update and prepare data
4. This base implementation was useful for the model implementation: link
5. Data test results
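The reshape step can be sketched with Pillow; the folder names and the 224x224 target size are assumptions of my own, not the original script.

```python
from pathlib import Path
from PIL import Image

def reshape_images(src_dir, dst_dir, size=(224, 224)):
    """Resize every image in src_dir to a standard size and save as JPEG."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).glob("*"):
        try:
            with Image.open(path) as img:
                img = img.convert("RGB").resize(size)
                img.save(dst / f"{path.stem}.jpg", "JPEG")
        except OSError:
            # Skip files that are not readable images (batch downloads can be mixed)
            continue

# Example: create one dummy image, then reshape the whole folder
Path("raw").mkdir(exist_ok=True)
Image.new("RGB", (640, 480), "white").save("raw/sample.png")
reshape_images("raw", "standard", size=(224, 224))
print(Image.open("standard/sample.jpg").size)  # (224, 224)
```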

Happy Learning!!!