"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

September 26, 2022

Vision Solutions - Interesting Ideas / Products / Observations / Startups

Recycle Segregation



Repairs Monitoring with Vision

 

  • Number of Bolts repaired
  • Duration of Repair

Monitoring and Surveillance

 

Arms and Guns Detection


 

Animals counting


 

I love the region of interest

  • Segmentation
  • Contours
  • Counting
  • Line of Separation
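
As a rough illustration of how counting against a line of separation can work, here is a minimal sketch. It assumes some upstream step (segmentation, contours, a tracker) already produces one list of centroid positions per tracked object; `count_line_crossings`, `tracks`, and `line_y` are hypothetical names, not part of any product mentioned here.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) centroid of a tracked object in one frame


def count_line_crossings(tracks: List[List[Point]], line_y: float) -> int:
    """Count tracks whose centroid moves from above line_y to on or below it."""
    crossings = 0
    for track in tracks:
        # Compare each centroid with the next one in the same track.
        for (_, y_prev), (_, y_curr) in zip(track, track[1:]):
            if y_prev < line_y <= y_curr:
                crossings += 1
                break  # count each tracked object at most once
    return crossings


# Made-up centroid tracks, one list per tracked object.
tracks = [
    [(10, 5), (12, 20), (13, 40)],   # crosses the line at y = 30
    [(50, 10), (52, 15), (51, 25)],  # never reaches the line
    [(80, 28), (82, 31)],            # crosses the line at y = 30
]
crossed = count_line_crossings(tracks, line_y=30)
print(f"Objects that crossed the line: {crossed}")
```

A horizontal line keeps the check to a single comparison; a slanted line of separation would need a signed-distance test instead.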

Market Cap and Growth

 

Vision Startups


 


Keep Exploring!!!! 

September 25, 2022

Document Q&A

From OCR to document extraction and understanding, Hugging Face has come a long way :)

The DocQnA pipeline is very impressive.




Results


Keep Exploring!!!

  • TesserOCR
  • MMOCR
  • OCRmypdf
  • EasyOCR
  • PaddleOCR
  • Kraken
  • OCRopus
  • PyOCR
  • Tesseract


Keep Learning!!!

September 18, 2022

Sprint Retrospectives

Every 4 weeks, and personally at least every week, we need to observe, learn, and collect insights on the domain, data, AI / ML, and competitive products and offerings, so that we keep the big picture, stay proactive, and stay ahead of the learning curve.

  • Did I complete my technical debts?
  • Did we brainstorm/experiment with new ideas?
  • Did I learn something new?
  • Did I contribute to any POV?
  • What was my new technical learning?
  • Did I improve my deliverables by at least 5%?
  • What areas can I improve personally?
  • What bugs/issues could we avoid?
  • Did I get the required help for code review/discussion?
  • What expectations do I need to reset to align/be better in this sprint?
  • Beyond daily updates: are we slow or fast? Did we improve execution time across iterations?
  • Did I reuse/get new perspectives on old backlogs?
Some Red Flags
  • Lots of planning vs Very little experimentation
  • Experimentation vs Accuracy vs Scale
  • Everyday status vs Slow progress on Minor issues
  • Repeating same mistakes
  • Reactive and not Proactive
  • Plan ahead vs Last minute corrections
  • Missing self-initiative to pick up work beyond assigned tasks
It is not about competing with someone else. It is more about whether I am better than my previous version.



Ref - Link



Ref - Link

Keep Thinking!!!


September 15, 2022

Interesting Read - Startup Ideas Evaluation


These observations reflect products built without understanding the market, tech, users, and future aspects. The 360-degree analysis seems to be missing. I do it for the tech; I do it because I like to do it. I have also gone through this bias.

Ref - Link

  • What you do, Why you do
  • Markets / Current Landscape
  • What do you do differently
  • How much are you challenging current players

Being a consultant who can see both sides is important

  • Ability to influence determines growth. Confidence – Competence – Conviction - Authenticity  
  • Confidence comes with Plan, Preparation, Awareness, Communication, Preparedness
  • Listening helps to Collect data. Consulting = Minimal talking, Maximum Listening
  • Presentation - Purpose – Drivers – Case Studies
  • Open-ended questions let you know why. What – Gold, Why - Platinum

I would still rate Khan Academy >>> Byju's. Money is not the only factor of success. 



Ref - Link


Ref - Link
Template - Link

From AI / ML Standpoint

Conversations to Connect / Click 

When to sell? 
Spotting the right time/opportunity to start a conversation around AI / ML is the key. When there is a real need, it's easy to convince.

Internal Alignment 
Team alignment on Data, Analytics, AI/ML is key

AI / ML
  • Broader future context
  • Better customer retention / experience
  • Spot AI / ML opportunities from a revenue point of view
  • AI / ML opportunities in the broader context of competition 
  • Staying Competitive
Keep Thinking!!!!

September 12, 2022

Virtual Try on AR vs Vision


Paper #1 - Augmented Reality based Virtual Dressing Room using Unity3D

AR Advantages

  • ARKit recognizes and tracks a person’s movements using an iOS device’s rear camera.
  • A12 Bionic chip running iOS 13
  • 3D’s Human Body Tracking library
  • Model your mesh in a standard T-pose.
  • A 3D skeleton is generated that imitates human motion in real time
  • iOS mobile platform.


In a nutshell, an augmented reality virtual fitting room mobile app for iOS is being developed in conjunction with a human body recognition and motion tracking model. 

In your 3D-modeling software package (such as Maya, Cinema4D, or Modo), import the provided skeleton and the custom mesh model that you want to use with ARKit’s Motion Capture functionality

Your character should be modeled in a T-pose, your scene should contain only one bind pose, and the rotational values of each joint in your hierarchy should match the values in the provided example skeleton

ARKit’s body-tracking functionality requires models to be in a specific format


To superimpose the clothing over the user's body, we needed a 3D model of the garment, which we created using Blender

Demos - Unity Virtual Fitting Room Full Tutorial + Cloth | Unity, Realtime Tracking, Realsense, Kinect, etc

Face Tracking - Unity Documentation

Augmented Reality for Everyone - Full Course

GO VIRTUAL: NOW YOU CAN BE YOUR OWN STYLE AVATAR - Link

Dense Human Pose Estimation In The Wild - Link

Demo - Link


DensePose - Dense human pose estimation aims at mapping all human pixels of an RGB image to the 3D surface of the human body.

Deep Fashion3D: Dataset & Benchmark for Virtual Clothing Try-On and More

  • Deep Fashion3D contains 2,078 3D garment models reconstructed from real-world garments in 10 different clothing categories

Paper - Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images

  • We present Deep Fashion3D, a large-scale repository of 3D clothing models reconstructed from real garments



Sample reconstruction - Link

Paper - Body Capture and Marker-based Garment Reconstruction

Our goal is to generate a 3D model of a person wearing a garment, from multiview RGB videos

  • Garment Digitizing: Digitize the garment into a 3D flat mesh.
  • Marker Tracking: Track the markers and obtain their 3D locations.
  • Body Capture: Reconstruct a body model with accurate shape and pose.
  • Garment Reconstruction: Virtually wear the garment on the body

Paper - Image-based Dress-up System



  • Skeleton Setting - To establish the necessary correspondences between the model and garment images, we let the user manually select joint positions on the input image with simplified skeleton structures

Paper - Virtual Fitting Solution using 3D Human Modelling and Garments 

  • Combining multiple deep learning models to create a system that uses all of the models' inferences and produces a single output
  • Create a pipeline that integrates 2D-based virtual garment fitting solutions with 3D reconstruction networks, to visualize the virtual try-on results in 3D


  • Skeleton-based modelling identifies and analyses the X, Y coordinates of body joints
  • 3D posture estimation uses the X, Y, and Z coordinates of human body joints
  • OpenPose initially finds key-points that correspond to each person in the image
  • DensePose estimates 3D postures from a 2D image using a surface-based human model
  • DensePose is implemented using multiple neural networks that combine regression and classification tasks
  • DeepCut provides an approach for detecting and estimating the human body pose
  • Graphonomy uses graph transfer learning to generate universal human parsing across several human parsing tasks and to make better use of annotations
  • LIP_JPPNet - A deep learning model for body part segmentation and pose detection built using TensorFlow. This network is trained on the Look into People (LIP) dataset
  • CIHP_PGN - This neural network provides instance-level human parsing using a part grouping network.
  • Semantic part segmentation, Instance-aware edge detection, refinement, and Instance partition process






  • Pose Detection Component
  • OpenPose Network Architecture
  • Geometric Matching Module

More Git Solutions - Link

Fashion parsing models in TensorFlow

Module: MMM-WeatherDependentClothes

This MagicMirror Module displays Clothes depending on the weather forecast and your personal preferences. 

Paper - DEEP LEARNING MEETS FASHION - A LOOK INTO VIRTUAL TRY-ON SOLUTIONS




  • Multi-Garment Network’s dataset contains scans, SMPL registration, texture_maps, segmentation_maps, and multi-mesh registered garments 





Two base models are used: Multi-Garment Net [4] and Pix2Surf [2]. A third model is used implicitly by MGN and Pix2Surf as a black box: the 3D human body reconstruction model SMPL [18]. [2] and [4] use SMPL to create 3D garment templates and re-dress 3D avatars

Paper - Virtual Garment Imposition using ACGPN

ACGPN consists of three modules.

1. A Semantic Generation Module, which uses segmentation to map the human body to the target clothes.

2. A Clothes Warping Module, which warps the garment image to fit the deformed garment mask.

3. A Content Fusion Module, which fuses the outputs of the previous modules to generate the human body structure in the final combined result.

  • Semantic Generation Module (SGM)
  • Clothes Warping Module (CWM)
  • Content Fusion Module (CFM)

Keep Exploring!!!

September 11, 2022

Colab Pro+ Segmentation Experiments

The good thing is that it runs in the background: close the browser and log in again after a few hours

Cons -

  • Training did not complete as fast as I expected
  • Background execution terminates after 24 hours, so it does not support long training runs
  • Sometimes the console was busy and unresponsive

For $50 this is the cheapest option at the moment :)



Experiment #1 - 20K Training Images, 5K Test Images

  • GPU, High RAM
  • TPU, High RAM
  • Colab pro+
  • Batch size - 100

Failed 

Experiment #2 - 10K Training Images, 2K Test Images

  • GPU, High RAM
  • Colab pro+
  • Batch size - 75

Failed 

Experiment #3 - 10K Training Images, 2K Test Images

  • GPU, High RAM
  • Colab pro+
  • Batch size - 25

In Progress



Experiment #4 - 10K Training Images, 2K Test Images

  • Segmentation 512 x 512
  • Batch Size = 15
  • GPU High RAM


1-3 Epochs, Incremental Iterations


20 Epochs - seems OK, not so bad



It all seems to come down to balancing batch size, incremental training, and GPU vs TPU based on the problem statement
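
One thing that shifts with batch size is the number of iterations needed per epoch. A minimal sketch using the image counts and batch sizes from the experiments above; `steps_per_epoch` is a hypothetical helper, and it says nothing about why the first two runs failed.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Iterations needed to see every training image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


# (training images, batch size) taken from the four experiments above.
experiments = {
    "Experiment #1": (20_000, 100),
    "Experiment #2": (10_000, 75),
    "Experiment #3": (10_000, 25),
    "Experiment #4": (10_000, 15),
}

for name, (num_images, batch_size) in experiments.items():
    print(f"{name}: {steps_per_epoch(num_images, batch_size)} steps per epoch")
```

Smaller batches fit in memory more easily but mean more iterations per epoch, which is part of the balance described above.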

Segmentation on 512 x 512 seems to have better performance compared to segmentation on 224 x 224

Continue Experiments!!!

Infra Costs - Training Large Datasets - Deep Learning

Infra and Costs - Link


Insights - Link
  • Infra - GTX 1080 TI GPUs and cuDNN
  • Dataset - 220,000 carefully annotated hair images
Infra Providers - Cirrascale, Lambda

Training large models - Link
  • 34 days to train GPT-3 on 1,024x NVIDIA A100 GPUs.
  • With each A100 GPU priced at $9,900, we’re talking just over $10,000,000 to set up a cluster that large
  • You can rent A100 GPUs from public cloud providers like Google Cloud, but at $2.933908 per hour, that still adds up to $2,451,526.58 to run 1,024 A100 GPUs for 34 days
  • Each TITAN X, for example, costs roughly $3,000
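
The purchase and rental figures above can be reproduced with a few lines of arithmetic. This is only a sketch: the constants are the numbers quoted in the list (1,024 GPUs, $9,900 per A100, $2.933908 per GPU-hour, 34 days), and the function names are my own.

```python
NUM_GPUS = 1024
PRICE_PER_GPU_USD = 9_900           # quoted purchase price per A100
RENTAL_USD_PER_GPU_HOUR = 2.933908  # quoted cloud rental rate per A100
TRAINING_DAYS = 34                  # duration used in the rental figure


def purchase_cost(num_gpus: int, unit_price: float) -> float:
    """Up-front cost of buying the GPUs (hardware only)."""
    return num_gpus * unit_price


def rental_cost(num_gpus: int, hourly_rate: float, days: int) -> float:
    """Cost of renting every GPU around the clock for the given number of days."""
    return num_gpus * hourly_rate * 24 * days


buy = purchase_cost(NUM_GPUS, PRICE_PER_GPU_USD)
rent = rental_cost(NUM_GPUS, RENTAL_USD_PER_GPU_HOUR, TRAINING_DAYS)

print(f"Buying:  ${buy:,.2f}")
print(f"Renting: ${rent:,.2f}")
```

Neither figure includes networking, storage, power, or failed runs, so both are lower bounds.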
Keep Exploring!!!

#Life as a #DeepNetwork

At different stages we need to balance the #weights: #education, #opportunities, #focus and #consistency

The different outputs we need are #Money, #Health, #Family, #Relationship, Security

Similar to #backprop, as long as we keep adjusting the weights we can get an optimal output


The deeper the layers and the greater the focus, the higher the success :)

Keep Exploring!!!

September 10, 2022

Weekend Opinions

On and off, you have to distract yourself when you have problems balancing development vs sales vs customer expectations.

One interesting link

The difference between a manager and an expert is a deep understanding of algos. Sometimes I feel like an expert, sometimes a consultant, sometimes a manager. The roles and needs keep rotating. I feel comfortable switching between Database + Vision without being a full-stack developer.

Another read that stuck in my mind is link

Concepts need to start with

  • Analogy
  • Purpose
  • Relatable terms
  • Mathematical Explanation
  • Working examples

We can't learn everything with just maths, formulas, or package and function names. What is missed is blending and simplifying it all. Yes, it is an art to explain in a relatable way. This is the reason we have to read through so many blogs to find one good read :)

Keep Questioning!!!

The Dangers of Digital World

  • Food quality in reality vs Beautiful pics on the menu
  • Dark stores cannot work everywhere. The trend now is: affordable location = more dark stores; lower income group = buy from Kirana stores
  • Preserved food / Packed food delivered in 10 mins
  • Fake Electronic items / Discounted Duplicate Items
  • There is significantly less responsibility on aggregators with regard to product quality
  • One restaurant registered hundreds of fake listings
  • The middlemen/source can manipulate/amplify prices
  • Education Mafia - Everything is freely available on Khan Academy / YouTube, but we pay because we see some actor endorsing it

Endless Free Internet - Low-cost internet has taken away focus and discipline

More Internet = Less Sleep = Less Focus = More Frustration = Depression

Kids today are exposed to all forms of digital and drug addiction. We pretend to be unaware, but it costs everything!!!

Keep Thinking!!!

September 05, 2022

Cheatsheets = Quick Learning ?


I love #cheatsheets. Every time I see them, I memorize them as one-liners, algos, and definitions. But every time I try something out, I learn from the #errors I get and from the #visualization I see for topics. CNN visualizers, TensorFlow Playground, and blogs with different technical perspectives help me realize, 'Oh, I didn't understand it in this context.'

As with backpropagation, if we go only by cheatsheets we end up with an #overfitting model that can answer the #first question but fails to generalize to unseen patterns. I take time to read, code, try, connect, unlearn, and relearn. It takes time, more epochs, and backprop to tune the errors. Deep learning or shallow understanding?

Choose wisely!!!


September 03, 2022

Conditional Random Fields - NER Notes

  • NER - A named entity is a “real-world object” that’s assigned a name – for example, a person, a country, a product, or a book title. 
  • CRF can take context into account.
  • In a linear-chain CRF, each prediction depends only on its immediate neighbors.
  • A CRF is trained by fitting model parameters to predict the conditional probability of the label sequence Y given the input
  • A CRF builds transition scores that account for the likelihood of observing each transition between labels in the sequence
  • CRF is a discriminative approach; it learns both likely and unlikely transitions
  • A discriminative model models the decision boundary between the classes
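
To see how transition scores between neighboring labels change a prediction, here is a minimal sketch of Viterbi decoding over a linear chain. It is not a trained CRF: the emission and transition scores below are made-up numbers (a real CRF learns them from feature functions), and `viterbi` is a hypothetical helper written for this note.

```python
from typing import Dict, List


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the highest-scoring label sequence for per-token emission scores."""
    labels = list(emissions[0])
    # best[i][label] = best score of any label path ending in `label` at token i
    best = [dict(emissions[0])]
    back: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        row, pointers = {}, {}
        for curr in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][curr])
            row[curr] = best[-1][prev] + transitions[prev][curr] + scores[curr]
            pointers[curr] = prev
        best.append(row)
        back.append(pointers)
    # Walk the back-pointers from the best final label to recover the path.
    path = [max(labels, key=lambda label: best[-1][label])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


tokens = ["Linda", "Wilkinson", "spoke"]
# Hypothetical per-token scores; on its own, "Wilkinson" looks most like B-PER.
emissions = [
    {"B-PER": 2.0, "I-PER": 0.5, "O": 0.1},
    {"B-PER": 1.0, "I-PER": 0.9, "O": 0.2},
    {"B-PER": 0.0, "I-PER": 0.0, "O": 2.0},
]
# Hypothetical transition scores: B-PER -> I-PER is likely, O -> I-PER is not.
transitions = {
    "B-PER": {"B-PER": -1.0, "I-PER": 1.5, "O": 0.0},
    "I-PER": {"B-PER": -1.0, "I-PER": 0.5, "O": 0.5},
    "O": {"B-PER": 0.5, "I-PER": -5.0, "O": 0.5},
}

labels = viterbi(emissions, transitions)
print(list(zip(tokens, labels)))
```

With these numbers the second token flips from B-PER to I-PER once the transition scores are taken into account, which is the context effect described in the notes above.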

Ref - Link

Feature Functions - Notes

Ref - Link


Ref - Link


Ref - Link

NER Approaches


Keep Exploring!!!

September 01, 2022

NLTK Basics

NLTK (Natural Language Toolkit) ships with a built-in list of English stop words (well over 100 entries in recent releases), including: “a”, “an”, “the”, “of”, “in”, etc. (List Link)

  • Term Frequency: Number of times a word appears in a document divided by the number of words in the document.
  • Document Frequency: Number of documents in the corpus in which a word appears
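
A minimal sketch of the two definitions above in plain Python, without NLTK. The tiny corpus and the helper names `term_frequency` and `document_frequency` are made up for illustration.

```python
from collections import Counter
from typing import List


def term_frequency(term: str, document: List[str]) -> float:
    """Occurrences of `term` in the document divided by the document length."""
    return Counter(document)[term] / len(document)


def document_frequency(term: str, corpus: List[List[str]]) -> int:
    """Number of documents in the corpus that contain `term` at least once."""
    return sum(1 for document in corpus if term in document)


# Made-up corpus, already tokenized by whitespace.
corpus = [
    "the food was good".split(),
    "the food was bad".split(),
    "the service was good and the staff was kind".split(),
]

tf_the = term_frequency("the", corpus[2])
df_good = document_frequency("good", corpus)

print(f"TF of 'the' in document 3: {tf_the:.3f}")
print(f"DF of 'good' across the corpus: {df_good}")
```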
BOW (Bag of Words)
  • BOW - Vector representing sentence
  • BOW does not preserve order
  • BOW fails when word order carries the meaning, e.g.
    • Food was good, not bad at all
    • Food was bad, not good at all
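
The failure case above is easy to check. In this minimal sketch the bag of words is just a word-count `Counter` over a simple regex tokenizer; `bag_of_words` is a hypothetical helper, not an NLTK function.

```python
import re
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Lowercase the sentence, split it into words, and count each word."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


positive = bag_of_words("Food was good, not bad at all")
negative = bag_of_words("Food was bad, not good at all")

# Both sentences contain exactly the same words, only in a different order.
same_representation = positive == negative
print(f"Identical bag-of-words vectors: {same_representation}")
```

The two sentences carry opposite sentiment but map to the same counts, so any model fed only these counts cannot tell them apart.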
spaCy - the name is spelled with a lowercase "s" and a capital "C"

One-hot encodings vs Word embeddings
  • One-hot encodings - To represent each word, create a zero vector with length equal to the vocabulary size, then place a one at the index that corresponds to the word.
  • One-hot encodings - The relationship between words is not captured
  • Word embeddings - An embedding is a dense vector of floating point values. Words with similar meanings have similar vectors
  • The basic idea for training is that words occurring in similar contexts have similar meanings.
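
A minimal sketch of the contrast above, using cosine similarity. The three-word vocabulary and the 2-dimensional embedding values are made up for illustration; real embeddings are learned and have many more dimensions.

```python
import math
from typing import List


def one_hot(word: str, vocab: List[str]) -> List[float]:
    """Zero vector of vocabulary length with a one at the word's index."""
    vector = [0.0] * len(vocab)
    vector[vocab.index(word)] = 1.0
    return vector


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


vocab = ["good", "great", "bad"]

# One-hot: every pair of different words is orthogonal, so similarity is 0.
one_hot_similarity = cosine(one_hot("good", vocab), one_hot("great", vocab))

# Hypothetical dense embeddings, hand-picked so that similar words sit close together.
embeddings = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [-0.7, 0.3],
}
good_vs_great = cosine(embeddings["good"], embeddings["great"])
good_vs_bad = cosine(embeddings["good"], embeddings["bad"])

print(f"one-hot good vs great:   {one_hot_similarity:.2f}")
print(f"embedding good vs great: {good_vs_great:.2f}")
print(f"embedding good vs bad:   {good_vs_bad:.2f}")
```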

Ref - Link

Keep Exploring!!!