Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): September 2022

September 26, 2022

Vision Solutions - Interesting Ideas / Products / Observations / Startups

Recycle Segregation

The AMP robotic recycling system powered by an AI algorithm recognized and sorted 50 billion recycling objects, recovering all recyclables from a mixed-material stream at an accuracy of around 100 percent based entirely on image analysis.#computervision #datalabeling pic.twitter.com/pAKExe5j0K
— Keymakr (@keymakr_com) September 21, 2022

Repairs Monitoring with Vision

How #AI & #ComputerVision can be used to supervise the quality of human tasks#FutureOfWork #ArtificialIntelligence #IoT @JoannMoretti @Hana_ElSayyed @JolaBurnett @Shi4Tech @AkwyZ @CurieuxExplorer @enilev @anand_narang @labordeolivier @mvollmer1 @Fabriziobustama @PawlowskiMario pic.twitter.com/sYXId2Bb1U
— Franco Ronconi 🇮🇹 (@FrRonconi) September 14, 2022

Number of Bolts repaired
Duration of Repair

Monitoring and Surveillance

#ComputerVision is redefining surveillance
by @Seeker #AI #Drones #SmartCities #Privacy #Safety

cc: @frronconi pic.twitter.com/IuaG4hwUK4
— Ronald van Loon (@Ronald_vanLoon) September 9, 2022

Arms and Guns Detection

Is this a good approach?

US startup ZeroEyes has developed an AI-based computer vision system that detects and mitigates active shooters at schools.

Deployed at multiple sites.

More info on Superinnovators: https://t.co/TGwD19BMz6 #computervision #schoolshootings #innovation pic.twitter.com/fJYAhvCuxc
— Charles Carter (@cctech100) September 20, 2022

Animals counting

Using AI to count Sheep 🐑🐑

Now livestock farmers and producers can count livestock with the help of #computervision and #AI.

Source: Plainsight#technology #innovation #artificialintelligence pic.twitter.com/cggEIqexoE
— cimplify.ai (@cimplifyai) September 23, 2022

I love the region of interest

Segmentation
Contours
Counting
Line of Separation
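
The "Line of Separation" step can be sketched as a simple line-crossing counter. This is a minimal sketch, assuming the segmentation and contour steps have already produced a sequence of centroid y-coordinates per tracked animal. The `tracks` data and the `count_line_crossings` name are hypothetical and are not taken from the Plainsight product.

```python
from typing import Dict, List


def count_line_crossings(tracks: Dict[int, List[float]], line_y: float) -> int:
    """Count tracks whose centroid moves from above the line to on or below it.

    tracks maps a (hypothetical) track id to the centroid y-coordinate seen in
    consecutive frames. Each track is counted at most once.
    """
    count = 0
    for ys in tracks.values():
        for prev_y, curr_y in zip(ys, ys[1:]):
            if prev_y < line_y <= curr_y:
                count += 1
                break  # count each animal only once, even if it jitters back
    return count


# Made-up centroid paths for three tracked animals (image y grows downward).
tracks = {
    1: [10.0, 40.0, 70.0, 120.0],   # crosses the line at y=100
    2: [20.0, 50.0, 80.0],          # never reaches the line
    3: [90.0, 105.0, 95.0, 110.0],  # crosses, jitters back, crosses again
}
crossed = count_line_crossings(tracks, line_y=100.0)
print(crossed)
```

Counting each track once is a design choice that keeps jitter around the line from inflating the total.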

Market Cap and Growth

Deep Learning Market is Anticipated to Reach US$ 31.3 Billion by 2027 Registering a CAGR of 25.8%. #artificialintelligence #deeplearning #machinelearning #insight #data #tech #technology #innovation #computerscience #computervision #datascience #engineering #developers pic.twitter.com/HKfE20X1pk
— Welcome.AI (@welcomeai) August 31, 2022

Vision Startups

Cool Computer Vision Startups in 2022 https://t.co/H8GiWEEI6F via @MarkTechPost #ArtificialIntelligence #computervision #100DaysOfCode pic.twitter.com/H4YaQ2pxou
— MARKTECHPOST.COM (@Marktechpost) September 14, 2022

Keep Exploring!!!!

September 25, 2022

Document Q&A

From OCR to document extraction and understanding, Hugging Face has come a long way :)

The DocQnA pipeline is very impressive

Results

Keep Exploring!!!

TesserOCR
MMOCR
OCRmypdf
EasyOCR
PaddleOCR
Kraken
OCRopus
PyOCR
Tesseract

OCR, Fintech and Usecase

Keep Learning!!!

September 21, 2022

Good Utils - Json Analysis, PDF Parsing, OCR

Visualize JSON and responses - jsoncrack.com
Analyze PDF documents - pdfplumber
Parsr - Turn your documents into data!

Hope to leverage them in NLP work

Text and Table Extraction

Test Detectron2 , DiT, Layout Parser for Document Images

DiT: Self-Supervised Pre-Training for Document Image Transformer

Invoice Processing with Azure OCR and GPT-4: An In Depth Step-by-Step Guide

Extracting Data from Charts and Graphs: The OCR Challenge Solution

Keep Exploring!!!

September 18, 2022

At least every four weeks, and personally every week, we need to observe, learn, and collect notes on the domain, data, AI / ML, and competing products and offerings, so that we keep the big picture, stay proactive, and stay ahead of the learning curve.

Did I clear my technical debt?
Did we brainstorm/experiment with new ideas?
Did I learn something new?
Did I contribute to any POV?
What was my new technical learning?
Did I improve my deliverables by at least 5%?
What areas can I improve personally?
What bugs/issues could we have avoided?
Did I get the help I needed for code review/discussion?
What expectations do I need to reset to align/be better in this sprint?
Beyond daily updates: are we slow or fast, and did we improve execution time across iterations?
Did I reuse old backlogs or get new perspectives on them?

Some Red Flags

Lots of planning vs Very little experimentation
Experimentation vs Accuracy vs Scale
Everyday status vs Slow progress on Minor issues
Repeating same mistakes
Reactive and not Proactive
Plan ahead vs Last minute corrections
Missing Self initiative to pick beyond tasks

It is not about competing with someone else. It is about whether I am better than my previous version.

Ref - Link

Ref Link

Keep Thinking!!!

September 15, 2022

Interesting Read - Startup Ideas Evaluation

These observations reflect products built without understanding the market, tech, users, and future aspects. The 360-degree analysis seems to be missing: "I do it for the tech, I do it because I like to do it." I have gone through the same bias myself.

Ref - Link

What you do, Why you do
Markets / Current Landscape
What do you do differently
How much are you challenging current players

Being a consultant who can see both sides is important

Ability to influence determines growth. Confidence – Competence – Conviction - Authenticity
Confidence comes with Plan, Preparation, Awareness, Communication, Preparedness
Listening helps to Collect data. Consulting = Minimal talking, Maximum Listening
Presentation - Purpose – Drivers – Case Studies
Open-ended questions let you know why. What – Gold, Why - Platinum

I would still rate Khan Academy >>> Byju's. Money is not the only factor of success.

Ref - Link

Template - Link

From AI / ML Standpoint

Conversations to Connect / Click

When to sell?

Spotting the right time/opportunity to start a conversation around AI / ML is the key. When there is a need, it's easy to convince.

Internal Alignment

Team alignment on Data, Analytics, AI/ML is key

AI / ML

Broader future context
Better customer retention / experience
Spot AI / ML opportunities from a revenue point of view
AI / ML opportunities in the broader context of competition
Staying Competitive

Keep Thinking!!!!

September 12, 2022

Virtual Try on AR vs Vision

Paper #1 - Augmented Reality based Virtual Dressing Room using Unity3D

AR Advantages

ARKit recognizes and tracks a person’s movements using an iOS device’s rear camera.
A12 Bionic chip running iOS 13
3D’s Human Body Tracking library
Model your mesh in a standard T-pose.
A 3D skeleton is generated that imitates human motion in real time
iOS mobile platform.

In a nutshell, an augmented reality virtual fitting room mobile app for iOS is being developed in conjunction with a human body recognition and motion tracking model.

In your 3D-modeling software package (such as Maya, Cinema4D, or Modo), import the provided skeleton and the custom mesh model that you want to use with ARKit’s Motion Capture functionality

Your character should be modeled in a T-pose, your scene should contain only one bind pose, and the rotational values of each joint in your hierarchy should match the values in the provided example skeleton

ARKit’s body-tracking functionality requires models to be in a specific format

To superimpose the clothing over the user's body, we needed a 3D model of the garment, which we created using Blender

Demos - Unity Virtual Fitting Room Full Tutorial + Cloth | Unity, Realtime Tracking, Realsense, Kinect, etc

Face Tracking - Unity Documentation

Augmented Reality for Everyone - Full Course

GO VIRTUAL: NOW YOU CAN BE YOUR OWN STYLE AVATAR - Link

Dense Human Pose Estimation In The Wild - Link

Demo - Link

DensePose - Dense human pose estimation aims at mapping all human pixels of an RGB image to the 3D surface of the human body.

Deep Fashion3D: Dataset & Benchmark for Virtual Clothing Try-On and More

Deep Fashion3D contains 2,078 3D garment models reconstructed from real-world garments in 10 different clothing categories

Paper - Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images

We present Deep Fashion3D, a large-scale repository of 3D clothing models reconstructed from real garments

Sample reconstruction - Link

Paper - Body Capture and Marker-based Garment Reconstruction

Our goal is to generate a 3D model of a person wearing a garment, from multiview RGB videos

Garment Digitizing: Digitize the garment into a 3D flat mesh.
Marker Tracking: Track the markers and obtain their 3D locations.
Body Capture: Reconstruct a body model with accurate shape and pose.
Garment Reconstruction: Virtually wear the garment on the body

Paper - Image-based Dress-up System

Skeleton Setting - To establish the necessary correspondences between the model and garment images, we let the user manually select joint positions on the input image with simplified skeleton structures

Paper - Virtual Fitting Solution using 3D Human Modelling and Garments

Combining multiple deep learning models to create a system that uses all of the models' inferences and produces a single output
Create a pipeline for integrating 2D based virtual garment fitting solutions in conjunction with 3D reconstruction networks, to visualize the virtual tryon results in 3D

Skeleton-based modelling identifies and analyzes the X, Y coordinates of human body joints
3D posture estimation uses the X, Y, and Z coordinates of human body joints
OpenPose initially finds key-points that correspond to each person in the image
DensePose to estimate 3D postures from a 2D image on a surface-based human model
Densepose is implemented using multiple combinations of neural networks that combine the regression and classification tasks
DeepCut provides an approach for detecting and estimating the human body pose
Graphonomy uses graph transfer learning to generate universal human parsing for several human parsing tasks and using annotations in a better way
LIP_JPPNet is a deep learning model for body part segmentation and pose detection built using TensorFlow. It is trained on the Look into Person (LIP) dataset
CIHP_PGN is a neural network that provides instance-level human parsing using a Part Grouping Network.
Semantic part segmentation, Instance-aware edge detection, refinement, and Instance partition process

Pose Detection Component
OpenPose Network Architecture
Geometric Matching Module

September 11, 2022

Colab Pro+ Segmentation Experiments

The good thing is that it runs in the background: you can close the browser and log back in after a few hours

Cons -

Training did not complete as fast as I expected
Background execution terminates after 24 hours, so it does not support long training runs
Sometimes the console was busy and unresponsive

At $50, this is the cheapest option at the moment :)

Experiment #1 - 20K Training Images, 5K Test Images

GPU, High RAM
TPU, High RAM
Colab pro+
Batch size - 100

Failed

Experiment #2 - 10K Training Images, 2K Test Images

GPU, High RAM
Colab pro+
Batch size - 75

Failed

Experiment #3 - 10K Training Images, 2K Test Images

GPU, High RAM
Colab pro+
Batch size - 25

In Progress

Experiment #4 - 10K Training Images, 2K Test Images

Segmentation 512 x 512
Batch Size = 15
GPU High RAM

1- 3 Epochs, Incremental Iterations

20 Epochs Seems ok, Not so bad

It all seems to come down to balancing batch size, incremental training, and GPU vs TPU for the problem statement

Segmentation on 512 x 512 seems to have better performance compared to segmentation on 224 x 224

Continue Experiments!!!

Infra Costs - Training Large Datasets - Deep Learning

Infra and Costs - Link

Insights - Link

Infra - GTX 1080 TI GPUs and cuDNN
Dataset - 220,000 carefully annotated hair images

Infra Providers - Cirrascale, Lambda

Training large models - Link

34 days to train GPT-3 on 1,024 NVIDIA A100 GPUs.
With each A100 GPU priced at $9,900, we’re talking almost $10,000,000 to set up a cluster that large
You can rent A100 GPUs from public cloud providers like Google Cloud, but at $2.933908 per hour, that still adds up to $2,451,526.58 to run 1,024 A100 GPUs for 34 days
Each TITAN X, for example, costs roughly $3,000
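
The purchase and rental figures above can be reproduced with a few lines of arithmetic. This is a small sketch that uses only the numbers quoted in the notes (1,024 GPUs, $9,900 per GPU, $2.933908 per GPU-hour, 34 days). The variable names are my own.

```python
# Figures quoted in the notes above.
NUM_GPUS = 1024
PRICE_PER_GPU_USD = 9_900
RENTAL_USD_PER_GPU_HOUR = 2.933908
TRAINING_DAYS = 34

# Buying: one-off hardware cost for the whole cluster.
purchase_cost = NUM_GPUS * PRICE_PER_GPU_USD

# Renting: every GPU is billed for every hour of the training run.
rental_cost = NUM_GPUS * RENTAL_USD_PER_GPU_HOUR * 24 * TRAINING_DAYS

print(f"Buying the cluster: ${purchase_cost:,.0f}")
print(f"Renting for {TRAINING_DAYS} days: ${rental_cost:,.2f}")
```

The purchase cost works out to $10,137,600 for the GPUs alone, and the rental total matches the $2,451,526.58 quoted above.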

Keep Exploring!!!

#Life as a #DeepNetwork

At different stages we need to balance the #weights: #education, #opportunities, #focus and #consistency

The different outputs we need are #Money, #Health, #Family, #Relationship, Security

Similar to #backprop, as long as we keep adjusting the weights, we can get the optimal output

The deeper the layers and the stronger the focus, the higher the success :)

Keep Exploring!!!

September 10, 2022

Weekend Opinions

On and off, you have to distract yourself when you are juggling problems across development, sales, and customer expectations.

One interesting link

The difference between a manager and an expert is a deep understanding of algorithms. Sometimes I feel like an expert, sometimes a consultant, sometimes a manager. The roles and needs keep rotating. I feel comfortable switching between database and vision work without being a full-stack developer.

Another read that stuck in my mind is link

Concepts need to start with

Analogy
Purpose
Relatable terms
Mathematical Explanation
Working examples

We can't learn everything with just maths, formulas, or package and function names. What is missing is blending these and simplifying them. Yes, it is an art to explain in a relatable way. This is why we have to go through so many blogs to find one good read :)

Keep Questioning!!!

The Dangers of Digital World

Food quality in reality vs Beautiful pics on the menu
Dark stores cannot work everywhere. The trend now is: affordable location = more dark stores, lower income group = buy from Kirana stores
Preserved food / Packed food delivered in 10 mins
Fake Electronic items / Discounted Duplicate Items
There is significantly less responsibility on aggregators with regard to product quality
One restaurant registered hundreds of fake listings
The middlemen/source can manipulate/amplify prices
Education Mafia - Everything is available freely - Khan Academy / Youtube but we pay because we see some actor endorsing it

Endless Free Internet - Low-cost internet has taken away focus and discipline

More Internet = Less Sleep = Less Focus = More Frustration = Depression

Kids today are exposed to all forms of digital and drug addiction. We pretend to be unaware, but it costs everything!!!

Keep Thinking!!!

September 05, 2022

Cheatsheets = Quick Learning ?

I love #cheatsheets. Every time I see them, I memorize them as one-liners, algos, and definitions. But every time I try something out, I learn from the #errors I get and from the #visualization I see for a topic. CNN visualizers, TensorFlow Playground, and blogs with different technical perspectives help me realize, 'Oh, I didn't understand it in this context.'

In deep learning terms, going only by cheatsheets gives an #overfitting model: you can answer the #first question but fail to generalize to unseen patterns. I take time to read, code, try, connect, unlearn, and relearn. It takes time, more epochs, and backprop to tune out the errors. Deep Learning or Shallow understanding,

Choose wisely!!!

September 03, 2022

Conditional Random Fields - NER Notes

NER - A named entity is a “real-world object” that’s assigned a name – for example, a person, a country, a product, or a book title.
A CRF can take context into account.
In a linear-chain CRF, each prediction depends directly only on its immediate neighbors.
A CRF models the conditional probability of the label sequence Y given the input, and training fits the model parameters to it
A CRF learns transition scores that account for the likelihood of each transition between labels in the sequence
A CRF is a discriminative approach. It learns both likely and unlikely transitions
A discriminative model models the decision boundary between the classes
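
To make the role of transitions concrete, here is a minimal sketch of the decoding step of a linear-chain CRF (Viterbi), not of training. All scores below are made-up numbers rather than learned parameters, and the label set and the `viterbi` helper are illustrative assumptions.

```python
from itertools import product
from typing import Dict, List, Tuple

LABELS = ["O", "B-PER", "I-PER"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the label sequence with the highest total score."""
    # best[i][label] = (score of the best path ending in label at i, previous label)
    best = [{lab: (emissions[0][lab], None) for lab in LABELS}]
    for em in emissions[1:]:
        column = {}
        for curr in LABELS:
            prev, score = max(
                ((p, best[-1][p][0] + transitions[(p, curr)]) for p in LABELS),
                key=lambda item: item[1],
            )
            column[curr] = (score + em[curr], prev)
        best.append(column)
    # Backtrack from the best final label.
    last = max(LABELS, key=lambda lab: best[-1][lab][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return path[::-1]


# Hypothetical transition scores: neutral by default, with a few exceptions.
transitions = {pair: 0.0 for pair in product(LABELS, repeat=2)}
transitions[("O", "I-PER")] = -5.0      # unlikely: I-PER should not follow O
transitions[("B-PER", "I-PER")] = 1.5   # likely: a name continues
transitions[("I-PER", "I-PER")] = 0.5
transitions[("B-PER", "B-PER")] = -1.0

# Hypothetical per-token scores for the tokens "John", "Smith", "runs".
emissions = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.5},
    {"O": 1.0, "B-PER": 0.8, "I-PER": 0.9},  # ambiguous on its own
    {"O": 2.0, "B-PER": 0.0, "I-PER": 0.0},
]

greedy = [max(em, key=em.get) for em in emissions]  # ignores neighbors
decoded = viterbi(emissions, transitions)           # uses transitions
print(greedy)
print(decoded)
```

With these numbers, the per-token choice labels "Smith" as O, while the transition scores pull it to I-PER after B-PER. That is the sense in which the model takes context into account.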

Ref - Link

Feature Functions - Notes

Ref - Link

NER Approaches

Keep Exploring!!!

September 01, 2022

NLTK Basics

NLTK (Natural Language Toolkit) provides an English stop word list through its stopwords corpus, with well over 100 entries in recent releases, including: “a”, “an”, “the”, “of”, “in”, etc. (List Link)

Term Frequency: Number of times a word appears in a document/number of words in the document.
Document Frequency: Number of documents in the corpus that contain the word
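
The two definitions above can be checked on a toy corpus. This is a minimal sketch in plain Python. The three example documents and the helper names are made up for illustration.

```python
from collections import Counter
from typing import List


def term_frequency(term: str, doc: List[str]) -> float:
    """Occurrences of term in the document divided by the document length."""
    return Counter(doc)[term] / len(doc)


def document_frequency(term: str, docs: List[List[str]]) -> int:
    """Number of documents that contain the term at least once."""
    return sum(1 for doc in docs if term in doc)


docs = [
    "the food was good".split(),
    "the service was slow".split(),
    "good food good mood".split(),
]

tf_good = term_frequency("good", docs[2])   # 2 of the 4 words
df_good = document_frequency("good", docs)  # appears in 2 of the 3 documents
print(tf_good, df_good)
```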

BOW (Bag of Words)

BOW - Vector representing sentence
BOW does not preserve order
BOW Fails when

Food was good, not bad at all
Food was bad, not good at all
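
The failure case above is easy to reproduce: both sentences contain exactly the same words, so their bags are identical. A small sketch using only the standard library follows. The lowercase, letters-only tokenizer is an assumption made for brevity.

```python
import re
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Lowercase the sentence, keep alphabetic tokens, and count them."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


a = bag_of_words("Food was good, not bad at all")
b = bag_of_words("Food was bad, not good at all")

same = a == b  # word order is lost, so the two bags cannot be told apart
print(same)
```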

spaCy Full name is (spelled correctly)

One-hot encodings vs Word embeddings

One-hot encodings - To represent each word, create a zero vector with length equal to the vocabulary size, then place a one at the index that corresponds to the word.
One-hot encoding does not capture relationships between words
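
A minimal sketch of the construction described above, using a made-up four-word vocabulary. The dot product at the end illustrates why no relationship is captured: any two different words are orthogonal.

```python
from typing import Dict, List


def one_hot(word: str, vocab_index: Dict[str, int]) -> List[int]:
    """Zero vector of vocabulary length with a single 1 at the word's index."""
    vec = [0] * len(vocab_index)
    vec[vocab_index[word]] = 1
    return vec


vocab = ["bad", "food", "good", "great"]  # hypothetical vocabulary
vocab_index = {word: i for i, word in enumerate(vocab)}

good = one_hot("good", vocab_index)
great = one_hot("great", vocab_index)

# "good" and "great" are close in meaning, yet their vectors share nothing.
dot = sum(x * y for x, y in zip(good, great))
print(good, great, dot)
```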

Word embeddings - An embedding is a dense vector of floating point values. Words with similar meanings have similar vectors
The basic idea for training is that words occurring in similar contexts have similar meanings.

Ref - Link

Keep Exploring!!!
