"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

February 28, 2024

Video Summarization

Learning to Summarize Videos by Contrasting Clips

  1. Feature Extractor
  2. Score Predictor
  3. Summary Extractor
  4. Highlight detection as a special case of the summarization task

Video Summarization: Towards Entity-Aware Captions - Summarizing video content into a natural language description

Video Summarization Using Deep Neural Networks: A Survey

Option #1

  • Feature Extractor
  • Score Predictor
  • Summary Extractor
  • Highlight detection as a special case of the summarization task
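
A minimal sketch of the last two stages, assuming a score predictor has already assigned an importance score to each segment. The segment tuples, the `budget_sec` parameter, and the greedy selection rule are illustrative assumptions, not the method from the paper.

```python
from typing import List, Tuple

# Each segment is (start_sec, end_sec, importance_score). The scores would come
# from a score predictor, which is assumed here and not implemented.
Segment = Tuple[float, float, float]


def extract_summary(segments: List[Segment], budget_sec: float) -> List[Segment]:
    """Greedily keep the highest-scoring segments that fit in the time budget."""
    chosen: List[Segment] = []
    used = 0.0
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        length = seg[1] - seg[0]
        if used + length <= budget_sec:
            chosen.append(seg)
            used += length
    # Return the kept segments in playback order.
    return sorted(chosen, key=lambda s: s[0])


def top_highlight(segments: List[Segment]) -> Segment:
    """Highlight detection as the special case: keep only the best segment."""
    return max(segments, key=lambda s: s[2])


segments = [(0, 10, 0.2), (10, 20, 0.9), (20, 35, 0.6), (35, 40, 0.8)]
summary = extract_summary(segments, budget_sec=20)
```

With a 20-second budget, the two highest-scoring segments fit and the 15-second one is skipped.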

Option #2

  • Frame 1 - Feature Vector 1
  • Frame 2 - Feature Vector 2
  • Frame 3 - Feature Vector 3
  • Feature vector score comparison to pick / unpick
  • Object score comparison to pick / unpick
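
Option #2 can be sketched as follows, assuming each frame already has a feature vector. Cosine similarity and the 0.9 threshold are illustrative choices of mine; any distance measure and cut-off could be swapped in.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_keyframes(features: List[Sequence[float]], threshold: float = 0.9) -> List[int]:
    """Pick a frame when it differs enough from the last picked frame."""
    picked: List[int] = []
    for i, vec in enumerate(features):
        # The first frame is always picked; later frames are compared
        # against the most recently picked one.
        if not picked or cosine_similarity(features[picked[-1]], vec) < threshold:
            picked.append(i)
    return picked


# Frame 2 is almost identical to frame 1, frame 3 is very different.
frames = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
keyframes = pick_keyframes(frames)
```

The same pick / unpick rule would apply to object scores by replacing the feature vectors with per-frame object detections.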

Other Techniques

  • Hashing based
  • Clustering based
  • Feature based
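
As one possible reading of the hashing-based technique, here is a small sketch that uses an average hash to drop near-duplicate frames. It assumes frames are already downscaled to tiny grayscale grids; the `max_distance` value is a hypothetical setting.

```python
from typing import List

Frame = List[List[int]]  # grayscale pixel rows, values 0-255


def average_hash(frame: Frame) -> int:
    """Set one bit per pixel: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def drop_near_duplicates(frames: List[Frame], max_distance: int = 1) -> List[int]:
    """Keep a frame only if its hash is far from every kept frame's hash."""
    kept: List[int] = []
    hashes: List[int] = []
    for i, frame in enumerate(frames):
        h = average_hash(frame)
        if all(hamming(h, k) > max_distance for k in hashes):
            kept.append(i)
            hashes.append(h)
    return kept


frames = [
    [[10, 200], [10, 200]],   # dark left, bright right
    [[12, 190], [15, 210]],   # same layout, slightly different values
    [[200, 10], [200, 10]],   # bright left, dark right
]
kept = drop_near_duplicates(frames)
```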

Keep Exploring!!!



February 27, 2024

Which #GenAI use case will succeed ?

In 2006, I was part of a Reverse Logistics team at Microsoft working on the launch of a new product, the Zune, a competitor to the iPod. The Zune was somewhat boxy and heavy. At first glance, it was clear that the iPod and Zune were worlds apart. We worked to tight deadlines to set up the supply chain and track every aspect of it, yet, as we all know, the Zune's life ended by 2012.

Choosing the right use case is crucial; Zune vs. iPod is a classic example. It's not only about applying AI but also about selecting the right use cases that bring tangible benefits and align with our business strategy.

Courses may teach us the fundamentals, like LEGO blocks, but the same challenges in supply chain visibility and customer experience persist today. Choose the right use case, and invest time in understanding your data and domain. Of the several AI use cases applicable in your domain, focus on those relevant to your business needs and data, so that they deliver real value.

Copycat use cases will not work unless they are relevant and meaningful for your business.

Stay observant of industry trends, and align your AI initiatives with your strategic goals. 

#AI #Strategy #Innovation #TechHistory #Business #Data #ProductDevelopment #Microsoft #Zune #iPod #DataScience #GenAI

Keep Exploring!!!

February 16, 2024

AI News, Tools, Observations

A good discussion on observations and experiments with #CreativeAI. Here is a summary, with my perspectives added.

1. AI-generated visuals: frustration over inconsistent output in production, despite enjoying the creative collaboration.

2. #CreativeExploration: embracing the unpredictability of current ideation tools to broaden horizons and seek inspiration.

3. Lack of emotional connection: you don't draw anymore.

4. Things are advancing rapidly, reminiscent of the text AI scene pre-#ChatGPT. Progress is rapid, yet more work is needed.

5. The list of #Startups in the creative image space is never-ending: Midjourney, RunwayML, alpacaml, clad.ai, dreamlook.ai, photoroom, neurallove, letsenhance, topazlabs.

6. The latest, as of this morning, is #OpenAI Sora, capable of generating a minute of high-fidelity video. The space is currently inspiration-oriented; it seems #OpenAI will consolidate it with #DALLE5/6 :). Sometimes we need to wait patiently to pick a winner and move forward. #GenAI #Startups #CreativeAI #DigitalTransformation #generativeai #midjourney #dalle3 #imagegeneration

AI is now a CIO boardroom Topic


Keep Exploring!!!

February 13, 2024

LLM Guardrail Notes

Existing Implementation Solutions

  • Llama Guard
  • Nvidia NeMo
  • Guardrails AI

Measure / Validation for 

  • Free from Unintended Response
  • Fairness
  • Privacy
  • Hallucination

Ref - Link1

NeMo Guardrails contains two key components:

  • Input moderation, also referred to as the jailbreak rail, aims to detect potentially malicious user messages before they reach the dialogue system.
  • Output moderation aims to detect whether the LLM responses are legal, ethical, and not harmful prior to being returned to the user.
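
The two rails can be sketched as a wrapper around any LLM call. This is not the NeMo Guardrails API; the phrase lists, function names, and refusal text are hypothetical, and a real rail would use a classifier or an LLM-based check rather than substring matching.

```python
from typing import Callable

# Hypothetical phrase lists, for illustration only.
JAILBREAK_PHRASES = ("ignore previous instructions", "pretend you have no rules")
BLOCKED_OUTPUT_TERMS = ("how to build a weapon",)
REFUSAL = "Sorry, I can't help with that."


def input_rail(user_message: str) -> bool:
    """Return True when the user message looks safe to forward."""
    text = user_message.lower()
    return not any(phrase in text for phrase in JAILBREAK_PHRASES)


def output_rail(response: str) -> bool:
    """Return True when the model response looks safe to return."""
    text = response.lower()
    return not any(term in text for term in BLOCKED_OUTPUT_TERMS)


def guarded_chat(user_message: str, llm: Callable[[str], str]) -> str:
    """Run input moderation, call the model, then run output moderation."""
    if not input_rail(user_message):
        return REFUSAL
    response = llm(user_message)
    return response if output_rail(response) else REFUSAL


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call."""
    return f"You asked: {prompt}"
```

The point of the wrapper is that the dialogue system never sees a blocked input, and the user never sees a blocked output.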

Challenges on Designing Guardrails

  • Custom to domain
  • Custom to use cases
  • Training by Zero-shot and Few-shot Prompting
  • Topic relevance, content safety, and application security, ultimately standardizing the behavior of LLMs.

The Llama Guard Safety Taxonomy & Risk Guidelines

  • Violence & Hate
  • Sexual Content
  • Guns & Illegal Weapons
  • Regulated or Controlled Substances
  • Suicide & Self Harm
  • Criminal Planning

Evaluation of LLMs

  • Reliability
  • Safety
  • Usability
  • Compliance 

Keep Exploring!!!

Good Read - Three trends of 2024 - Multimodal race

Good Read - Link

  • Small models - trained on high-quality data. Possible use case - LLM on the edge with high accuracy
  • Multimodal AI - Merge text + image + video + audio, making all types of content useful for creating/querying new assets. Creative breakthroughs across images, video, ads, games, etc.
  • AI in healthcare/agriculture

For #2 - The multimodal race has a lot of players

  • Winning products across all modalities 
  • Platforms that enable creating + publishing content workflows
  • Automatic content creation, customization, and repurposing across formats and platforms

Microsoft Designer is stunning.

#2024 #predictions

Keep Exploring!!!


February 12, 2024

Evolution of Data Management and Advanced Analytics: Milestones from 2004 to 2023

The technological landscape of data management and analytics has undergone significant transformations over the past two decades. From 2004 to 2006, many organizations began migrating from SQL Server 2000 to SQL Server 2005, which introduced newer features and improved performance. Migrations to SQL Server 2008 and later to SQL Server 2012 followed their respective releases.

Around 2009, NoSQL databases such as MongoDB, along with CouchDB, which had emerged in 2005, started challenging traditional relational database management systems (RDBMS), shifting the emphasis towards the trade-offs described by the CAP theorem. Certain use cases moved away from the confines of ACID compliance towards more flexible NoSQL solutions.

By the late 2000s, Hadoop was capturing the attention of the industry, and by 2012 it had firmly established itself as a cornerstone of the Big Data movement, driving many enterprises to incorporate Hadoop and other related technologies for managing large data sets.

In terms of analytics, machine learning (ML) applications became more prevalent and sophisticated around the mid-2010s, with 2017 witnessing a surge in diverse and powerful ML use cases.

In 2018, deep learning (DL), particularly Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), began to take the lead in driving advancements in artificial intelligence.

The field continued to evolve, and by 2020, Vision Transformers (ViTs) emerged as a groundbreaking approach in the realm of computer vision, challenging the long-held dominance of CNNs, and by 2022 they were at the forefront of innovation.

Arriving at 2023, the role of Generative AI and Large Language Models (LLMs) has come to the foreground, shaping an era where AI is not merely a tool for automation but also for creativity and complex problem-solving.

Looking ahead, the machine learning landscape is expected to be a rich tapestry of ML, DL, and Generative AI applications. The decision to employ Transfer Learning, develop a custom model, or utilize an LLM will be informed by the nuanced requirements of data, domain expertise, and the specifics of each use case. As the field continues to grow and diversify, the challenge will be in effectively mapping each use case to the appropriate technology to harness the full potential of these evolving tools.

Keep Exploring!!!

Data - Use Case - Iterative Thinking - Evolving solutions

Three things to build a solution

Learning often comes from the people around us. Watching a vehicle PPF (paint protection film) installation gave me a lot of insights.

Observation #1

  • Fresher Lens - While removing all the door cladding/mirrors, the people with 0-2 years of experience were putting all the screws together
  • Experience Lens - The lead asked them to group the screws by part
  • Lesson - Group problems and data logically to debug / build solutions

Observation #2

  • Fresher Lens - While refitting the screws, same-dimension screws were used for two different parts, although one was silver and the other black. Once all the screws were in, the juniors were clueless about where the remaining ones had to go
  • Experience Lens - The lead was able to connect the dots and clarify what fits where
  • Lesson - Leadership is solving with what you have, and not being afraid to look back and rework where needed

Observation #3

  • The PPF person was very focused, and the sun film was handled by a separate person
  • Lesson - Build relationships; you cannot solve all problems alone
  • Be good at a few things, and bring in expertise when you need a great product

What does it imply in the AIML context

  • Inexperienced team members initially consolidated all components indiscriminately during disassembly. The team leader guided them to categorize components systematically, akin to structuring data for efficient algorithmic problem-solving.
  • Different screws of identical size were used interchangeably during the assembly process, leading to confusion. The leader demonstrated critical thinking, using available resources to retrospectively address the issue, a trait essential for refining machine learning models.
  • Task specialization was evident; individuals focused on PPF or sun film application roles. This mirrors the need for specialization and collaboration in AIML, leveraging cross-disciplinary expertise to enhance overall model performance.

Keep Exploring!!!

February 07, 2024

Vision Product Catalog Startups

  • Background removal
  • Super resolution
  • Image Restoration Deformation Fixes

Startups in Focus

Keep Exploring!!!


February 04, 2024

Computer Vision License Validation

Business problem: an ID verification system (valid or invalid), say with a driving license as the ID. How do we go about solving this business problem using deep learning?

Input - License Id Images

Approach

  • Feature Definition
  • Defining Elements
  • Historical data
  • Labeling / Annotation

Vision

  • Problem #1 - Extract face images
  • Problem #2 - OCR: License Id, Dates, LicenseNumber, Authority
  • Problem #3 - Detection for signature extraction
  • Data Validation - Blurriness: check with Laplacian / Sobel / Canny edge responses, then sharpen images where needed. Non-readable - far view; validation - near view
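
A minimal sketch of the blurriness check, using the variance of the Laplacian: sharp images have strong edges, so the Laplacian response varies a lot, while blurry images give a low variance. It is written in plain Python on a small grayscale grid (at least 3x3) to stay self-contained; with OpenCV the usual equivalent is `cv2.Laplacian(gray, cv2.CV_64F).var()`. The threshold of 100 is a hypothetical value that would need tuning on real license images.

```python
from typing import List

Image = List[List[float]]  # grayscale pixel rows, at least 3x3


def laplacian_variance(img: Image) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels."""
    values = []
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1]
                   - 4 * img[y][x])
            values.append(lap)
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def is_blurry(img: Image, threshold: float = 100.0) -> bool:
    """Flag the image for rejection or re-capture when edge response is weak."""
    return laplacian_variance(img) < threshold


# A flat gray image has no edges; a checkerboard has edges everywhere.
flat = [[128.0] * 5 for _ in range(5)]
checker = [[255.0 if (x + y) % 2 else 0.0 for x in range(5)] for y in range(5)]
```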

Backend Validation

  • API call
  • Face Match
  • Similarity Score
  • Output - Valid License
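
The backend validation can be sketched as one function that combines the OCR fields with the face-match score. The field names, the 0.8 similarity cut-off, and the expiry check are assumptions for illustration; the face similarity itself would come from a separate face-matching model or API call.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

# Hypothetical field names for the OCR output.
REQUIRED_FIELDS = ("license_number", "authority", "expiry_date")


@dataclass
class ValidationResult:
    valid: bool
    reasons: List[str]


def validate_license(ocr_fields: Dict[str, str], face_similarity: float,
                     today: date, min_similarity: float = 0.8) -> ValidationResult:
    """Collect every reason the license fails; it is valid only if none exist."""
    reasons: List[str] = []
    for name in REQUIRED_FIELDS:
        if not ocr_fields.get(name):
            reasons.append(f"missing field: {name}")
    expiry = ocr_fields.get("expiry_date")
    # Dates are assumed to be normalised to ISO format (YYYY-MM-DD) after OCR.
    if expiry and date.fromisoformat(expiry) < today:
        reasons.append("license expired")
    if face_similarity < min_similarity:
        reasons.append("face does not match")
    return ValidationResult(valid=not reasons, reasons=reasons)


ok = validate_license(
    {"license_number": "X1", "authority": "RTO", "expiry_date": "2030-01-01"},
    face_similarity=0.92, today=date(2024, 2, 4))
bad = validate_license(
    {"license_number": "X1", "expiry_date": "2020-01-01"},
    face_similarity=0.5, today=date(2024, 2, 4))
```

Returning the list of reasons, not just valid / invalid, makes it easier to ask the user for a better photo or a different document.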

Keep Exploring!!!

CNN Experiments - Solutions - Building End to End Solutions

CNN Experiments

  • Minimal experiment without augmentation
  • Data Aug + CNN Model
  • Data Aug + CNN Model (Deeper Layers) - A few more convolution blocks
  • Data Aug + CNN Model (Deeper Layers) - A few more convolution blocks + (Dropout / Regularizer / Adjusting Learning Rate)

Things to consider to launch a product / build a model

  • Pre-requisites
  • Data Collection
  • Data Pre-processing and transformation
  • Data Imbalances / Data Augmentation 
  • Modelling
  • Deployment
  • Monitoring
  • Real-time data training
  • Collaborate with healthcare professionals
  • Keep updating the model

We have 95% accuracy; how do we handle the remaining 5%?

  • Similarity scores
  • Ensemble methods
  • Human in loop
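
One way to put the human in the loop is to route low-confidence predictions to a review queue. The sketch below assumes the model returns a confidence per prediction; the 0.9 cut-off and the queue names are hypothetical.

```python
from typing import Dict, List, Tuple


def route_predictions(predictions: List[Tuple[str, str, float]],
                      min_confidence: float = 0.9) -> Dict[str, List[str]]:
    """Split (item_id, label, confidence) predictions into auto and review queues."""
    routed: Dict[str, List[str]] = {"auto": [], "review": []}
    for item_id, _label, confidence in predictions:
        queue = "auto" if confidence >= min_confidence else "review"
        routed[queue].append(item_id)
    return routed


routed = route_predictions([
    ("a", "cat", 0.97),
    ("b", "dog", 0.55),   # low confidence, goes to a human reviewer
    ("c", "cat", 0.90),
])
```

Reviewed items can be fed back as labelled data, which ties in with keeping the model updated.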

Keep Exploring!!!

February 03, 2024

Can ML Solve this Problem ? Vision Problem - How to approach Damage Detection in Mobile Phones ?

How do you approach Damage Detection in Mobile Phones? 

Detecting defects on phones during exchange

Question - Can it be done with ML? 

  • Student Answers - DL Vision

Question - Data Prerequisites?

Student Answers

  • Physical damage to vision
  • Images of the phone from various angles
  • Software issues
  • System diagnostics
  • Images of cracked screens

Question - Model building

Student Answers

  • CNN classification, 2 classes
  • Damaged, not damaged
  • Multiclass - damaged, degrees of damage (to support price negotiation)
  • Inside parts; maybe images of the phone when it is not damaged?

Real-world Way of Solving 

My Recommendation

  • Detect type of phone - Flip / Smartphone
  • Brand Detection (OCR)
  • Image Similarity (Good Screen vs Similarity score to what you have)
  • Line Detection - Count Cracks on Screen
  • Segmentation to detect % of cracked area
  • Measure the deformation in the picture
  • Cracks - Yes / No
  • Severity - Low / Medium / High
  • Location - Centre, Lower, Top

This is not a single model for all needs. It has to be based on brands, models, categories, defect types, data collection, labelling and phased adoption.
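
A small sketch of the last few steps, assuming a segmentation model has already produced a binary crack mask for the screen. The severity cut-offs (5% and 20%) and the split of the screen into three horizontal bands are hypothetical choices, to be tuned per brand and model.

```python
from typing import Dict, List

Mask = List[List[int]]  # 1 = cracked pixel, 0 = intact pixel


def cracked_percent(mask: Mask) -> float:
    total = sum(len(row) for row in mask)
    cracked = sum(sum(row) for row in mask)
    return 100.0 * cracked / total


def severity(percent: float) -> str:
    # Hypothetical cut-offs for illustration.
    if percent < 5.0:
        return "Low"
    if percent < 20.0:
        return "Medium"
    return "High"


def crack_location(mask: Mask) -> str:
    """Report the horizontal band of the screen with the most cracked pixels."""
    third = max(1, len(mask) // 3)
    bands = {
        "Top": mask[:third],
        "Centre": mask[third:2 * third],
        "Lower": mask[2 * third:],
    }
    counts = {name: sum(sum(row) for row in rows) for name, rows in bands.items()}
    return max(counts, key=counts.get)


def assess(mask: Mask) -> Dict[str, object]:
    percent = cracked_percent(mask)
    if percent == 0:
        return {"cracked": False, "percent": 0.0, "severity": None, "location": None}
    return {"cracked": True, "percent": percent,
            "severity": severity(percent), "location": crack_location(mask)}


mask = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
report = assess(mask)
```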

Keep Exploring!!!

February 01, 2024

Video Recommendation System

  • Interest-based recommendations at signup
  • Content-based - System that follows videos watched
    • Similar videos based on content
    • As interest changes, content changes, adaptive strategy
  • Collaborative - Recommendations based on other people similar to me
    • Watch history based on users in clusters
    • A model trained as a batch job 
  • Two-tower approach (based on neural network)
  • Batch, Online training, Ranking
  • Model updated in real-time
    • Recent changes are updated in real-time

  • Vectors from video data - Indexing videos
  • Videos - Index creation - Vector Embeddings
  • Online + Offline Systems
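
The serving side of a two-tower setup can be sketched as a dot product between a user embedding and indexed video embeddings. The embeddings below are made-up numbers; in practice the user tower and video tower would be trained neural networks, and the index would be a vector store rather than a dictionary.

```python
from typing import Dict, List, Sequence, Set


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def recommend(user_vec: Sequence[float],
              video_index: Dict[str, Sequence[float]],
              watched: Set[str], k: int = 2) -> List[str]:
    """Score unwatched videos against the user embedding and return the top k."""
    scores = {vid: dot(user_vec, vec)
              for vid, vec in video_index.items() if vid not in watched}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Offline: video embeddings are computed in a batch job and indexed.
video_index = {
    "cooking_101": [1.0, 0.0],
    "street_food": [0.8, 0.2],
    "car_review": [0.0, 1.0],
    "knife_skills": [0.9, 0.1],
}
# Online: the user embedding is refreshed as the watch history changes.
user_vec = [0.9, 0.1]
top = recommend(user_vec, video_index, watched={"cooking_101"})
```

Keeping the video tower offline and the user tower online is one way to combine the batch and real-time parts listed above.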





Keep Exploring!!!

When product outshines tools

Great insight on how they built Sivi

  • Multifaceted and multi-layered nature of graphic design
  • Template-based design tools
  • Template-replaced designs are neither cohesive nor relevant
  • Atomic design principles - composes designs from scratch across infinite dimensions, spanning over 72 languages

Ref - Link

Keep Learning Product and Tech from Customer Needs :)


Startup Objectives

  • Think problems not tools
  • Clear Goals
  • Team / Solopreneur
  • Tools / Inputs
  • Data to Collect  / External Sources
  • Data Collection

Keep Exploring!!!

My Consulting + Product Journey

  • AI Roadmap - Data Analysis, Availability, Staged use case Adoption, Roadmap for 3PL, Beauty, Fashion, Retail, Media
  • Products - Conceptualized, Solution Development, Implementation of Vision-based products in Agriculture, Fashion, FMCG
  • Training - SME for AI / ML Training for product managers. Completed 4 batches covering use cases in Retail, Fashion, FinTech, Energy
  • Domains - 3PL, Reverse Logistics, Retail - Planogram, Inventory Management, Loss Prevention, E-commerce
  • Production Implementation - Recommendations, Vision Solutions, Forecasting, GenAI Adoption

Will share a few more specific examples in the coming weeks :)


Keep Exploring!!!