"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

June 28, 2021

CVPR Paper Reads - Large-scale Product Recognition

Paper #1 - 1st Place Solution to CVPR 2021 AliProducts Challenge: Large-scale Product Recognition

Key Lessons

  • The final solution employed 11 models across three backbone families: EfficientNet, EfficientNetV2, and NFNet.
  • Small models were trained for fewer epochs and large models for more epochs.

Data Augmentation

  • RandomCrop: 448×448
  • RandomRotation: ±30°
  • RandomHorizontalFlip: p=0.5
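
A minimal torchvision sketch of the augmentations listed above (the Compose pipeline and ToTensor step are my assumptions, not from the paper):

from torchvision import transforms

# augmentation pipeline matching the listed settings;
# assumes input images are at least 448x448
train_tfms = transforms.Compose([
    transforms.RandomCrop(448),              # 448x448 random crop
    transforms.RandomRotation(30),           # rotate within +/-30 degrees
    transforms.RandomHorizontalFlip(p=0.5),  # flip with probability 0.5
    transforms.ToTensor(),
])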

Paper #2 - Solution for Large-scale Long-tailed Recognition with Noisy Labels

Key Lessons

  • CNNs and Transformers, including ResNeSt, EfficientNetV2, and DeiT
  • Ensemble of three different network architectures with ImageNet-pretrained weights: ResNeSt-101, DeiT-small, and EfficientNetV2-m.

Paper #3 - An Effective Ensemble Method for AliProducts Challenge: Large-scale Product Recognition

Key Lessons

  • The AliProducts dataset consists of more than 3M images of nearly 50K different products.
  • All networks are initialized with pre-trained weights on ImageNet and trained with cross entropy loss.
  • As for image augmentation, we use RandomCrop and RandomHorizontalFlip, as well as Normalization.

Paper #4 - Retail Vision Workshop 2021 - Product Pricing Challenge (4th Place Solution)

Key Lessons

  • The first step involves detecting the prices present on shelves, with a single class called "pricing" (bounding box detection).
  • The second step is to detect and recognize the text inside the pricing region. The Google Vision API was used for text detection and recognition.
  • Price text box extraction: the text box with the maximum area containing only numbers was chosen as the price box (or the integer part of the price); see the sketch after this list.
  • Price text cleaning and price rounding off.
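
A hypothetical sketch of the extraction step, assuming OCR results arrive as (text, bounding box) pairs; the helper name and data layout are mine, not from the solution:

def pick_price_box(ocr_results):
    # ocr_results: list of (text, (x, y, w, h)) tuples from any OCR engine
    numeric = [(t, b) for t, b in ocr_results if t.strip().isdigit()]
    if not numeric:
        return None
    # largest-area box containing only digits = price (or its integer part)
    return max(numeric, key=lambda tb: tb[1][2] * tb[1][3])

print(pick_price_box([("2", (5, 5, 10, 12)), ("199", (0, 0, 60, 40))]))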

Summary - As we can see, a mix of techniques (custom detection, OCR) comes into play for price-area detection, parsing, and cleaning, with product matching based on text, price, and value. We could also do a similar image / key-point match.

More reads - Link

Keep Thinking!!!

June 27, 2021

Metrics of product building

We have sprints, processes, and domain expertise. But how do we measure product building?

  • What is the measure of how much we know of the product perspective: vision vs. implementation?
  • After a few sprints, how do product and business feel about the outcome? Is what was developed in line with what was envisioned?
  • Everyone has their own way of conveying/thinking through their perspective. How do we call out / communicate all the business flows and ensure we keep everyone on the same page?
  • How much of the team believes in / is aligned with the storyline and implementation?
  • The tech stack never ends; business domain learning and new trends keep popping up.

I have less time left; it's better to fight selective battles. I recognize my time is limited as I near my 40s.

Keep Thinking!!!

June 26, 2021

Tech Leader vs Business Leader

A business leader who knows the domain but not the technology cannot effectively sell solution capabilities in terms of technology. To appreciate technical capabilities you need a certain level of tech acumen. Today the lines between tech and business knowledge keep overlapping.

What happens here?

  • Afraid of experimenting
  • Looks for expertise outside
  • Gets stuck evaluating/promoting the internal tech team

A tech leader who does not understand the business will not be able to succeed in his role or with his team. If you do not develop a domain perspective you will ultimately burn out with a pile of unsold inventory of tech solutions.

What happens here?

  • Builds prototypes in all areas
  • Has diverse focus but no deep expertise
  • Lacking business insight leads to aborted projects that miss customer expectations

I have observed both types of leadership resulting in burnout and missing innovation.

All this ends up impacting company culture, delivery, and work-life balance. We have abundant tech talent but less collaboration and vision. Working on one idea for 5 years gives you more refinement/clarity/focus than working on 10 ideas in 5 years. Expertise, experience, and perspective come with time. No course can directly give you the knowledge you need unless you experiment, modify, and keep adding more insights/lessons.

Take time, Build your own path!!!

Keep Thinking!!!

Convert AVI to MP4 with ffmpeg for Streamlit

Streamlit didn't work with AVI. The ffmpeg tool worked for converting from AVI to MP4.

What did not work 

ffmpeg -i video_Raw.avi -c:v copy -c:a copy -y video_Raw_New.mp4

What Worked 

ffmpeg -y -i video_Raw.avi -vcodec libx264 video_Raw_New.mp4
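
The copy command likely fails because stream copy keeps the original AVI codec inside the MP4 container, which browser players (and hence Streamlit) can't decode; re-encoding to H.264 fixes it. A minimal Python sketch for the same conversion before display, assuming ffmpeg is on PATH and using the file names above:

import subprocess
import streamlit as st

# re-encode to H.264 so the browser video element can play it
subprocess.run(
    ["ffmpeg", "-y", "-i", "video_Raw.avi", "-vcodec", "libx264", "video_Raw_New.mp4"],
    check=True,
)
st.video("video_Raw_New.mp4")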

Keep Learning!!!


Notes from Azure Synapse Training

Lesson #1 - Tables – Indexes Best Practices

  • Clustered Columnstore index (Default Primary) - Highest level of data compression. Best overall query performance
  • Clustered index (Primary) - Performant for looking up a single to few rows
  • Heap (Primary) - Faster loading and landing temporary data. Best for small lookup tables
  • Nonclustered indexes (Secondary) - Enable ordering of multiple columns in a table. Allows multiple nonclustered on a single table. Can be created on any of the above primary indexes. More performant lookup queries
Queries with the following patterns typically run faster with an ordered CCI:

  • The queries have equality, inequality, or range predicates
  • The predicate columns and the ordered CCI columns are the same
  • The predicate columns are used in the same order as the column ordinal of the ordered CCI columns

Result caching: enable caching at the DB level, then at the query level; the ResultCacheHit flag returns whether the result was reused.

Fact tables are primarily CCI: since we run large aggregations over dimensions, CCI becomes the natural choice for fact tables.

Lesson #2 - Distributed table design recommendations

  • Hash Distribution: Large fact tables exceeding several GBs with frequent inserts should use a hash distribution.
  • Round Robin Distribution: Potentially useful for tables created from raw input and for temporary staging tables used in data preparation.
  • Replicated Tables: Lookup tables ranging from hundreds of MBs to 1.5 GB should be replicated. Works best when the table size is less than 2 GB compressed.

Lesson #3 - Result-set caching

Caches the results of a query in SQL pool storage. This enables interactive response times for repetitive queries against tables with infrequent data changes. The result-set cache persists even if the SQL pool is paused and resumed later.

Cache Checks

You can tell whether a query was executed with a result-cache hit or miss by querying sys.pdw_request_steps for commands where the value is like '%DWResultCacheDb%'.
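
A minimal sketch of that check from Python, assuming a pyodbc connection to the dedicated SQL pool (the connection string is a placeholder):

import pyodbc

conn = pyodbc.connect("DSN=synapse_pool;UID=user;PWD=***")  # placeholder credentials
cur = conn.cursor()
# steps that touched the result cache show DWResultCacheDb in the command text
cur.execute("""
    SELECT request_id, command
    FROM sys.pdw_request_steps
    WHERE command LIKE '%DWResultCacheDb%'
""")
for row in cur.fetchall():
    print(row.request_id, row.command)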

Lesson #4 - SQL Data Classification is a new feature in public preview that:

  • Automatically discovers columns containing potentially sensitive data
  • Provides a simple way to review and apply the classification recommendations through the Azure portal
  • Persists the sensitivity labels in the database (as metadata attributes) and audits and detects access to the sensitive data
  • Ships with a built-in set of labels and information types; customers can also choose to define custom labels across the Azure tenant using Azure Security Center

Lesson #5 - Dynamic Data Masking

  • Prevent abuse of sensitive data by hiding it from users
  • Easy configuration in new Azure Portal
  • Policy-driven at table and column level, for a defined set of users
  • Data masking applied in real-time to query results based on policy
  • Multiple masking functions available, such as full or partial, for various sensitive data categories (credit card numbers, SSN, etc.)

Lesson #6 - Spark vs SQL Server (Memory Handling)

Keep in mind that Spark uses memory much the same way SQL Server uses the buffer pool: by storing frequently used objects in memory, it reduces overall I/O and improves performance in large joins, sorts, and aggregates. Contrast this with a traditional Hadoop-based architecture, which relies heavily on writing data out to disk between steps.

Every technical concept maps to an advancement over some limitation that existed before. Compared to SQL Server 2008, where you don't have most of these features, Synapse has evolved beautifully into a good environment for real-time / ML / big data handling: reporting, ML recommendations, lakehouse, and real-time BI. Gone are the days of month-end jobs or data sync jobs.

All good lessons :) Fantastic Features!!!

June 21, 2021

My Perspective of Interviews

A few things I keep tabs on from a time/candidate perspective: listening to candidate answers, asking for quantifiable data, and covering all areas within the time.

  • Project discussions - To bring out the best in candidates, I ask them to pick their best projects and walk through architecture challenges, performance issues, and deployment.
  • Introduce scenarios / brainstorm to get perspectives from candidates. I look at the areas they are able to explore and, given constraints, the alternatives they bring to the table.

Make it a good experience for the candidate. We all keep learning. Be better than yesterday.

Keep Thinking!!!

Interesting observations on Tesseract

While extracting digits from analog meters, the two lessons (links) below helped us get the values.

Lesson #1 - Set the path to the complete executable, not just the folder. A minor thing that took a while, since I don't use it often.

Ref - Link

Lesson #2 - Page segmentation modes (--psm) are very useful for controlling how the image is interpreted in different situations; --psm 11 worked best, --psm 6 was OK.

Ref - Link
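
A minimal pytesseract sketch covering both lessons; the executable path and image name are placeholders:

import pytesseract
from PIL import Image

# Lesson #1: point at the full tesseract executable, not the install folder
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

img = Image.open("meter_crop.png")  # cropped meter display (placeholder)
# Lesson #2: compare page segmentation modes; 11 = sparse text, 6 = uniform block
for psm in (11, 6):
    text = pytesseract.image_to_string(img, config=f"--psm {psm}")
    print(psm, "->", text.strip())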

Keep Exploring!!!

June 14, 2021

Quick Research Paper Reads - Retail - Supply Chain

Price Optimization in Fashion E-commerce

Key Notes

  • Key parameters: the product display page, MRP and the discounted price, click-through rate (CTR), and conversion
  • To maximize revenue, we need to predict the quantity sold of every product at any given price
  • Another significant challenge is cannibalization among products
  • We overcame this problem by running the model at a category level and creating features at a brand level, which takes cannibalization into account
  • The price-selection problem was solved with a Linear Programming optimization technique


Feature Engineering

Linear Programming

Now we need to choose one of these three candidate prices (per product) such that the net revenue is maximized; a sketch of this selection follows.
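
A minimal PuLP sketch of that choice, with made-up prices and demand predictions (the real model predicts demand from features; everything below is illustrative):

from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary

# hypothetical candidate prices and predicted quantities sold at each price
prices = {"p1": [900, 950, 1000], "p2": [450, 500, 550]}
demand = {"p1": [120, 100, 80], "p2": [300, 260, 200]}

prob = LpProblem("price_selection", LpMaximize)
# x[i][j] = 1 if product i is sold at its j-th candidate price
x = {i: [LpVariable(f"x_{i}_{j}", cat=LpBinary) for j in range(3)] for i in prices}

# objective: total revenue = chosen price * predicted quantity at that price
prob += lpSum(prices[i][j] * demand[i][j] * x[i][j] for i in prices for j in range(3))
for i in prices:
    prob += lpSum(x[i]) == 1  # exactly one price per product

prob.solve()
for i in prices:
    chosen = next(j for j in range(3) if x[i][j].value() == 1)
    print(i, "->", prices[i][chosen])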

Online Data Sources

  • Clickstream data: this contained all user activity such as clicks, carts, orders, etc.
  • Product Catalog: this contained details of a product like brand, color, price, and other attributes related to the product.
  • Price data: this contained the price and the quantity sold of a product at hour level granularity.
  • Sort Rank: this contained search rank and the corresponding scores for all the live products on the platform

Key Notes (assortment planning)

  • The task of assortment planning is to determine the optimal subset of k products to stock in each store so that the assortment is localized to the preferences of the customers shopping in that store.
  • Broadly, there are three aspects to assortment planning: (1) the choice of the demand model, (2) estimating the parameters of the chosen demand model, and (3) using the demand estimates in an assortment optimization setup.
  • The forecast demand is then used in a suitable stochastic optimization algorithm to do the assortment planning.
In the age-based model for demand forecasting of fashion items, the demand of an article i in store s at time t is formulated as:

June 13, 2021

Domain + Tech + Impact

My role has always been driving innovation, initiatives, and domain-knowledge-driven use cases for more revenue opportunities, lower cost of operations, and better customer service. Recollecting some of my milestone projects.

Reverse Logistics

Customer Service Projects

  • Better delivery insights/emails to measure status at each leg
  • Provided more touchpoints for better repair/refurbishment delivery

Warranty rewrite

  • Rewrite warranty with traceability to new rules
  • Data lineage for different warranty rules
  • Tracking between repairs/exchanges

Vision for Retail Innovation

  • RFID, EAS, and legacy devices for people counting and loss prevention vs. vision-based solutions
  • Ideate, prototype, demonstrate, patent. I wasn't there to collaborate or see how Intel scaled it up, but I'm happy the ideas made it to NRF / products

Startups collaboration

  • Vision for ad effectiveness, Measuring sales impact from digital displays
  • Vision for logo damage assessment - measuring logos to be replaced due to wear and tear on aircraft
  • Vision for Agriculture - Duplicate vendor detection and alert

Startup pitches and failed attempts taught me that vision doesn't work alone. It has to be a combination of vision + data to be a successful product.

Keep Thinking!!!

June 12, 2021

AI / ML Work - The three categories to focus on

Research perspective

  • Apply new approaches + solve new problems
  • Differentiation in terms of approach / performance / patenting / renewed opportunities

Practitioners

  • Apply ML Use cases in current projects
  • Bring in the project insights/awareness within the Project team
  • Differentiation in terms of moving towards ML adoption/implementation

Business Perspective

  • Contribute to AI-driven, business-lens-focused use cases in the domain
  • Build industry focused generic solutions 
  • Differentiation in terms of wider community impact / collaboration with business / clients

Mastering the latest tech vs. coding vs. MLOps vs. building domain knowledge: scale as much as you can, or narrow down and pick your battles!!!

Keep Thinking!!!


June 10, 2021

The challenges to put ML models in production (Healthcare)

A very good thread; summarizing the insights.

Observations from the papers

  • None of the 415 ML papers published on the subject in 2020 was usable. Not a single one!
  • 2,212 papers → 415 after initial screening → 62 chosen for detailed analysis → 0 with potential for clinical use
  • Many papers used very small datasets, often collected from a single hospital - not enough for real evaluation
  • Some papers used a dataset that contained non-COVID images from children and COVID images from adults. These methods probably learned to distinguish children from adults
  • Training and testing on the same data 
  • Many papers failed to disclose the amount of data they were tested on, or important aspects of how their models work, leading to poor reproducibility and biased results
  • Many papers didn't even consult with radiologists.
  • Rushing to publish results based on small, bad-quality datasets undermines the credibility of ML
  • At some point people start figuring out how to fine-tune on the test set
  • Datasets are not diverse enough or bias-free
  • The authors find that COVID-19 detectors often attend to the position of the shoulders and not the lungs. Models can easily learn shortcuts as opposed to robust features

Take everything with a pinch of salt. Real-world data is not Kaggle data. Kaggle does not reflect the reality, the quality, or the challenges we spot in real data.

Keep Exploring!!!

How to download your favorite video or audio from YouTube

  1. Download from YouTube using Python - the youtube-dl package (see the sketch below)
  2. Clip a portion of the video - VLC advanced editor. Steps - link, C:\Users\username\Videos
  3. Convert video to audio - ffmpeg

ffmpeg -i sample.avi -q:a 0 -map a sample.mp3

Steps - link
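
A minimal youtube-dl sketch for step 1; the URL and output template are placeholders:

import youtube_dl

opts = {
    "format": "mp4",                 # prefer an MP4 stream
    "outtmpl": "%(title)s.%(ext)s",  # save as <video title>.<ext>
}
with youtube_dl.YoutubeDL(opts) as ydl:
    ydl.download(["https://www.youtube.com/watch?v=VIDEO_ID"])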

Keep Checking!!!


June 09, 2021

Retail and Supply Chain Reads

Learning tech vs. knowing current trends in each domain vs. getting insights from different reports vs. building your own insights/predictions: the cycle never ends!!!

Retail Trends Playbook 2020

Key Notes

  • Retailers can transform data into dollars by using customer information to determine better marketing, service, and product opportunities.
  • Data intelligence is a key ingredient of the customer experience.
  • Success in this new era depends on understanding and anticipating the needs of customers at every stage of the retail journey.

Consumer trends

  • Customers are comfortable sharing data in exchange for better experiences
  • Customers expect personalization at every stage
  • Customers want knowledgeable staff on hand for service and support
  • Data-powered warehouses reduce the cost of ownership
  • Consumers spend more with better service

Supply Chain Resilience Report 2021

Key Notes

  • The number of organizations that report on disruptions continues to increase
  • COVID-19 has increased the number of organizations using technology for supply chain management
  • Over half of organizations admitted COVID-19 increased their use of technology for supply chain mapping
  • Cross-border land transport was the primary cause of logistics disruption in 2020
  • Management has become more committed to managing supply chain risk

Beyond COVID-19: Supply Chain Resilience Holds Key to Recovery

Key Notes

  • COVID-19 has unleashed a global supply chain crisis across a huge number of organizations, stemming from a lack of understanding and flexibility of the multiple layers of their global supply chains and a lack of diversification in their sourcing strategies.
  • Where possible, diversified supply chains across companies and geographies greatly reduce exposure; firms tied to single suppliers remain exposed to risks from supply-chain disruptions.
  • Big data analytics can assist firms in streamlining their supplier selection process, cloud computing is increasingly being used to facilitate and manage supplier relationships, and logistics and shipping processes can be greatly enhanced through automation and the Internet of Things.

Supply-chain recovery in coronavirus times—plan for now and the future

Key Notes

Keep connecting the dots!!! Diversify, predict risks, and set up a digitally connected supply chain!!!

June 08, 2021

Complicating computer vision use cases

Sometimes we design solutions around the existing environment when simpler / better solutions may be feasible. We need to plan computer vision from the hardware and software up for better solutions, not over-engineered ones.

Use Case #1 - Fisheye camera for dwell time computation for curbside pickup

Implementation Challenges

  • A fisheye lens renders far and near objects at different sizes
  • Side view at the entrance
  • Top view at the exit
  • One model for entrance / one model for exit
  • Occlusion problems
  • Detecting vehicles types / Detecting attributes
  • Compare, match, and count

Ideal way

  • Edge Device near entry/exit capture license plate

Lesson learned - This needs a custom model for every type of camera, which is not the best way to implement it. I did call it out, but it was beyond my influence to change things. I was partially happy with the approach, although we could have done it much better and simpler.

Use Case #2 - Vehicle counting in a four-lane junction

Environment challenges

  • Service roads
  • No clear division of vehicles
  • Vehicle flow in all directions
  • Limited space between vehicles
  • Occlusion
  • All kinds of vehicles/colors/models

Ideal way

  • License plate recognition
  • Setting vehicle-counting cameras on individual roads rather than at the junction, to get a clear line of crossing for counting

Lesson learned - Indian roads are very different. We have to design solutions across hardware, model implementation, and environmental challenges. It's best to design considering all constraints and real-world scenarios. Vision is not going to solve everything unless you are smart about how you use it and get results quickly.

Object Tracking - Tracking is memory-intensive. In many cases I try to avoid tracking because it needs exhaustive computation and frame-by-frame operations. Demo projects work great with frame-by-frame tracking, but we cannot do the same on terabytes of video.

Keep Thinking!!!

June 07, 2021

What all to learn ? The endless to-do list

First it was Docker, then Kubernetes. KFServing + Vision + MySQL for an end-to-end architecture on GCP. Next, for a project, I worked on Vision + AWS, then AWS Redshift for AI / ML and stored procedures. Then I jumped to Azure Synapse and Azure Cognitive Vision + Media Services. On and off I picked up optimization / routing / PuLP, which seems interesting. The domain was mostly CVM, supply chain, and retail, plus keeping up with trends.

When you look back from time to time there is a lot to catch up on: vision and its new approaches, ML spawning MLOps and feature stores, new tracking techniques in vision, everything Transformer and BERT. Expertise in vision vs. data vs. NLP: everything needs time, experimentation, and constant exploration.

Domain Learning, Cloud Exposure, Data Learning, ML Techniques, Vision updates :) The list is never-ending!!!

Bookmark your learnings from time to time. Looking back, it feels like I've explored everything a bit, but there are still so many unknowns. As always, sometimes all these dots help paint a big picture. After an experiment you know a bit of the implementation; the curiosity goes once you get the crux of it. The same T-SQL, SQL basics, and performance tuning translated from RDBMS in 2000 to the CAP theorem, the advent of NoSQL, large-scale parallel processing with Hadoop, columnar indexes, JSON models, and making everything read-only with Spark RDDs. What is a limitation today will be a motivation for something tomorrow.

What we have is a spark: curiosity and the ability to build perspective by connecting old and new things. Learning the required pieces and building up a good business use case is the satisfaction of learning!!!

Keep Going!!!

OpenCV Experiments

Experiment with

  • Logging
  • Threads
  • Error Handling
  • Writing text on a frame with a grey background (see the sketch below)
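
A minimal OpenCV sketch of the last item, assuming a BGR frame; the image path and label text are placeholders:

import cv2

def label_frame(frame, text):
    # grey filled strip as the text background
    cv2.rectangle(frame, (0, 0), (frame.shape[1], 40), (128, 128, 128), -1)
    # white text drawn over the grey strip
    cv2.putText(frame, text, (10, 28), cv2.FONT_HERSHEY_SIMPLEX,
                0.8, (255, 255, 255), 2, cv2.LINE_AA)
    return frame

frame = cv2.imread("sample.jpg")  # placeholder image
cv2.imshow("labeled", label_frame(frame, "dwell time: 12s"))
cv2.waitKey(0)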

Keep Exploring!!!

June 04, 2021

ffmpeg

What is ffmpeg?

  • A collection of libraries and programs for handling video, audio, and other multimedia files and streams

Why is it used?

  • The ffmpeg program itself is designed for command-line-based processing of video and audio files

What purpose does it serve best?

  • The ffmpeg project's tools and libraries form the basis of a ton of video and audio software. VLC, Google Chrome, and MPC all use ffmpeg libraries to decode video.

Download and Install from Link

In System variables, locate and select the Path row, click Edit, and add the value c:\ffmpeg\bin.

Live stream a local video over UDP (the first command re-encodes and pushes an MPEG-TS stream to localhost; ffplay listens on the same port to play it):

ffmpeg -i sample.mp4 -v 0 -vcodec mpeg4 -f mpegts udp://127.0.0.1:23000

ffplay udp://127.0.0.1:23000
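
The same stream can usually be read from Python via OpenCV, assuming your OpenCV build includes the FFmpeg backend:

import cv2

cap = cv2.VideoCapture("udp://127.0.0.1:23000")  # same port as the ffmpeg push
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("stream", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()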

Keep Thinking!!!

June 01, 2021

Tabu Search

Ref1 - Link 

Notes

  • An iterative procedure
  • Moves from an initial feasible solution to the next solution in the current neighbourhood

Ref2 - Link

Notes

  • The basic form of Tabu Search (TS) is founded on ideas proposed by Fred Glover (1977, 1986).
  • Ordinary local or neighborhood search, proceeding iteratively from one point (solution) to another until a chosen termination criterion is satisfied.
  • The most common attributive memory approaches are recency-based memory and frequency-based memory
  • Recency-based memory is the most common memory structure used in TS implementations.
  • A key element of the adaptive memory framework of tabu search is to create a balance between search intensification and diversification.

Ref3 - Link

  • A Comparative Study of Tabu Search and Simulated Annealing for the Traveling Salesman Problem
  • Tabu list: to prevent the process from cycling within a small set of solutions, some attribute of recently visited solutions is stored in a tabu list, which prevents their recurrence for a limited period.
  • Diversification: frequency information is used to penalize non-improving moves by assigning them a larger penalty.
  • Neighborhood: a neighbor of a given solution is any other solution obtained by a pairwise exchange of any two nodes in the solution.

Book - Algorithms For Dummies

Notes - Avoiding repeats using Tabu Search. Tabu Search does the following:

  • Allows use of a pejorative solution a few times to see whether moving away from the local solution can help the search find a better path to the best solution
  • Remembers the solutions that the search tries and forbids it from using them anymore
  • Creates a long-term or short-term memory of tabu solutions by modifying the length of the queue used to store past solutions

Book - Metaheuristics and Evolutionary Algorithms

TS is an enhancement of local search (LS) methods.

  • TS solves the problem of convergence to local optima seen with LS methods by allowing movement to non-improving solutions when there is no better solution near the current solution
  • Unlike LS, in TS the best neighboring point replaces the current searching point even if it is worse
  • The previous searching point is memorized as tabu
  • If the new searching point is better than the best point, the new point replaces the best point; otherwise the best point remains unchanged
  • Neighboring points near the new searching point are then generated
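
Pulling the notes together, a minimal tabu search sketch for the TSP (pairwise-swap neighborhood, recency-based tabu list, aspiration on the global best); the distance matrix and parameters are made up:

import random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def tabu_search(dist, iters=200, tabu_len=10):
    n = len(dist)
    current = list(range(n))
    random.shuffle(current)
    best, best_len = current[:], tour_length(current, dist)
    tabu = []  # recency-based memory of recently swapped city pairs

    for _ in range(iters):
        candidates = []
        for i in range(n - 1):
            for j in range(i + 1, n):
                move = (current[i], current[j])
                neighbor = current[:]
                neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
                length = tour_length(neighbor, dist)
                # aspiration: a tabu move is allowed if it beats the global best
                if move not in tabu or length < best_len:
                    candidates.append((length, neighbor, move))
        if not candidates:
            break
        # the best neighbor replaces the searching point even if it is worse
        length, current, move = min(candidates, key=lambda c: c[0])
        tabu.append(move)
        if len(tabu) > tabu_len:  # short-term memory: fixed-length queue
            tabu.pop(0)
        if length < best_len:
            best, best_len = current[:], length
    return best, best_len

# toy 5-city symmetric distance matrix (made up)
D = [[0, 2, 9, 10, 7],
     [2, 0, 6, 4, 3],
     [9, 6, 0, 8, 5],
     [10, 4, 8, 0, 6],
     [7, 3, 5, 6, 0]]
print(tabu_search(D))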
More Reads

Keep Thinking!!!