Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): CVPR Paper Reads

June 28, 2021

CVPR Paper Reads - Large-scale Product Recognition

Paper #1 - 1st Place Solution to CVPR 2021 AliProducts Challenge: Large-scale Product Recognition

Key Lessons

The final solution employed 11 models built on three backbones: EfficientNet, EfficientNetV2, and NFNet.
Small models were trained for fewer epochs and large models for more epochs.

Data Augmentation

RandomCrop: 448×448
RandomRotation: ±30°
RandomHorizontalFlip: p=0.5
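
The three settings above can be sketched as a parameter sampler. This is a minimal illustration using only the Python standard library, not the authors' code: the helper name sample_augmentation is hypothetical, it only draws the crop box, angle, and flip flag instead of transforming pixels, and it assumes the input image is at least 448 pixels on each side. The order in which the transforms were applied is not stated in the notes. torchvision.transforms provides transforms with the same names (RandomCrop, RandomRotation, RandomHorizontalFlip) for real image pipelines.

```python
import random

CROP_SIZE = 448          # RandomCrop: 448x448
MAX_ROTATION_DEG = 30.0  # RandomRotation: +/-30 degrees
FLIP_PROB = 0.5          # RandomHorizontalFlip: p=0.5


def sample_augmentation(width, height, rng=random):
    """Draw one set of augmentation parameters for an image of the given size.

    Hypothetical helper: it returns the crop box, rotation angle, and flip
    flag rather than modifying any pixels.
    """
    if width < CROP_SIZE or height < CROP_SIZE:
        raise ValueError("image must be at least 448x448; resize or pad first")
    left = rng.randint(0, width - CROP_SIZE)   # randint is inclusive on both ends
    top = rng.randint(0, height - CROP_SIZE)
    angle = rng.uniform(-MAX_ROTATION_DEG, MAX_ROTATION_DEG)
    flip = rng.random() < FLIP_PROB
    return {
        "crop_box": (left, top, left + CROP_SIZE, top + CROP_SIZE),
        "angle_deg": angle,
        "hflip": flip,
    }


# Example: parameters for a 512x600 image, seeded for repeatability.
params = sample_augmentation(512, 600, random.Random(0))
print(params)
```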

Paper #2 - Solution for Large-scale Long-tailed Recognition with Noisy Labels

Key Lessons

Both CNNs and Transformers were used, including ResNeSt, EfficientNetV2, and DeiT.
The ensemble combines three different network architectures with ImageNet-pretrained weights: ResNeSt-101, DeiT-small, and EfficientNetV2-M.
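
How the three networks were combined is not described in these notes. A common choice is to average the per-class softmax probabilities, which is what the sketch below assumes. The function names and the logits are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits, weights=None):
    """Average class probabilities across models; return (class index, probs).

    Assumption: a plain (optionally weighted) average of softmax outputs.
    """
    n_models = len(per_model_logits)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(per_model_logits[0])
    avg = [0.0] * n_classes
    for w, logits in zip(weights, per_model_logits):
        for i, p in enumerate(softmax(logits)):
            avg[i] += w * p
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg


# Made-up logits for three models over four classes.
logits = [
    [2.0, 1.0, 0.1, -1.0],   # e.g. ResNeSt-101
    [0.5, 2.5, 0.0, -0.5],   # e.g. DeiT-small
    [1.8, 1.2, 0.3, -0.8],   # e.g. EfficientNetV2-M
]
label, probs = ensemble_predict(logits)
print(label, [round(p, 3) for p in probs])
```

Passing weights lets a stronger model count for more; with no weights every model contributes equally.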

Paper #3 - An Effective Ensemble Method for AliProducts Challenge: Large-scale Product Recognition

Key Lessons

The AliProducts dataset consists of more than 3M images of nearly 50K different products.
All networks are initialized with pre-trained weights on ImageNet and trained with cross entropy loss.
For image augmentation, the authors use RandomCrop, RandomHorizontalFlip, and Normalization.
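
For reference, the cross entropy loss for a single sample is the negative log of the softmax probability assigned to the true class. The sketch below computes it from raw logits with the log-sum-exp trick. It is a generic illustration, not code from the paper, and the logits are invented.

```python
import math


def cross_entropy(logits, target):
    """Cross entropy for one sample: -log(softmax(logits)[target])."""
    peak = max(logits)
    # log-sum-exp, shifted by the max logit to avoid overflow
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


# A confident, correct prediction gives a small loss;
# the same logits with a different true class give a large one.
print(cross_entropy([5.0, 0.0, 0.0], target=0))
print(cross_entropy([5.0, 0.0, 0.0], target=1))
```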

Paper #4 - Retail Vision Workshop 2021 - Product Pricing Challenge (4th Place Solution)

Key Lessons

The first step is detecting the prices present on shelves, using bounding boxes with a single class called "pricing".
The second step is to detect and recognize the text inside each pricing box. The Google Vision API was used for text detection and recognition.
Price Text Box Extraction: The text box with the largest area that contains only numbers was chosen as the price box (or the integer part of the price).
Price Text Cleaning and Price Rounding Off
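
The box selection, cleaning, and rounding steps above can be sketched as follows. The selection rule (largest-area box whose text is digits only) comes from the notes. The cleaning and rounding rules are not specified there, so the ones below (keep only digits and the decimal point, round half-up to two decimals) are assumptions, and the detection format with "text" and "box" keys is hypothetical rather than the Google Vision API response format.

```python
import re
from decimal import Decimal, ROUND_HALF_UP


def box_area(box):
    """Area of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


def pick_price_box(detections):
    """Return the largest-area detection whose text is digits only, or None."""
    numeric = [d for d in detections if re.fullmatch(r"[0-9]+", d["text"].strip())]
    return max(numeric, key=lambda d: box_area(d["box"]), default=None)


def clean_text(text):
    """Assumed cleaning rule: drop everything except digits and the decimal point."""
    return re.sub(r"[^0-9.]", "", text)


def round_price(text, places="0.01"):
    """Assumed rounding rule: half-up to two decimal places.

    Raises decimal.InvalidOperation if nothing numeric is left after cleaning.
    """
    return Decimal(clean_text(text)).quantize(Decimal(places), rounding=ROUND_HALF_UP)


# Made-up OCR output for one detected "pricing" region.
detections = [
    {"text": "SALE", "box": (0, 0, 200, 50)},
    {"text": "12", "box": (10, 60, 110, 160)},   # large digits: integer part
    {"text": "99", "box": (115, 60, 150, 95)},   # small digits: cents
]
print(pick_price_box(detections)["text"])  # 12
print(round_price("$12.995"))              # 13.00
```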

Summary - As we can see, a mix of techniques (custom detection and OCR) comes into play for price area detection, parsing, and cleaning, and for product matching based on text, price, and value. A similar image / key-point match could also be used.
