"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

March 31, 2021

Applications of Machine Learning in the Supply Chain

Key Notes

  • AI is the new electricity
  • ML is at the peak of its hype cycle


Real-time AI

  • Dynamic routing - dynamically learn the best route
  • Worker / picker assessment
  • Optimal pricing through real-time market feedback
  • Forecasting - Prophet (see the sketch after this list)
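
The talk only name-drops Prophet, so here is a minimal sketch (my own, not from the talk) of forecasting a demand series with it; the file name and horizon are placeholders.

    import pandas as pd
    from prophet import Prophet  # packaged as "fbprophet" before v1.0

    # Prophet expects a dataframe with two columns: ds (date) and y (value)
    df = pd.read_csv("daily_demand.csv")  # hypothetical demand history

    model = Prophet()  # additive model: trend + seasonality + holidays
    model.fit(df)

    # Forecast 30 days past the end of the training window
    future = model.make_future_dataframe(periods=30)
    forecast = model.predict(future)
    print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())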


Dynamic Routing

  • The traditional approach - regression
  • Online learning-based approach
  • Exploration vs exploitation trade-off (see the sketch after this list)
  • Routing under uncertainty
  • Minimize cost on average
  • Formulated as an optimization problem
  • Dynamic routing - identify new routes based on the current status
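
The exploration vs exploitation point maps naturally onto a bandit formulation. Here is a minimal epsilon-greedy sketch (my own, not from the talk) for choosing among candidate routes; the route names and simulated costs are made up.

    import random

    routes = ["route_a", "route_b", "route_c"]   # hypothetical candidates
    counts = {r: 0 for r in routes}              # times each route was chosen
    avg_cost = {r: 0.0 for r in routes}          # running mean of observed cost
    EPSILON = 0.1                                # exploration rate

    def pick_route():
        # Explore with probability EPSILON (or until every route is tried),
        # otherwise exploit the cheapest route seen so far
        if random.random() < EPSILON or 0 in counts.values():
            return random.choice(routes)
        return min(routes, key=lambda r: avg_cost[r])

    def update(route, observed_cost):
        # Incremental running-average update of the route's cost estimate
        counts[route] += 1
        avg_cost[route] += (observed_cost - avg_cost[route]) / counts[route]

    # Simulated feedback loop; in practice costs come from real deliveries
    true_mean = {"route_a": 10, "route_b": 8, "route_c": 12}
    for _ in range(1000):
        r = pick_route()
        update(r, random.gauss(true_mean[r], 2))

    print(min(avg_cost, key=avg_cost.get))  # cheapest route on average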

AI, Machine Learning & Supply Chain // Manuel Davy, CEO of Vekia

Key Notes
  • Global, agile
  • Collect information on stores, products, and locations
  • Demand forecasting - at different timescales

Using Graph + Machine Learning to Optimize Logistics in Supply Chain

Keep Thinking!!!

March 26, 2021

Weekend Reads - Roadmap for Transportation Savings

Keynotes
  • Manage transportation holistically across the enterprise
  • Inbound, outbound, and facility
  • Map freight to carrier capacity with advanced optimization technologies
  • Get the best transportation solution; optimize the preferred integrated DC network
  • Map capacity to freight flow; run fewer empty miles
  • Turn data into information in a timely fashion
  • Build a baseline of as-is cost, then restructure and re-optimize to see what cost looks like
  • Manage the overall portfolio end to end
  • Getting visibility is vital

Happy Learning!!!

March 25, 2021

Why I may never work for a startup

I was thinking about what motivates me.

  1. Technology does not motivate me. The same CRUD (Create / Read / Update / Delete) can be done in a million ways with a dozen tools. All I care about is tech that fits my need.
  2. My mind keeps wandering across multiple ideas. I don't pick one idea and sit on it forever. Sometimes you read papers, try some blog code and examples. It is a mix: collect lots of ideas and let your creativity and interest pick what you like.
  3. It is a race to compete on both technology and domain knowledge. I would rather focus on solving business problems and learn the required tech on a need basis than try to master tech first and then go for the domain.
  4. In 20 years so much has changed in tech. Observing new business processes and how technology shapes the 2.0 and 3.0 supply chain is great learning. Applying a mix of both domain and tech to stay on par with the evolving landscape is my interest.
  5. My health is going down these days. I would rather be comfortable and creative than push my limits. I wish to be a learner and a coder. You cannot win all the battles; pick and choose.
  6. At some point, I need to write books and more blog posts on my lessons, both personal and professional. In a way, we are in a mentally pressured space: too much competition, and clarity of purpose needs constant revival and a focused mindset.
  7. I have lost interest in money. I have connected with dozens of startups; I personally know the landscape and the tech problems they are solving. In a way, I am happy with the tech and business problems. Being able to see how things evolve is more important than titles and companies.
  8. I believe teaching is equally important, as is mastering fundamentals. Adding value does not come from being online or replying to every email. It comes from deliverables and the passion to do things personally. We may not excel at everything, but in the few things close to our heart we can be at our best.
  9. Looking back now on my 24-hour deployments and those critical moments at work, I cherish those awards and being able to stand up in times of crisis. Domain knowledge always helped me shape up, solve things, and push for them.
  10. Learn more, live the best you can, regret less, carry less guilt. In the end, only memories matter. Halfway done; keep the last half a bit more contented and satisfied.

Keep going!!!

March 22, 2021

Learning Notes - Azure Synapse Analytics

Everything evolves to the next level. Earlier it was SQL DW; now it has evolved into an MPP offering with ML components.

Azure Synapse Analytics – Azure Synapse Analytics is a new offering on Microsoft Azure. It’s a combination of SQL Data Warehouse (the MPP offering), Apache Spark, pipelines, and a workspace to manage this entire ecosystem.

What is dedicated SQL pool (formerly SQL DW) in Azure Synapse Analytics?

  • Dedicated SQL pool (formerly SQL DW) stores data in relational tables with columnar storage
  • PolyBase uses standard T-SQL queries to bring the data into dedicated SQL pool
  • Dedicated SQL pool uses PolyBase to query the big data stores.

Reference - Link

Architecture - Link

Dedicated SQL pool (formerly SQL DW) uses a node-based architecture.

Applications connect and issue T-SQL commands to a Control node. The Control node hosts the distributed query engine, which optimizes queries for parallel processing, and then passes operations to Compute nodes to do their work in parallel.

Similar to MapReduce, here you see distributed parallel processing; see the sketch below. I hope to try a few more examples in upcoming posts.
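
As a minimal sketch of the flow described above, an application can issue T-SQL to the Control node from Python; the server, database, credentials, and table below are placeholders, not a real workspace.

    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=<your-workspace>.sql.azuresynapse.net;"  # dedicated SQL pool endpoint
        "DATABASE=<your-pool>;UID=<user>;PWD=<password>"
    )
    cursor = conn.cursor()

    # Ordinary T-SQL; distribution across Compute nodes is handled by the
    # distributed query engine on the Control node, not by the client
    cursor.execute("SELECT TOP 10 * FROM dbo.FactSales;")  # hypothetical table
    for row in cursor.fetchall():
        print(row)
    conn.close()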

Building real-time enterprise analytics solutions with Azure Synapse Analytics 

Key Notes

  • Dedicated SQL pools
  • Serverless consumption pools
  • Azure Synapse Analytics


Workspace Features

  • SQL Pools
  • Spark Pools
  • Pipelines - Integration and Orchestration
  • All resources governed by a common security model
  • Connected services to expand Synapse
  • Linked services for data integration


Demo 1

  • Synapse Analytics workspace

Demo 2

  • Azure Synapse and Azure ML
  • Synapse Notebook
  • Hummingbird generates the model in ONNX format (see the sketch below)
  • Connect to the Azure ML workspace
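
A minimal sketch of the Hummingbird step (my reconstruction, not the demo's code): convert a trained scikit-learn model to ONNX. The data is random and illustrative, and onnxruntime must be installed.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from hummingbird.ml import convert

    X = np.random.rand(200, 10).astype(np.float32)
    y = np.random.randint(2, size=200)

    skl_model = RandomForestClassifier(n_estimators=10).fit(X, y)

    # Hummingbird compiles the trees into tensor operations; the "onnx"
    # backend needs sample input to trace shapes
    onnx_model = convert(skl_model, "onnx", X)
    print(onnx_model.predict(X[:5]))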



Demo 3

  • Use the model in Synapse workspace



Happy Learning!!!

March 07, 2021

Datasets Discussions

OpenEDS: Open Eye Dataset

Keynotes

  • Open Eye Dataset: eye images captured using a virtual-reality (VR) head-mounted display fitted with two synchronized eye-facing cameras
  • 12,759 images with pixel-level annotations for the key eye regions: iris, pupil, and sclera
  • 252,690 unlabelled eye images

Metadata is also recorded per participant.

Overhead MNIST: A Benchmark Satellite Dataset

  • Curated the overhead imagery from multiple public sources

More Reads

HandSeg: An Automatically Labeled Dataset for Hand Segmentation from Depth Images

UC Merced Land Use Dataset

There are 100 images for each of the following classes:

  • agricultural, airplane, baseballdiamond, beach, buildings, chaparral, denseresidential, forest, freeway, golfcourse, harbor, intersection, mediumresidential, mobilehomepark, overpass, parkinglot, river, runway, sparseresidential, storagetanks, tenniscourt

We need more MNIST-style datasets for different domains.

Keep Thinking!!!

More open source Datasets

To build more powerful ML models, we need quality, crowdsourced, free datasets. More than code or logic, data is key. We need to democratize data to build better models.

  • mnist for plants
  • mnist for vehicles
  • mnist for fishes
  • mnist for animals
  • mnist for ships
  • mnist for tools
  • mnist for equipment
  • mnist for highways
  • mnist for Indian vegetables
  • mnist for Indian pets
  • mnist for railways
  • mnist for trucks
  • mnist for gender and age bucketization
  • mnist for footwear types
  • mnist for highway symbols
  • mnist for airport safety signals
  • mnist for skin issues
  • Deepfake-based marketing
  • Deepfake-based face swapping
  • mnist for Indian cuisines
  • mnist for beauty items
  • mnist for Tamil

This needs more collective work to build better-equipped models.

Keep Thinking!!!

Preparing for certification

A bit of confusion: learning for implementing ideas vs learning for certification. Hoping to make an honest attempt to cover it. I have always been poor at MCQs, so I'll go slow: study, revise, post, blog, and see if it works :)

The handbook is key - Link

  • Building and training neural network models using TensorFlow 2.x
  • Image classification - CNN, ImageDataGenerator (see the sketch after this list)
  • Natural language processing (NLP) - binary categorization, multi-class categorization, LSTM
  • Time series, sequences, and predictions
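
For the image classification topic, here is a minimal TensorFlow 2.x sketch of a small CNN fed by ImageDataGenerator; the data directory is a placeholder for any folder of class subdirectories.

    import tensorflow as tf
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    train_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
        "data/train",              # hypothetical folder of class subdirectories
        target_size=(150, 150),
        batch_size=32,
        class_mode="binary",
    )

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(150, 150, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(train_gen, epochs=5)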

Youtube playlist - Link1 

Good Reference of materials - Link

Example codes - Link

For every topic:

  • Refer to the TensorFlow documentation
  • Git examples
  • Example codes
  • Summary of concepts and learnings
  • PyCharm familiarity

Books

  • Data Science on the Google Cloud Platform - Valliappa Lakshmanan
  • Machine Learning Design Patterns - Valliappa Lakshmanan, Sara Robinson, Michael Munn

Start Slowly and Keep Going!!!

March 01, 2021

Back to Basics - Fundamentals - RNN - Transformers

It takes a bit of careful *attention* to understand the crux of transformers. This lecture was useful.

Slides - Link

Session - 

Transfer Learning

  • Use a neural network pretrained on ImageNet and fine-tune it on custom data
  • Better performance than anything else

Convert words to vectors

  • One-hot encoding
  • Scales poorly with vocabulary size
  • Sparse and high-dimensional
  • Map one-hot vectors to dense vectors (embedding matrix)
  • Finding the embedding matrix - learn it as part of the task
  • Learn a language model
  • Train on a large corpus of text - Wikipedia
  • N-grams, with a sliding window forming the rows
  • Binary classification - 0 / 1 - neighbouring word or not (see the sketch below)
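
Here is a minimal Keras sketch of that binary "neighbouring word or not" formulation (word2vec-style; the vocabulary size and dimensions are made up):

    import tensorflow as tf

    VOCAB, DIM = 10000, 64

    word_in = tf.keras.Input(shape=(), dtype="int32")   # centre word id
    word_ctx = tf.keras.Input(shape=(), dtype="int32")  # candidate neighbour id

    emb = tf.keras.layers.Embedding(VOCAB, DIM)         # the embedding matrix
    v_in, v_ctx = emb(word_in), emb(word_ctx)

    # Dot product scores "are these neighbours?"; sigmoid maps it to 0/1
    score = tf.keras.layers.Dot(axes=1)([v_in, v_ctx])
    prob = tf.keras.layers.Activation("sigmoid")(score)

    model = tf.keras.Model([word_in, word_ctx], prob)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    # Training pairs come from sliding windows over a large corpus:
    # positives are real neighbours, negatives are randomly sampled words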

NLP's ImageNet moment - ELMo / ULMFiT

  • ELMo - bidirectional stacked LSTMs
  • ULMFiT

Good Paper Read - SQuAD: 100,000+ Questions for Machine Comprehension of Text

Attention

  • Only attention, no LSTMs
  • Self-attention, positional encoding, layer normalization
  • Attention and fully connected layers

Self Attention

  • Input: a sequence of vectors
  • Output: a weighted sum of the input sequence

Learn weights - each input vector is used in three ways:

  • Compared to every other vector to compute attention weights for its own output y_i (query)
  • Compared to every other vector to compute the attention weight w_ij for output y_j (key)
  • Summed with the other vectors to form the result of the attention-weighted sum (value)
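
A minimal numpy sketch of self-attention as described above: each output is a weighted sum of the inputs, with weights from query-key dot products (all dimensions made up):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    t, k = 5, 8                    # sequence length, embedding dimension
    x = np.random.randn(t, k)      # input sequence of vectors

    # These three matrices are learned during training; random here
    W_q, W_k, W_v = (np.random.randn(k, k) for _ in range(3))

    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    weights = softmax(Q @ K.T / np.sqrt(k))  # (t, t) attention weights w_ij
    y = weights @ V                          # each y_i is a weighted sum of values
    print(y.shape)                           # (5, 8)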

Multihead attention

  • Weight matrices - query, key, value weights
  • Multiple heads of attention just mean learning different sets of query, key and value matrices simultaneously
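
Continuing the sketch above, multiple heads are just several (W_q, W_k, W_v) sets run side by side, concatenated, and mixed back to k dimensions:

    heads = 4
    outs = []
    for _ in range(heads):
        # Each head gets its own smaller query/key/value matrices
        W_q, W_k, W_v = (np.random.randn(k, k // heads) for _ in range(3))
        Q, K, V = x @ W_q, x @ W_k, x @ W_v
        outs.append(softmax(Q @ K.T / np.sqrt(k // heads)) @ V)

    # Concatenate the head outputs and mix them with a final linear map
    y = np.concatenate(outs, axis=-1) @ np.random.randn(k, k)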

Transformer

  • Self-attention layer - layer normalization - dense layer
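
A minimal Keras sketch of that stack: self-attention, add & layer-norm, then a position-wise dense layer (the sizes are my own, not from the lecture):

    import tensorflow as tf

    def transformer_block(k=64, heads=4):
        inp = tf.keras.Input(shape=(None, k))  # (sequence length, embedding dim)
        att = tf.keras.layers.MultiHeadAttention(
            num_heads=heads, key_dim=k // heads
        )(inp, inp)                            # self-attention: query = value = inp
        x = tf.keras.layers.LayerNormalization()(inp + att)  # residual + norm
        ff = tf.keras.layers.Dense(4 * k, activation="relu")(x)
        ff = tf.keras.layers.Dense(k)(ff)      # position-wise feed-forward
        out = tf.keras.layers.LayerNormalization()(x + ff)   # residual + norm
        return tf.keras.Model(inp, out)

    transformer_block().summary()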



Layer Normalization

  • Related to data scaling and weight initialization
  • Rescales values to a uniform mean and standard deviation

Position Embedding

  • Word embedding depends on the word
  • Position embedding depends on the position
  • Combine both and run through the transformer (see the sketch below)
  • Both position and content are reasoned about
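
A minimal sketch of combining learned word and position embeddings before the transformer; all sizes are made up:

    import tensorflow as tf

    VOCAB, MAX_LEN, DIM = 10000, 128, 64
    tok_emb = tf.keras.layers.Embedding(VOCAB, DIM)    # depends on the word
    pos_emb = tf.keras.layers.Embedding(MAX_LEN, DIM)  # depends on the position

    tokens = tf.random.uniform((2, MAX_LEN), maxval=VOCAB, dtype=tf.int32)
    positions = tf.range(MAX_LEN)                      # 0, 1, ..., MAX_LEN-1

    x = tok_emb(tokens) + pos_emb(positions)           # broadcast over the batch
    print(x.shape)                                     # (2, 128, 64)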

Attention is all you need

  • Translation
  • Encoder - Decoder architecture

GPT - Generative pretrained transformer

  • Generates text
  • ELMo, ULMFiT
  • Predicts from preceding words
  • GPT-2: 1.5 billion parameters

BERT

  • Bidirectional encoder representations from transformers

T5 - Text to Text Transfer Transformer

  • Input and output as text streams
  • 11 billion parameters

Keep Thinking!!!