- Download VS Code, then copy it over and run it locally
- Install the latest Anaconda Spyder
- Install Office 365
- Install Docker on Mac for the Intel chip
- Get acquainted with Terminal on Mac
- Terminal - Spyder one click
- Install the GCP Cloud SDK kit with ./install.sh
July 30, 2021
Setup Mac Days - Day #1 - Installation
July 26, 2021
Business - Technology - Passion
Business Problems
- Know how it works
- How technology helps
- Who are pioneers in the space
- What is a basic business process, advanced
- Who are the stakeholders
Technology Problems
- What aspects tech solves (Data / Reporting / Ordering)
- What tools to pick considering the scale
- What POCs you need to work on
- What are different smaller tasks (Data Schema / Transactions / Reporting)
- Data to services development
- Bring the big picture in course of time
Passion
- In everything in life (relationships, jobs), we will not get 100% of what we like
- Look at the positive side of things until you know the business + tech landscape well enough to apply it
- Everything has a melting point. A long list of experiences leads to big decisions about how things fit into your perspective
- Think towards possible end-to-end solutions/patent opportunities with a mix of research + prototype + code
- Great solutions are a collection of simple ideas + good-to-have improvements + learning from the market/tech, incorporating things that make a difference + constantly adding small improvements
Keep Going!!!
July 24, 2021
Picking up new areas of Learning for Building Solutions
Every time I am tasked with something new, I work through the list below to gather knowledge
- Find evidence / learning from past work
- Read through books to understand the fundamentals
- Pick / Parse interesting YouTube videos and build your understanding
- Experiment with code snippets wherever possible
- Build a broader context of understanding
- Refine it further based on latest trends
- Look for research papers on the topic
- Build a version of the solution / approach from your own learning
Keep Going!!!
Leadership Perspectives
- Empower every leader in the team with their own areas, ownership, and autonomy to build the best
- Ensure there are no territorial borders in solutions. Every team is open-minded and willing to discuss the best solutions
- Build the storyline from the customer's perspective: what works best for the customer? Are we thinking in that direction?
- A team does not win because of one star performer alone; getting the best from everyone makes it a self-performing team
- When teams know their purpose and goals, they will outperform your processes and tools
- Learning pillar, technology pillar, customer pillar, working product - Everything is a blend of all these pillars
- Diverse Thinking
- Humility to re-learn
- Come out of Intellectual arrogance
- Able to discuss contrarian views
- Street Kid lessons - Persistence Pays, Talk to decision-maker, Do not talk about money, More money to be taken from this person
- Creativity and innovation need a psychologically safe environment
- Promote different thinking, Promote diversity
- Wisdom - Ability to hold two contrarian ideas
- Supplement bookish knowledge with on-ground knowledge
- Artificial stimulants for equal participation
Growth oriented learning
- What is bad? - Only I can do it
- What is learning? - He has done it well, let me solve it in my way
- Sometimes people share knowledge while leaving out the key areas, or by stressing the key areas less and everything else more
- Intentions will stand out in the long run
- Everyone can learn everything, Be Genuine, Be King, Stay True!!!
Evolution of best practices
- Certain things you have done, experimented with, and proven become best practices
- You probe certain implementations from your failures to validate the ideas and propose them
- You read up / reference similar problems solved in the domain
- As always, it is a mix of code / read / share / build, and experimenting to find what works best
- Keep an open mind and a balance of learning/coding to get things to customers quickly and keep improving them
Keep Learning!!!
July 22, 2021
How to learn quickly?
- Know your end goal: what you want to accomplish
- Know where to look up good short lessons (GitHub / Blogs / Books / YouTube videos)
- Follow the path laid out by a few sources
- If all steps fail, take a break and come back again. Sometimes we need a break to get a new perspective
- Save your steps - Navigation links / Commands
- Reach out to Stack Overflow / friends who are experts in that area
- Document your working steps
- Share it with your wider audience in your books/blogs
- Everyone has a way of doing things. When you learn from some source, you also need to be a learning source for someone else
A solution can be built in multiple ways. You must have a working skeleton of your thoughts before you look to optimize it.
Keep Learning!!!
July 21, 2021
Let's build a product
- Learn the required skills
- Design for scalability
- Learn from mistakes
- Stay focused for six months
- Win or lose, let's face it
- Make it bootstrapped
- Code and Learn one step at a time
- Make some wins, failures, smiles, and emotions
Let's Keep Learning!!!
July 18, 2021
Edge Deployment Optimization thoughts
- Deploy lightweight models. Deploy quantized models
- Minimal edge processing, Detailed cloud processing
- Message loss prevention with Queues and async processing
- Transfer only selected frames instead of videos
- Offline video upload to cloud vs Real-time selected image upload for real-time notifications
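The "selected frames" and "queues + async processing" ideas above can be sketched together. This is only a toy sketch under my own assumptions: frames are flat lists of grayscale pixel values, the mean-absolute-difference test and its threshold are hypothetical choices, and the "upload" is a stand-in that just records the frame index.

```python
import queue
import threading
from typing import List, Sequence


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_frames(frames: List[Sequence[int]], threshold: float = 10.0) -> List[int]:
    """Return indices of frames that differ enough from the last selected frame."""
    selected: List[int] = []
    last = None
    for i, frame in enumerate(frames):
        if last is None or mean_abs_diff(frame, last) > threshold:
            selected.append(i)
            last = frame
    return selected


def upload_worker(q: "queue.Queue", uploaded: List[int]) -> None:
    """Drain the queue until the None sentinel arrives (stand-in for a cloud upload)."""
    while True:
        item = q.get()
        if item is None:
            break
        uploaded.append(item)


def run_pipeline(frames: List[Sequence[int]], threshold: float = 10.0) -> List[int]:
    """Select frames on the 'edge' and hand them to an async uploader via a bounded queue."""
    q: "queue.Queue" = queue.Queue(maxsize=100)  # bounded, so the producer blocks instead of dropping
    uploaded: List[int] = []
    worker = threading.Thread(target=upload_worker, args=(q, uploaded))
    worker.start()
    for idx in select_frames(frames, threshold):
        q.put(idx)
    q.put(None)  # tell the worker there is nothing more to send
    worker.join()
    return uploaded


frames = [
    [0, 0, 0, 0],
    [1, 0, 2, 0],      # almost identical to frame 0
    [50, 60, 40, 55],  # scene changed
    [52, 61, 41, 54],  # almost identical to frame 2
]
print(run_pipeline(frames))
```

On a real device the queue would more likely be a persistent message broker so that messages survive restarts; the in-memory queue here only shows the shape of the idea.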
July 17, 2021
RetailVisionWorkshop2021 Notes
- Physical stores are becoming digital
- Products more easily searchable
- Better experiences at stores
- Minimize loss of sales
- Product Detection Challenges
- Pricing challenges based on data
- Price from bounding boxes
- Remove promotion content and read price content
- Country differences
- Winning Solution
- 360 degree camera to scan everything in store
- 3D reconstruction using structure from motion
- Shelf detections in 360 cameras
- Identify Shelf level information
- Optimal robot position to capture shelf images
- Create Digital twin duplicate product
- Assortment planning for online vs offline
- Large scale embedding for product recognition
- Anchor image
- Distances corresponding to same product
- Same vs Different products
- ArcFace Loss
- Every product has centres
- Compare anchors and centres
- Feature map at top of Network
- Average over spatial dimensions
- Dynamic Shelf Reality
- New Visual designs of products
- Combination of techniques
- Similar products
- Product Category
- Clustering for similar images
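The "ArcFace Loss / every product has centres / compare anchors and centres" notes above can be sketched roughly as below. This is my own simplified reading, not the workshop's implementation: the embedding and centres are plain Python lists, the scale and margin values are hypothetical, and the edge-case handling that real implementations apply when the angle plus margin exceeds pi is left out.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def arcface_logits(embedding, centres, target, scale=10.0, margin=0.5):
    """Scaled cosine logits, with an additive angular margin on the target class only."""
    logits = []
    for k, centre in enumerate(centres):
        cos_t = max(-1.0, min(1.0, cosine(embedding, centre)))
        if k == target:
            # Penalise the true class: cos(theta + m) is smaller than cos(theta)
            cos_t = math.cos(math.acos(cos_t) + margin)
        logits.append(scale * cos_t)
    return logits


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Softmax cross-entropy, computed with the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def nearest_centre(embedding, centres) -> int:
    """At inference time, pick the product whose centre is most similar."""
    sims = [cosine(embedding, c) for c in centres]
    return sims.index(max(sims))


centres = [[1.0, 0.0], [0.0, 1.0]]  # one hypothetical centre per product
anchor = [0.9, 0.1]                 # embedding of an anchor image of product 0

with_margin = cross_entropy(arcface_logits(anchor, centres, target=0, margin=0.5), 0)
no_margin = cross_entropy(arcface_logits(anchor, centres, target=0, margin=0.0), 0)
print(nearest_centre(anchor, centres), with_margin > no_margin)
```

The margin makes training harder for the true class, which pushes embeddings of the same product closer to their centre and further from other centres.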
ML Lessons from Production Implementation
Good Article Link. The summary is very good
For a few of the lessons, I have added my personal observations.
1. Subject matter experts have as much impact as data scientists
- Fact - "much of the challenge is getting the right data."
- Add-on - "much of the challenge is getting the right data and creating the right insights / correct observations / finding hidden patterns with domain knowledge / looking beyond the data at what drives it"
2. The first iteration is always on the labeling taxonomy - "In vision projects, having the right labeled data becomes essential for detection, extraction, analysis, etc."
3. The ROI on fast feedback is huge - rapid prototyping and de-risking of projects. - "People lose confidence without seeing the value realization. Getting the business involved early, understanding their KPIs, and measuring the impact of the ML solution are key to the success of the project"
4. ML tools should be data-centric but model-backed - "It's a tradeoff to learn domain vs ML vs DevOps vs new tools in the market. Often end customers do not see ML as a standalone item; it sits together with their existing data warehouse. You need to be practical and pick the tools that make it less complicated to integrate with the current environment and build a successful use case."
#datascience #analytics #domainknowledge
Keep Thinking!!!
July 15, 2021
July 11, 2021
Next Reading To-do List
- Awesome production machine learning
- Model Serving and Monitoring
- Feature Stores
- Feature Engineering Automation
- Awesome AI Guidelines
The reading to-do list never ends. Learn, code, experiment, and add your own learnings.
Keep Thinking!!!
Big Picture Needs Bigger Perspectives
- How you manage data vs. how well you know the flows
- How much you understand data
- How much you avoid data duplication
- How much you have data lineage
- How much you have data privacy handled
- How decentralized, flexible, and updated records are present
Getting complete knowledge goes beyond just collecting, streaming, and storing data. Every insight and every piece of domain knowledge matters.
MLOps, feature store tools - “When all you have is a hammer, everything starts to look like a nail.” Learn the domain before using tools. Kaggle data and real-world data are different.
Data Ownership - Data Understanding
- Database Developer - Designs schema in context of performance, index, tracking
- BI Developer - Designs Schema in terms of running aggregations, Reports, Tracking, and Tracing Updates
- Machine Learning Engineer - Understands features, picks the relevant ones for Machine learning Algos
- MLops - Builds a feature store pipeline to get all the data
- Security Engineer / Data Engineer - Plays the role of masking PII data; runs before the data pipeline
- With so many perspectives, how do all these folks have the same data understanding?
- How many versions of data will we keep?
- Where are the data dictionary and rolling updates shared and maintained?
- Leverage OLAP as the ML feature store. Do not complicate things with multiple layers of data, versions, etc.
Keep Thinking!!
Products are built to fail.
In many ways we underestimate the impact of domain knowledge. Can we have one forecasting algorithm for
- Retail Product Sales
- Oil Sales
- Stocks Predictions
- Car Sales
If everything could be built with just one algorithm, we would need to close all ML shops in a month. We underestimate domain knowledge and believe that fancy tech and tools can read the data and do all the fine-tuning for us.
Keep going. Sometimes tech does not understand business, and products are built to fail.
Knowledge is
- Mapping business to tech to support future business changes
- Making it flexible to scale, port, migrate
- Think Business first, Scale next, Tech at last
- Domain understanding - Technology evolves faster than we think. New forms of business evolve
- Data understanding - Know the type of data - fast / slow data
- Research paper - Insights / Blogs - Look for Leaders in the space and their tech stack, Look for research papers and insights
- Model development / Model implementation
July 10, 2021
Technology learning
Technology learning - Sometimes we overrate what we don't know. The fundamentals remain the same. Many times we do not connect past learnings. Spark and SQL Server lessons, for example, can be compared conceptually, through examples, and through implementation (making data immutable with RDDs, etc.). I liked this comparison - "Keep in mind spark uses memory much in the same way as sql server uses the buffer pool by storing frequently used objects in memory it reduces overall I/O and improves performance in large joins, sort and aggregates contrast this with a traditional hadoop based architecture which relies heavily on writing data out to disk between steps." Every technical concept maps to an advancement over, or a limitation of, something that already existed. We need more connected learning!!!
July 08, 2021
Computer Vision Lip Reading - Use Case Analysis
Paper #1 - Computer Vision Lip Reading
Key Notes
- Extract Face, Extract Lips / Mouth area
- Depth map with an MS Kinect sensor
- Dlib based face landmarks
- Deep network trained for numbers detection
Paper #2 - Deep Learning for Lip Reading using Audio-Visual Information for Urdu Language
Key Notes
- A sequence of T frames is used as input, and is processed by 3 layers of STCNN, each followed by a spatial max-pooling layer
- Explore as words, Digits
Lipreading Demo by Convolutional Neural Network, Link2
More Reads
- HLR-Net: A Hybrid Lip-Reading Model Based on Deep Convolutional Neural Networks
- Automatic Lip-Reading System Based on Deep Convolutional Neural Network and Attention-Based Long Short-Term Memory
Keep Thinking!!!
Forecasting Notes
Paper #1 - Time Series Forecasting Principles with Amazon Forecast
Types of Forecasting
- Long term - Strategic
- Short term - Operations day to day business
- Promotions - Seasonal based
- Impact of price, promotion on sales numbers
Key parameters in Retail
- SKU, timestamp, units sold at SKU level
- SKU metadata - color, department, size
- Price data - Price at that point in time
- Promotional information for the SKU
- In-stock or purchased product
The sales forecast could be done at each SKU level
Forecast (Target) - Units sold = (Day of week) + WeekendFlag + PromotionalFlag + IsSeasonalProduct + IsTop10SellerForseason + IsTop10inOnlinechannel + IsForAllAgegroups + IsforOld + IsforTeens + IsLowAlcoholic + IsAllweatherItem + Weatherofday + ProductPriceontheDay + IsthereBundleOffer
Additional Insights of time
‘Year’, ‘Month’, ‘Week’, ‘Day’, ‘Dayofweek’, ‘Dayofyear’, ‘Is_month_end’, ‘Is_month_start’, ‘Is_quarter_end’, ‘Is_quarter_start’, ‘Is_year_end’, and ‘Is_year_start’.
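The time features listed above can be derived from a plain date. A minimal sketch using only the standard library; the function name time_features is my own, ‘Week’ is taken as the ISO week number, ‘Dayofweek’ uses Monday = 0, and WeekendFlag is added from the forecast formula above.

```python
import calendar
from datetime import date
from typing import Dict, Union


def time_features(d: date) -> Dict[str, Union[int, bool]]:
    """Derive calendar features for one sales date."""
    last_day = calendar.monthrange(d.year, d.month)[1]  # number of days in this month
    return {
        "Year": d.year,
        "Month": d.month,
        "Week": d.isocalendar()[1],      # ISO week number
        "Day": d.day,
        "Dayofweek": d.weekday(),        # Monday = 0 ... Sunday = 6
        "Dayofyear": d.timetuple().tm_yday,
        "Is_month_end": d.day == last_day,
        "Is_month_start": d.day == 1,
        "Is_quarter_end": d.month in (3, 6, 9, 12) and d.day == last_day,
        "Is_quarter_start": d.month in (1, 4, 7, 10) and d.day == 1,
        "Is_year_end": d.month == 12 and d.day == 31,
        "Is_year_start": d.month == 1 and d.day == 1,
        "WeekendFlag": d.weekday() >= 5,  # Saturday or Sunday
    }


features = time_features(date(2021, 6, 30))
for name, value in features.items():
    print(name, value)
```

If the data already lives in a dataframe library, its built-in date accessors would normally replace this helper.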
Data Insights
- Aggregate sales by week, day, quarter, holidays, weekends
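A small sketch of the aggregation idea above, using made-up (date, units sold) pairs. The sample numbers are purely illustrative, and only the week and weekend/weekday cuts are shown.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical daily sales for one SKU: (sale date, units sold)
sales: List[Tuple[date, int]] = [
    (date(2021, 7, 3), 5),  # Saturday
    (date(2021, 7, 4), 7),  # Sunday
    (date(2021, 7, 5), 2),  # Monday
]


def aggregate_by_week(rows: List[Tuple[date, int]]) -> Dict[Tuple[int, int], int]:
    """Total units per (ISO year, ISO week)."""
    totals: Dict[Tuple[int, int], int] = defaultdict(int)
    for day, units in rows:
        iso = day.isocalendar()
        totals[(iso[0], iso[1])] += units
    return dict(totals)


def aggregate_by_weekend(rows: List[Tuple[date, int]]) -> Dict[str, int]:
    """Total units split into weekend and weekday sales."""
    totals = {"weekend": 0, "weekday": 0}
    for day, units in rows:
        totals["weekend" if day.weekday() >= 5 else "weekday"] += units
    return totals


print(aggregate_by_week(sales))
print(aggregate_by_weekend(sales))
```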
Handling Missing Data
- Zero filling
- NaN
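The choice between the two options above matters because it changes what the model sees. A toy sketch with made-up numbers: zero filling treats a missing day as "no sales", while NaN marks it as "unknown" so it can be skipped.

```python
import math
from typing import List, Optional


def zero_fill(series: List[Optional[float]]) -> List[float]:
    """Treat missing observations as zero units sold."""
    return [0.0 if v is None else float(v) for v in series]


def nan_fill(series: List[Optional[float]]) -> List[float]:
    """Mark missing observations as NaN (unknown)."""
    return [math.nan if v is None else float(v) for v in series]


def mean_ignoring_nan(values: List[float]) -> float:
    """Average over the observed values only."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present)


units = [5, None, 7]  # the middle day was not recorded
zero_mean = sum(zero_fill(units)) / len(units)
nan_mean = mean_ignoring_nan(nan_fill(units))
print(zero_mean, nan_mean)
```

Zero filling pulls the average demand down, which is right when the product really was available and did not sell, and wrong when the product was out of stock or the data was simply not captured.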
The weighted quantile loss (wQuantileLoss) calculates how far the forecast is from actual demand in either direction as a percentage of demand on average in each quantile
For the p10 forecast, the true value is expected to be lower than the predicted value 10% of the time
For the p90 forecast, the true value is expected to be lower than the predicted value 90% of the time
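A rough sketch of the weighted quantile loss, following the definition as I understand it from the Amazon Forecast documentation: twice the summed quantile (pinball) loss divided by the summed absolute demand. The function names and sample numbers are my own.

```python
from typing import Sequence


def weighted_quantile_loss(actuals: Sequence[float], forecasts: Sequence[float], tau: float) -> float:
    """wQL at quantile tau: 2 * sum(pinball loss) / sum(|actual|)."""
    total = 0.0
    for y, q in zip(actuals, forecasts):
        under = max(y - q, 0.0)  # forecast below actual demand
        over = max(q - y, 0.0)   # forecast above actual demand
        total += tau * under + (1.0 - tau) * over
    return 2.0 * total / sum(abs(y) for y in actuals)


def coverage(actuals: Sequence[float], forecasts: Sequence[float]) -> float:
    """Share of points where the true value is below the predicted quantile."""
    below = sum(1 for y, q in zip(actuals, forecasts) if y < q)
    return below / len(actuals)


actual_units = [10.0, 20.0, 30.0]
p90_forecast = [12.0, 18.0, 33.0]  # made-up p90 predictions

print(weighted_quantile_loss(actual_units, p90_forecast, tau=0.9))
print(coverage(actual_units, p90_forecast))
```

At tau = 0.9 an under-forecast is penalised nine times as heavily as an over-forecast of the same size, which is why the p90 forecast ends up above the true value most of the time.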
Models
- ARIMA
- Prophet
- DeepAR+
- Vector Autoregressive Moving Average with eXogenous regressors model (VARMAX)
Link #2 - Time series forecasting
Forecast multiple steps:
- Single-shot: Make the predictions all at once.
- Autoregressive: Make one prediction at a time and feed the output back to the model.
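The difference between the two approaches is mostly in the control flow. A toy sketch: both "models" here are hypothetical stand-ins (a moving average and a straight-line extrapolation), not anything from the linked article.

```python
from typing import List, Sequence


def one_step_model(window: Sequence[float]) -> float:
    """Hypothetical one-step model: predict the mean of the recent window."""
    return sum(window) / len(window)


def autoregressive_forecast(history: Sequence[float], steps: int, window: int = 3) -> List[float]:
    """Predict one step at a time and feed each prediction back as input."""
    values = list(history)
    predictions: List[float] = []
    for _ in range(steps):
        pred = one_step_model(values[-window:])
        predictions.append(pred)
        values.append(pred)  # the model's own output becomes part of its next input
    return predictions


def single_shot_forecast(history: Sequence[float], steps: int) -> List[float]:
    """Hypothetical multi-output model: extrapolate the overall trend for all steps in one call."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(steps)]


history = [1.0, 2.0, 3.0]
print(autoregressive_forecast(history, steps=2))
print(single_shot_forecast(history, steps=2))
```

In the autoregressive loop any error in an early prediction is carried into the later ones, while the single-shot version has to learn every horizon at once.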
Evaluation of Time Series Forecasting Models for Estimation of PM2.5 Levels in Air
More Reads
Taxonomy of Time Series Forecasting Problems
Time Series Forecasting With Deep Learning: A Survey
Keep Thinking!!!
July 04, 2021
One Liners, Concepts, Slowly Changing Dimensions
SCD Summary
Sometimes one link is good enough to summarize
- Type 1 - Overwrite previous value
- Type 2 - Add new row, deactivate the old record, activate the new one
- Type 3 - Add new attribute (a column holding the previous value) - Activation Date / Effective Date
- Type 4 - Add History Table
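Types 1 and 2 above can be sketched on an in-memory dimension table. The column names (customer_id, city, start_date, end_date, is_active) and the sample values are hypothetical; in a warehouse this would be an UPDATE, or an UPDATE plus INSERT.

```python
from datetime import date
from typing import Dict, List

Row = Dict[str, object]


def scd_type1(rows: List[Row], customer_id: int, city: str) -> None:
    """Type 1: overwrite the previous value, keeping no history."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["city"] = city


def scd_type2(rows: List[Row], customer_id: int, city: str, effective: date) -> None:
    """Type 2: deactivate the current row and add a new active row."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_active"]:
            row["is_active"] = False
            row["end_date"] = effective
    rows.append({
        "customer_id": customer_id,
        "city": city,
        "start_date": effective,
        "end_date": None,
        "is_active": True,
    })


dim_customer: List[Row] = [{
    "customer_id": 1,
    "city": "Pune",
    "start_date": date(2020, 1, 1),
    "end_date": None,
    "is_active": True,
}]

scd_type2(dim_customer, customer_id=1, city="Chennai", effective=date(2021, 7, 4))
for row in dim_customer:
    print(row)
```

After the Type 2 change the table holds two rows for the same customer, and only the newest one is active, so reports can still be run "as of" an earlier date.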
Docker - Docker is a tool designed to make it easier to create, deploy, and run applications by using containers
Kubernetes - Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services
Docker vs VM
- In Docker, the containers running share the host OS kernel
- A Virtual Machine, on the other hand, is not based on container technology. It is made up of the user space plus the kernel space of its own operating system
More Reads
Keep Simplifying Concepts!!!
July 02, 2021
Learning vs Knowing vs Experimenting Vs Measure of Skills
A project work X needs 10 different things
- 4 things you have worked on in multiple projects; you know how they work
- 3 things you did a hello world for; you know the basics
- 3 things you read up on Stack Overflow to fill the gaps
The goal is to get a working implementation of the idea. You know a few things but have not done a deep dive. You implemented a few others and went deep, because you worked on them in multiple projects.
We may not master or remember all 10, and we cannot wait to master all 10 before building our idea. The measure of knowledge is the ability to experiment and build; it is not just familiarity with all 10 tools or technologies. It is time to change the way we look at skills.
Keep Thinking!!!