"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

January 04, 2022

Attention - Lessons

There is a gap between using things as an application developer and knowing how they actually work. I am still trying to figure that out, be it backpropagation, network design, or attention.

Summarizing my lessons from a wonderful thread on the topic:

  • Lesson #1 - The encoder takes the source embeddings and the source mask
  • Lesson #2 - The decoder takes the target embeddings and the target mask (plus the encoder output)
  • Lesson #3 - The encoder is a stack of encoder layers connected in sequence: Encoder1 -> output -> Encoder2 -> ... -> EncoderN
  • Lesson #4 - The output of EncoderN feeds the decoder (via attention over the encoder output in each decoder layer); the decoder likewise has several decoder layers
  • Lesson #5 - Each encoder layer contains a self-attention sublayer followed by a feed-forward sublayer (see the sketches after this list)
  • Lesson #6 - In an RNN/LSTM we carry history through gates (input gate, forget gate, output gate) and the cell state. Here the connection of the sequence back to itself is called self-attention, and it plays a similar role of keeping the sequence history
  • Lesson #7 - Multi-head attention = multiple self-attention heads run in parallel and concatenated
  • Lesson #8 - Self-attention = each position attends over its own sequence (1-2-3, 1-2-3); a weighted percentage of the sequence is picked up as historical context, which influences the next-token prediction
  • Lesson #9/#10/#11 - The forward function of attention = softmax + matrix multiplication (see the first sketch below)
  • Lesson #12/#13/#14 - The decoder has similar attention sublayers: masked multi-head self-attention, plus attention over the encoder output
  • Lesson #15/#16 - Padding masks and positional encoding are applied together with the embedding layer (see the sketch below)
  • Lesson #17/#18 - A linear layer + softmax is applied to the decoder output to produce token probabilities (see the sketch below)
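
To make Lessons #9/#10/#11 concrete, here is a minimal sketch of scaled dot-product attention in PyTorch: a matrix multiplication of queries against keys, a softmax to turn scores into weights, and another matrix multiplication against the values. The function name and tensor shapes are my own illustration, not code from the thread.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value, mask=None):
    """softmax(Q @ K^T / sqrt(d_k)) @ V"""
    d_k = query.size(-1)
    # Matrix multiplication: score every query against every key.
    scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions (padding or future tokens) are pushed to -inf
        # so that softmax gives them ~zero weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Softmax converts scores into weights that sum to 1 over the sequence.
    weights = F.softmax(scores, dim=-1)
    # Weighted sum of values: the "percentage of the sequence" that gets picked up.
    return torch.matmul(weights, value), weights
```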
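
Lesson #7 in the same spirit: multi-head attention is several self-attention heads run in parallel on slices of the model dimension, then concatenated and mixed by one final linear layer. A rough sketch reusing the scaled_dot_product_attention function above; d_model=512 and num_heads=8 are assumed defaults, not values from the thread.

```python
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    """Several self-attention heads in parallel, concatenated back together."""

    def __init__(self, d_model=512, num_heads=8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # project to Q, K, V in one go
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x, mask=None):
        batch, seq_len, d_model = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        # Reshape so each head attends over its own slice of d_model.
        def split_heads(t):
            return t.view(batch, seq_len, self.num_heads, self.d_head).transpose(1, 2)

        q, k, v = split_heads(q), split_heads(k), split_heads(v)
        context, _ = scaled_dot_product_attention(q, k, v, mask)
        # Concatenate the heads and mix them with a final linear layer.
        context = context.transpose(1, 2).contiguous().view(batch, seq_len, d_model)
        return self.out(context)
```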
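
Lessons #3 and #5 together: each encoder layer is a self-attention sublayer followed by a position-wise feed-forward sublayer, and the encoder is simply N of these stacked one after another. The residual connections and layer norms come from the original Transformer paper rather than the thread; the hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """Self-attention + feed-forward, each wrapped in residual + layer norm."""

    def __init__(self, d_model=512, num_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.self_attn = MultiHeadSelfAttention(d_model, num_heads)
        self.feed_forward = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, src_mask=None):
        # Residual connection around the self-attention sublayer.
        x = self.norm1(x + self.dropout(self.self_attn(x, src_mask)))
        # Residual connection around the feed-forward sublayer.
        x = self.norm2(x + self.dropout(self.feed_forward(x)))
        return x

# Encoder1 -> output -> Encoder2 -> ... -> EncoderN (Lesson #3)
encoder_stack = nn.ModuleList([EncoderLayer() for _ in range(6)])
```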
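
And Lessons #15-#18: sinusoidal positional encoding added on top of the embeddings, a padding mask built from the token ids, and a final linear + softmax over the decoder output. pad_id=0, the vocabulary size, and the tensor shapes are assumptions for illustration only.

```python
import math
import torch
import torch.nn as nn

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding that gets added to the token embeddings."""
    position = torch.arange(seq_len).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

def padding_mask(token_ids, pad_id=0):
    """1 where there is a real token, 0 where the sequence was padded."""
    return (token_ids != pad_id).unsqueeze(1).unsqueeze(2)

# Final step (Lessons #17/#18): linear + softmax turns the decoder output
# into a probability distribution over the vocabulary.
vocab_size, d_model = 10000, 512
generator = nn.Linear(d_model, vocab_size)
decoder_output = torch.randn(2, 7, d_model)   # (batch, target_len, d_model)
probs = torch.softmax(generator(decoder_output), dim=-1)
```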

Keep Exploring!!!
