- CNN / feed-forward nets - each output depends only on the current input; outputs are independent of one another, and feed-forward nets don't remember historic input data
- RNN - the hidden state acts as memory, creating a correlation between the previous input and the next input; LSTM variants add a cell state and a forget gate
- RNN (LSTM) - learns to keep only the information relevant for making predictions and to forget non-relevant data; the cell state works like a conveyor belt carrying information along the sequence
- RNN - performs well when the input data is interdependent in a sequential pattern: the correlation between the previous input and the next input lets the network carry a bias from its previous output into the next prediction (see the sketch after this list)
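A minimal NumPy sketch of that recurrence, assuming illustrative sizes and random, untrained weights (the names `W_xh`, `W_hh`, `rnn_step` are just for this example); the only point is that the hidden state `h` carries information from earlier inputs into later steps:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 8
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden (the "memory" path)
b_h = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    # The new hidden state mixes the current input with the previous hidden
    # state, so earlier inputs influence later outputs. (LSTMs add gates,
    # e.g. a forget gate, to decide what to keep in the cell state.)
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

sequence = rng.normal(size=(5, input_dim))  # 5 time steps of toy input
h = np.zeros(hidden_dim)                    # hidden state starts empty
for x_t in sequence:
    h = rnn_step(x_t, h)                    # h carries context forward step by step
print(h.shape)  # (8,)
```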
Transformer
- Positional embeddings - encode the order and position of tokens in a sequence
- Self attention - allows each token to dynamically weigh and integrate information from all other positions
- The self-attention mechanism is a type of attention that lets every element of a sequence interact with every other element and work out which ones it should pay more attention to.
- Multi-head attention runs multiple self-attention processes in parallel, capturing diverse aspects of the data (see the sketch below)
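A minimal NumPy sketch of the three ideas above - sinusoidal positional encodings, scaled dot-product self-attention, and multi-head attention - assuming toy dimensions and random weights in place of learned parameters, and omitting details such as the output projection and masking:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encodings: each position gets a unique pattern
    # of sines and cosines, so token order becomes visible to the model.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V):
    # Scaled dot-product attention: every position scores every other
    # position, then takes a weighted sum of the value vectors.
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ V

def multi_head_attention(X, num_heads, rng):
    # Several attention "heads" run in parallel on smaller slices of the
    # embedding and are concatenated; each head can focus on different relations.
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        # Random projections stand in for learned W_q, W_k, W_v parameters.
        W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d_model, d_head)) for _ in range(3))
        heads.append(self_attention(X @ W_q, X @ W_k, X @ W_v))
    return np.concatenate(heads, axis=-1)  # (seq_len, d_model)

rng = np.random.default_rng(0)
seq_len, d_model = 6, 16
tokens = rng.normal(size=(seq_len, d_model))        # stand-in token embeddings
X = tokens + positional_encoding(seq_len, d_model)  # inject order information
out = multi_head_attention(X, num_heads=4, rng=rng)
print(out.shape)  # (6, 16) - every position now integrates info from all others
```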
Keep Exploring!!!