"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

October 03, 2022

Backpropagation Notes - Forward Propagation, Backward Propagation, and Optimizers

Backpropagation - The error measured at the neurons of the output layer is propagated back to the preceding layers
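
A minimal numpy sketch of this idea, assuming a tiny 2-3-1 network with sigmoid activations and a squared-error loss (all sizes and values here are illustrative, not from the post): the forward pass computes the prediction, and the backward pass sends the output-layer error back to the preceding hidden layer.

import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-3-1 network; 4 samples with 2 features each (illustrative shapes).
x = rng.random((4, 2))
y = rng.integers(0, 2, (4, 1)).astype(float)

W1, b1 = rng.standard_normal((2, 3)), np.zeros(3)
W2, b2 = rng.standard_normal((3, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward propagation: inputs flow layer by layer to the output.
h = sigmoid(x @ W1 + b1)          # hidden activations
y_hat = sigmoid(h @ W2 + b2)      # output-layer prediction

# Backward propagation: the output-layer error is sent back to the preceding layer.
d_out = (y_hat - y) * y_hat * (1 - y_hat)     # error at the output neurons
d_hid = (d_out @ W2.T) * h * (1 - h)          # error propagated to the hidden layer

# Plain gradient-descent update using the propagated errors.
lr = 0.1
W2 -= lr * h.T @ d_out
b2 -= lr * d_out.sum(axis=0)
W1 -= lr * x.T @ d_hid
b1 -= lr * d_hid.sum(axis=0)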

Optimization algorithms are used to find the optimal parameters (weights and biases) of the NNs

  • SGD randomly selects a small batch of samples instead of the whole dataset to compute each parameter update
  • AdaGrad is a modified SGD that adapts the learning rate of each parameter using the accumulated squared gradients, improving convergence over standard SGD
  • RMSProp is an optimization algorithm that adapts the learning rate for each parameter using a moving average of squared gradients
  • Adam combines the advantages of RMSProp (works well in online and non-stationary settings) and AdaGrad (works well with sparse gradients); a sketch of these update rules follows this list
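
As a rough sketch of how these update rules differ, the numpy snippet below implements one step of each; the learning rates, decay factors, and epsilon values are typical defaults and are my assumptions, not values from the post.

import numpy as np

# One update step per rule, for a parameter vector w and its gradient g.

def sgd(w, g, lr=0.01):
    return w - lr * g

def adagrad(w, g, state, lr=0.01, eps=1e-8):
    state["G"] = state.get("G", 0.0) + g**2               # accumulate squared gradients
    return w - lr * g / (np.sqrt(state["G"]) + eps)

def rmsprop(w, g, state, lr=0.001, rho=0.9, eps=1e-8):
    state["E"] = rho * state.get("E", 0.0) + (1 - rho) * g**2   # moving average of g^2
    return w - lr * g / (np.sqrt(state["E"]) + eps)

def adam(w, g, state, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    state["t"] = state.get("t", 0) + 1
    state["m"] = b1 * state.get("m", 0.0) + (1 - b1) * g        # momentum-style term
    state["v"] = b2 * state.get("v", 0.0) + (1 - b2) * g**2     # RMSProp-style term
    m_hat = state["m"] / (1 - b1 ** state["t"])                 # bias correction
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

# Example: minimize f(w) = w^2 (gradient 2w) with Adam.
w, state = np.array([5.0]), {}
for _ in range(200):
    w = adam(w, 2 * w, state, lr=0.1)
print(w)   # approaches the minimum at 0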

RNN

  • With the BPTT learning method, the error at any time step t is propagated back through the inputs and weights of the previous time steps
  • Training RNNs is difficult because the recurrent structure creates a backward dependence over time, so the gradient at each step must flow through all earlier steps; see the sketch after this list
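
A minimal BPTT sketch for a vanilla RNN, assuming a squared-error loss on the final hidden state (the sizes, loss, and initialization are illustrative assumptions): the backward loop carries the error from the last time step through every earlier step, which is exactly the backward dependence over time noted above.

import numpy as np

rng = np.random.default_rng(0)

# Vanilla RNN: input size 3, hidden size 4, sequence length T = 5.
T, D, H = 5, 3, 4
xs = rng.random((T, D))
target = rng.random(H)                 # supervise only the final hidden state

Wx = rng.standard_normal((D, H)) * 0.1
Wh = rng.standard_normal((H, H)) * 0.1

# Forward pass through time: each hidden state depends on the previous one.
hs = [np.zeros(H)]
for t in range(T):
    hs.append(np.tanh(xs[t] @ Wx + hs[-1] @ Wh))

loss = 0.5 * np.sum((hs[-1] - target) ** 2)

# BPTT: the error at the last step flows back through every earlier step.
dWx, dWh = np.zeros_like(Wx), np.zeros_like(Wh)
dh = hs[-1] - target                   # error at the final time step
for t in reversed(range(T)):
    dz = dh * (1 - hs[t + 1] ** 2)     # back through tanh
    dWx += np.outer(xs[t], dz)
    dWh += np.outer(hs[t], dz)
    dh = dz @ Wh.T                     # carry the error to the previous time step

print(loss, np.linalg.norm(dWh))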

Hyperparameters

  • Number of hidden layers and number of units in each layer
  • Regularization techniques
  • Network weight initialization
  • Activation functions
  • Learning rate and momentum values
  • Number of epochs and batch size (mini-batch size)
  • Decay rate
  • Optimization algorithms
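
To show where each of these hyperparameters plugs in, here is a small sketch assuming TensorFlow/Keras (the post does not name a framework); the specific values are placeholders, not recommendations.

import numpy as np
import tensorflow as tf

# Hypothetical hyperparameter choices, just to show where each one is used.
hp = {
    "hidden_layers": 2,
    "units": 64,
    "activation": "relu",
    "initializer": "he_normal",      # network weight initialization
    "l2": 1e-4,                      # regularization
    "learning_rate": 1e-3,
    "epochs": 5,
    "batch_size": 32,
}

model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(20,)))
for _ in range(hp["hidden_layers"]):
    model.add(tf.keras.layers.Dense(
        hp["units"],
        activation=hp["activation"],
        kernel_initializer=hp["initializer"],
        kernel_regularizer=tf.keras.regularizers.l2(hp["l2"]),
    ))
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))

# Optimization algorithm and learning rate.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=hp["learning_rate"]),
    loss="binary_crossentropy",
)

# Dummy data just to make the sketch runnable.
x = np.random.rand(256, 20).astype("float32")
y = np.random.randint(0, 2, size=(256, 1)).astype("float32")
model.fit(x, y, epochs=hp["epochs"], batch_size=hp["batch_size"], verbose=0)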

Ref - Link

Keep Exploring!!!
