"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 26, 2021

Activation Functions - Observations

Same experiment, different activation functions, different observations

  • Epochs needed to reach similar accuracy - Tanh and ReLU perform significantly better than sigmoid, reaching the same accuracy in far fewer epochs. For huge datasets, it makes sense to pick the activation functions that learn faster.
  • Boundary types - although every activation function eventually solves the problem, you can spot whether the learned decision boundary looks circular or boxed based on the activation's behavior.
  • The training accuracy reached after 880 epochs with sigmoid matched what Tanh reached in 90 epochs and ReLU in 78. Every activation function will converge, but the number of epochs depends on the choice, so picking the right one saves compute and trains faster (a rough sketch of such a comparison follows this list).
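
A minimal sketch of how such a comparison could be set up - this is not the original experiment; the circular dataset, network size, learning rate, and 90% accuracy target are assumptions. It trains the same small network with each activation and reports how many epochs each one needs to hit the target training accuracy.

    # Compare epochs-to-target-accuracy for sigmoid, tanh, and ReLU
    # on a simple circular-boundary dataset (assumed setup, not the original experiment).
    import numpy as np
    from sklearn.datasets import make_circles
    from tensorflow import keras

    X, y = make_circles(n_samples=1000, noise=0.1, factor=0.4, random_state=42)

    def epochs_to_accuracy(activation, target=0.90, max_epochs=1000):
        """Train a small MLP and return the first epoch whose training accuracy reaches the target."""
        model = keras.Sequential([
            keras.layers.Input(shape=(2,)),
            keras.layers.Dense(8, activation=activation),
            keras.layers.Dense(8, activation=activation),
            keras.layers.Dense(1, activation="sigmoid"),  # output stays sigmoid for binary labels
        ])
        model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.1),
                      loss="binary_crossentropy", metrics=["accuracy"])
        history = model.fit(X, y, epochs=max_epochs, verbose=0)
        for epoch, acc in enumerate(history.history["accuracy"], start=1):
            if acc >= target:
                return epoch
        return None  # did not reach the target within max_epochs

    for act in ["sigmoid", "tanh", "relu"]:
        print(act, "->", epochs_to_accuracy(act), "epochs")

The exact epoch counts will vary with initialization and the dataset, but the ordering (sigmoid slowest, tanh/ReLU much faster) should match the observations above.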



Keep Exploring!!!
