What makes non-convex optimization hard?
- Potentially many local minima
- Saddle points
- Very flat regions
- Widely varying curvature
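
To make the first point concrete, here is a minimal sketch of plain gradient descent on a one-dimensional non-convex function. The function `f`, the step size, and the starting points are illustrative assumptions, not something from the original notes. The same algorithm ends up in different minima depending only on where it starts.

```python
# Illustrative non-convex function (an assumption chosen for this sketch).
# It has two local minima, near x = -1.30 and x = 1.13.
def f(x):
    return x**4 - 3 * x**2 + x


def grad_f(x):
    # Derivative of f.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain gradient descent from the starting point x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


x_left = gradient_descent(-2.0)   # settles in the left basin
x_right = gradient_descent(2.0)   # settles in the right basin

print("start -2.0 -> x = %.4f, f(x) = %.4f" % (x_left, f(x_left)))
print("start  2.0 -> x = %.4f, f(x) = %.4f" % (x_right, f(x_right)))
```

Both runs stop at a point where the gradient is essentially zero, but only the left one has found the lower of the two minima. Nothing in the gradient tells the right-hand run that a better solution exists elsewhere.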

Examples of non-convex problems
- Matrix completion, principal component analysis
- Low-rank models and tensor decomposition
- Maximum likelihood estimation with hidden variables (usually non-convex)
- The big one: deep neural networks

Methods commonly used for non-convex optimization
- Stochastic gradient descent
- Mini-batching
- SVRG (stochastic variance-reduced gradient)
- Momentum
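
As a rough illustration of how three of these pieces fit together, the sketch below runs mini-batch stochastic gradient descent with heavy-ball momentum on a tiny non-convex problem: fitting `y = a * b * x`, a two-parameter "linear network". The toy model, the synthetic data, the function name `minibatch_sgd_momentum`, and all hyperparameters are assumptions made for this example. SVRG is not shown.

```python
import random


def minibatch_sgd_momentum(data, a, b, lr=0.01, beta=0.9,
                           batch_size=8, epochs=50, seed=0):
    """Minimize the mean of (a*b*x - y)^2 over data, a list of (x, y) pairs.

    The loss is non-convex in (a, b): every point with a*b equal to the
    true slope is a minimum, and (0, 0) is a saddle point.
    """
    rng = random.Random(seed)
    data = list(data)
    va = vb = 0.0  # momentum (velocity) terms
    for _ in range(epochs):
        rng.shuffle(data)  # new random mini-batches each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            ga = gb = 0.0
            for x, y in batch:
                err = a * b * x - y
                ga += 2 * err * b * x  # d(loss)/da for this sample
                gb += 2 * err * a * x  # d(loss)/db for this sample
            ga /= len(batch)
            gb /= len(batch)
            # Heavy-ball momentum update.
            va = beta * va - lr * ga
            vb = beta * vb - lr * gb
            a += va
            b += vb
    return a, b


# Synthetic, noise-free data with true slope 2 (an assumption for the demo).
data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(64)]
data = [(x, 2.0 * x) for x in xs]

a_fit, b_fit = minibatch_sgd_momentum(data, a=0.5, b=0.5)
print("product a*b after training:", a_fit * b_fit)

# Starting exactly at the saddle point: the gradient is zero, so nothing moves.
a_stuck, b_stuck = minibatch_sgd_momentum(data, a=0.0, b=0.0)
print("from the saddle point:", (a_stuck, b_stuck))
```

The second run shows why saddle points appear on the list of difficulties: at `(0, 0)` every mini-batch gradient is exactly zero, so the iterates never leave. In practice a random initialization avoids starting exactly on such a point.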
Happy Mastering DL!!!