...


Speaker: Stanislav Fort (Stanford)

When we train a deep neural network on a dataset using gradient descent, we explore an extremely high-dimensional landscape of weight configurations in search of a rare solution to our task, guided only by local gradients. Given how complicated these landscapes can be, how do deep neural networks manage to converge to good, generalizable solutions at all, and can we say anything more concrete about the types of landscapes they navigate during training? In this talk, I will focus on recent geometric insights into the structure of neural network loss landscapes: I will discuss a phenomenological approach to modelling their large-scale structure [1,2] and its consequences for ensembling, calibration, uncertainty estimates, and Bayesian methods in general [3]. I will conclude with an outlook on several interesting open questions in understanding artificial deep networks.

  1. Fort, Stanislav, and Adam Scherlis. "The Goldilocks zone: Towards better understanding of neural network loss landscapes." Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, 2019. arXiv:1807.02581

  2. Fort, Stanislav, and Stanislaw Jastrzebski. "Large Scale Structure of Neural Network Loss Landscapes." Advances in Neural Information Processing Systems 32 (NeurIPS 2019). arXiv:1906.04724

  3. Fort, Stanislav, Huiyi Hu, and Balaji Lakshminarayanan. "Deep Ensembles: A Loss Landscape Perspective." arXiv:1912.02757
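As a concrete illustration of the deep-ensembles perspective discussed in [3], the minimal Python/PyTorch sketch below trains a few identically structured networks from independent random initializations (which tend to end up in different, functionally diverse modes of the loss landscape) and averages their softmax outputs. The toy data, architecture, and hyperparameters here are illustrative assumptions for the example, not details from the talk or the paper.

    # Minimal sketch of a deep ensemble: several identical networks, each trained
    # from its own random initialization, with predictions averaged at the end.
    # Toy data and model choices are assumptions made for this illustration.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    # Toy two-class data: two Gaussian blobs in 2D.
    n = 200
    x = torch.cat([torch.randn(n, 2) + 2.0, torch.randn(n, 2) - 2.0])
    y = torch.cat([torch.zeros(n, dtype=torch.long), torch.ones(n, dtype=torch.long)])

    def make_model():
        # Small MLP classifier; each call produces a fresh random initialization.
        return nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))

    def train(model, steps=200, lr=1e-2):
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            opt.step()
        return model

    # Train each ensemble member independently from a different initialization.
    ensemble = [train(make_model()) for _ in range(5)]

    # Predict by averaging the members' softmax probabilities.
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(x), dim=-1) for m in ensemble]).mean(dim=0)
    print("mean confidence on the correct class:",
          probs[torch.arange(len(y)), y].mean().item())

Averaging the members' predictive distributions, rather than picking a single model, is what typically improves calibration and uncertainty estimates in this setting.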

...