
[NeurIPS 2019 Highlight] Daniel Levy @ Stanford: Necessary and Sufficient Geometries for Gradient Methods


This episode is an interview with Daniel Levy from Stanford University, discussing highlights from his paper, "Necessary and Sufficient Geometries for Gradient Methods," accepted as an oral presentation at the NeurIPS 2019 conference.



Daniel is a second-year Ph.D. student in Computer Science at Stanford University, advised by John Duchi. His research interests are in optimization and machine learning. Previously, he was an M.S. student at Stanford University advised by Stefano Ermon, working on probabilistic models and reinforcement learning. He completed his undergraduate studies at Lycée Louis-Le-Grand and Ecole Polytechnique, from which he obtained a B.S. and an M.S. in 2014 and 2015, respectively. He interned at Google Brain in 2017, where he worked with Jascha Sohl-Dickstein and Matt Hoffman.



Interview with Robin.ly:




Robin.ly is a content platform dedicated to helping engineers and researchers develop leadership, entrepreneurship, and AI insights to scale their impact in the new tech era.


Subscribe to our newsletter to stay updated on more NeurIPS interviews and inspiring AI talks.


Paper At A Glance


We study the impact of the constraint set and gradient geometry on the convergence of online and stochastic methods for convex optimization, providing a characterization of the geometries for which stochastic gradient and adaptive gradient methods are (minimax) optimal. In particular, we show that when the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient methods are minimax optimal. We further provide a converse that shows that when the constraints are not quadratically convex—for example, any ℓp-ball for p < 2—the methods are far from optimal. Based on this, we can provide concrete recommendations for when one should use adaptive, mirror, or stochastic gradient methods.
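To make the idea of a diagonally pre-conditioned stochastic gradient method concrete, the short Python sketch below implements an AdaGrad-style update: each coordinate's step is scaled by the accumulated squared gradients seen in that coordinate. This is only an illustrative example of the family of methods the abstract refers to; the function name, toy objective, and parameters are assumptions for this post and are not taken from the paper.

```python
import numpy as np

def adagrad_diagonal(grad_fn, x0, lr=0.1, n_steps=100, eps=1e-8):
    """Illustrative sketch of diagonally pre-conditioned SGD (AdaGrad-style).

    grad_fn(x) returns a (possibly stochastic) gradient at x.
    Not the paper's algorithm or analysis; a minimal example only.
    """
    x = np.array(x0, dtype=float)
    g_sq_sum = np.zeros_like(x)                    # running sum of squared gradients
    for _ in range(n_steps):
        g = grad_fn(x)
        g_sq_sum += g ** 2                         # per-coordinate scale estimate
        x -= lr * g / (np.sqrt(g_sq_sum) + eps)    # coordinate-wise scaled step
    return x

# Toy usage: minimize a poorly scaled quadratic 0.5 * (x1^2 + 100 * x2^2).
if __name__ == "__main__":
    scales = np.array([1.0, 100.0])
    grad = lambda x: scales * x
    print(adagrad_diagonal(grad, x0=[5.0, 5.0], lr=1.0, n_steps=200))
```

The per-coordinate scaling is what distinguishes adaptive methods from plain stochastic gradient descent; the paper's results characterize the constraint-set geometries under which this kind of diagonal pre-conditioning is (or is not) minimax optimal.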


[Poster]

[Full Paper]



Daniel Levy at his NeurIPS Presentation