Lessons learnt when doing theory
- When reading an important theory paper, prioritize deep understanding over speed.
- Work through the details by hand, and use the simplest examples to understand the theory.
- Read a few classical papers thoroughly instead of reading many unfiltered papers superficially.
- I prefer an experimental (scientific) approach to understanding concepts over focusing too heavily on mathematical techniques; it uses my time more effectively. Rather than sinking too many hours into advanced machinery like DMFT, I aim to gain insight by running experiments and designing simple sandboxes to explore ideas (see the sketch after this list).
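
As an example of what I mean by a sandbox, here is a minimal sketch; the setup is a toy of my own, not taken from any of the papers below. It trains a two-layer ReLU network with NTK-style 1/sqrt(m) output scaling by full-batch gradient descent on a tiny regression problem, and tracks how far the hidden weights drift from initialization as the width m grows. If the lazy-training picture from the Du et al. and Chizat et al. papers holds, the relative drift should shrink with m. The task, hyperparameters, and the `train` helper are all assumptions chosen for illustration.

```python
import numpy as np

# Minimal "sandbox" experiment (toy setup, assumptions my own):
# f(x) = a^T relu(W x) / sqrt(m); train only W with full-batch GD,
# then measure how far W drifts from its initialization as m grows.
# In the lazy/NTK regime the relative drift should shrink with m.

rng = np.random.default_rng(0)
n, d = 20, 5                          # number of samples, input dimension
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0])                   # simple smooth target

def train(m, steps=1000, lr=0.5):
    W0 = rng.standard_normal((m, d))        # hidden weights at init
    a = rng.choice([-1.0, 1.0], size=m)     # fixed outer layer
    W = W0.copy()
    for _ in range(steps):
        H = np.maximum(W @ X.T, 0.0)        # (m, n) hidden activations
        r = a @ H / np.sqrt(m) - y          # residuals f(x_i) - y_i
        # gradient of 0.5 * mean_i r_i^2 with respect to W
        G = ((a[:, None] * (H > 0)) * r[None, :]) @ X / (np.sqrt(m) * n)
        W -= lr * G
    loss = 0.5 * np.mean((a @ np.maximum(W @ X.T, 0.0) / np.sqrt(m) - y) ** 2)
    drift = np.linalg.norm(W - W0) / np.linalg.norm(W0)
    return loss, drift

for m in [10, 100, 1000, 10000]:
    loss, drift = train(m)
    print(f"width={m:>6}  loss={loss:.5f}  relative drift of W={drift:.4f}")
```

Running this prints one line per width; watching the drift column fall while the loss stays small is exactly the kind of cheap, concrete check I want to do before (or instead of) working through the full proofs.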
Several classical papers I plan to go through completely in the spring semester:
- Gradient Descent Provably Optimizes Over-parameterized Neural Networks. Simon S. Du, Xiyu Zhai, Barnabas Poczos, Aarti Singh.
- On Lazy Training in Differentiable Programming. Lenaic Chizat, Edouard Oyallon, Francis Bach.
- Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit. Song Mei, Theodor Misiakiewicz, Andrea Montanari.
- Parallelizing Stochastic Gradient Descent for Least Squares Regression: mini-batching, averaging, and model misspecification. Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford.
