Dynamics of Neural Networks

These papers are concerned with the optimization trajectory of neural networks. We develop new methods for optimizing complex neural networks that go beyond naive stochastic gradient descent. We also exploit the optimization trajectory to interpret the neural network's predictions.

Small-data and Efficient Learning

These papers are concerned with learning from a small amount of training data. While excellent performance has been achieved by scaling up training data and network sizes, training data for many tasks remains difficult to acquire. It is therefore important to explore how to learn in the small-data / few-shot / zero-shot regimes, especially with the help of large networks.