Despite the tremendous success of deep learning, neural network-based models are highly susceptible to small, imperceptible adversarial perturbations of data at test time. This vulnerability to adversarial examples imposes severe limitations on deploying neural network-based systems, especially in critical, high-stakes applications such as autonomous driving, where safe and reliable operation is paramount. In this talk, we seek to understand why trained neural networks classify clean data with high accuracy yet remain extraordinarily fragile to strategically induced perturbations. Further, we give a first-of-its-kind computational guarantee for adversarial training, which formulates robust learning as a min-max optimization problem and has emerged as a principled approach to training models that are robust to adversarial examples.
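For concreteness, the min-max objective referenced above is commonly stated as follows in the adversarial robustness literature (a standard formulation; the symbols theta, ell, f, epsilon, and D are notation introduced here, not taken from the talk):

\[
\min_{\theta} \; \mathbb{E}_{(x,y)\sim \mathcal{D}} \left[ \max_{\|\delta\| \le \epsilon} \; \ell\big(f_\theta(x+\delta),\, y\big) \right],
\]

where the inner maximization finds the worst-case perturbation delta within an epsilon-ball around each input x, and the outer minimization trains the model parameters theta against those worst-case examples.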