Optimization helps us to master the training. It tells us why those simple method (e.g., SGD) can find good parameters.
Statistic characterizes the generalization property of the learning scheme.
Geometry and Algebra provide a fundamental tool for understanding a machine learning process.