I think you missed the point. Proving something is optimal is a much higher bar than just knowing how the hell the algorithm gets from inputs to outputs in a reasonable way. Even concurrent systems, and algorithm bounds under input distributions, have well-established ways to evaluate them. There is literally no theoretical framework for how a neural network churns out answers from inputs, other than the most fundamental "matrix algebra". Big O, Theta, and Omega bounds, and asymptotic analysis in general, are all sound theoretical tools for evaluating algorithms. We don't have anything even that good for neural networks.