A Crash Course on Wide Neural Nets and the Neural Tangent Kernel
How do neural networks behave when the hidden layers are very large? A new research direction in machine learning studies network behavior in the limit where the width of each hidden layer goes to infinity. Surprisingly, in this limit the network's behavior simplifies dramatically, and we can characterize what the network is doing theoretically. Why does this happen, and what can we say?
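As a small numerical sketch of this simplification (my own illustration, not from the original text): for a one-hidden-layer ReLU network with i.i.d. standard normal weights and the readout scaled by 1/√width, the infinite-width (Gaussian process) analysis predicts that the output at initialization has variance exactly ‖x‖²/2. A Monte Carlo check over many random finite-width networks already matches this prediction closely:

```python
import numpy as np

# Illustrative sketch: f(x) = (1/sqrt(n)) * v . relu(W x) with i.i.d.
# N(0,1) weights. The infinite-width theory predicts
# Var[f(x)] = ||x||^2 / 2 at initialization; we check it by sampling
# many random finite-width networks.
rng = np.random.default_rng(0)

def init_net(n, d):
    """Random width-n net: hidden weights W (n x d), readout v (n,)."""
    return rng.standard_normal((n, d)), rng.standard_normal(n)

def forward(W, v, x):
    """Width-normalized readout: divide the output sum by sqrt(width)."""
    n = v.shape[0]
    return v @ np.maximum(W @ x, 0.0) / np.sqrt(n)

d, n, trials = 3, 1000, 2000
x = np.array([1.0, 0.0, 0.0])  # unit-norm input, so theory says Var = 1/2
outputs = np.array([forward(*init_net(n, d), x) for _ in range(trials)])

print(f"empirical Var[f(x)] = {outputs.var():.3f}  (theory: 0.500)")
```

The specific scaling (dividing the readout by √n) is the convention under which the infinite-width limit is well behaved; with it, the empirical variance sits near 0.5 regardless of width.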
The fundamental reason wide networks behave more simply is that they stay close to their linear approximation: