Author: Arnaud Autef

<aside> 💡 In this reading group session, we present the 2021 Deep Learning (DL) paper "Regularization is all you Need: Simple Neural Nets can Excel on Tabular Data".

In this empirical paper, the authors argue that simple neural network architectures can reach state-of-the-art performance for supervised learning on tabular data.

This goes against the common wisdom that Gradient Boosted Decision Trees (GBDT) are superior to DL approaches on such data.

The main insight of the paper is that regularization techniques are key to unlocking higher performance with neural networks: as long as a broad array of regularization approaches is considered during hyperparameter optimization, neural networks should prevail (a sketch follows this note).

In this session, we cover the topics listed in the contents below.

</aside>
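To make this idea concrete, here is a minimal sketch (in PyTorch, not the authors' code) of what "searching over regularization approaches" can look like: the regularizers themselves (dropout rate, weight decay, batch normalization) become hyperparameters, sampled and tuned alongside everything else. The search space and helper names below are illustrative assumptions, not the paper's exact cocktail.

```python
# Illustrative sketch: regularizers treated as hyperparameters of an MLP.
# The search space below is an assumption, not the paper's exact one.
import random
import torch
import torch.nn as nn

def sample_regularization_config():
    """Randomly sample a 'regularization cocktail' (illustrative search space)."""
    return {
        "dropout": random.choice([0.0, 0.1, 0.3, 0.5]),
        "weight_decay": random.choice([0.0, 1e-5, 1e-4, 1e-3]),
        "batch_norm": random.choice([True, False]),
    }

def build_mlp(in_dim, hidden_dim, cfg):
    """Plain MLP regressor whose regularizers are set by the sampled config."""
    layers = [nn.Linear(in_dim, hidden_dim)]
    if cfg["batch_norm"]:
        layers.append(nn.BatchNorm1d(hidden_dim))
    layers += [nn.ReLU(), nn.Dropout(cfg["dropout"]), nn.Linear(hidden_dim, 1)]
    return nn.Sequential(*layers)

# One draw of the search; in practice many configs are sampled and evaluated.
cfg = sample_regularization_config()
model = build_mlp(in_dim=10, hidden_dim=64, cfg=cfg)
optimizer = torch.optim.AdamW(
    model.parameters(), lr=1e-3, weight_decay=cfg["weight_decay"]
)
```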

Contents

1 - What is Deep Learning?

https://twitter.com/ylecun/status/1209497021398343680?lang=fr

Unfortunately, as the tweet above illustrates, DL is not precisely defined, and we are not going to try to define it here!

Here, we restrict ourselves to Multi-Layer Perceptrons (MLPs).

What is an MLP?

Setup

We consider a supervised regression setting with dataset $\mathcal{D} = (x_i, y_i)_{1 \le i \le n}$ where