Author: Arnaud Autef
<aside> 💡 Today, we review SimCLR, a high-performing self-supervised learning algorithm for image data.
SimCLR leverages contrastive learning to learn useful features from vast amounts of unlabeled images.
The SimCLR features allow a mere linear classifier to reach 76.5% top-1 and 93.2% top-5 accuracy on ImageNet.
</aside>
Graph taken from the SimCLR paper. It shows the top-1 accuracy on ImageNet of a linear classifier trained on frozen SimCLR features.
At a high level, contrastive learning refers to tasks with the following structure:
The dataset $\mathcal{D} = \{(q, x^+, \{x^-_i\}_i)\}$ is a collection of tuples, each containing a query $q$, a positive example $x^+$, and negative examples $x^-_i$.
The machine learning model consists of an encoder that maps individual data points to a high-dimensional representation
$$ x \in \mathbb{R}^p \mapsto h = f_\theta(x) \in \mathbb{R}^d $$
A scoring function $s$ assigns a similarity to pairs of vectors in the representation space $\mathbb{R}^d$
$$ (h_1, h_2) \in \mathbb{R}^d \times \mathbb{R}^d \mapsto s(h_1, h_2) \in \mathbb{R} $$
The task is to learn a model that assigns a larger similarity score to a query and its positive example than to the query and any of its negative examples:
$$ \tag{1} \forall (q, x^+, \{x^-_i\}_i) \in \mathcal{D},~\forall i, \quad s(f_\theta(q), f_\theta(x^+)) > s(f_\theta(q), f_\theta(x^-_i)) $$
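To make this concrete, here is a minimal sketch of the setup, using a toy random linear map as a stand-in for the encoder $f_\theta$ (SimCLR uses a deep network) and cosine similarity as an illustrative choice of scoring function $s$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the encoder f_theta: an untrained random linear map
# from R^p (p=8) to R^d (d=4). In practice this is a deep network.
W = rng.normal(size=(4, 8))

def f_theta(x):
    return W @ x

def s(h1, h2):
    # Cosine similarity: an illustrative scoring function on R^d
    return float(h1 @ h2 / (np.linalg.norm(h1) * np.linalg.norm(h2)))

q = rng.normal(size=8)                 # query
x_pos = q + 0.1 * rng.normal(size=8)   # positive: small perturbation of q
x_neg = rng.normal(size=8)             # negative: unrelated sample

s_pos = s(f_theta(q), f_theta(x_pos))
s_neg = s(f_theta(q), f_theta(x_neg))
# Here s_pos will typically be close to 1 (the positive stays close to the
# query even through an untrained linear encoder), while s_neg is arbitrary.
```

Learning then amounts to shaping $f_\theta$ so that the inequality (1) holds across the whole dataset, not just for easy pairs like this toy one.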
<aside> 💡 From the above, we need a couple more ingredients to get to a practical task and algorithm:
1 - A smart definition of queries, positive examples, and negative examples from a dataset of unlabeled data points, one that makes sense and yields good representations.
2 - A loss function that favors models satisfying the similarity constraints (1).
→ Let's see what those are for SimCLR!
</aside>
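For ingredient 2, a common choice (and the basis of SimCLR's NT-Xent loss) is an InfoNCE-style loss: the cross-entropy of classifying the positive example among the negatives, given the similarity scores. A minimal sketch, where the temperature value of 0.5 is an illustrative assumption:

```python
import numpy as np

def info_nce_loss(s_pos, s_negs, temperature=0.5):
    """Cross-entropy of picking the positive among the negatives.

    s_pos:  similarity score s(f(q), f(x+)).
    s_negs: similarity scores s(f(q), f(x-_i)) for the negatives.
    """
    logits = np.concatenate(([s_pos], np.asarray(s_negs, float))) / temperature
    log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
    return -log_probs[0]  # small when s_pos dominates every s_negs[i]

# The loss decreases as the positive's score pulls away from the negatives':
low = info_nce_loss(0.9, [0.1, -0.2, 0.0])
high = info_nce_loss(0.2, [0.1, -0.2, 0.0])
```

Minimizing this loss pushes $s(f_\theta(q), f_\theta(x^+))$ above every $s(f_\theta(q), f_\theta(x^-_i))$, which is exactly the constraint (1).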
SimCLR is an instance of Instance-level discrimination: