4/4 Introduction to Geometric Deep Learning
Introduction to Group Actions
To discuss Geometric Deep Learning, we first need to lay down the basic notions of group actions.
Groups
Definition 1. A group is a pair \((G, \cdot)\) of
- a set \(G\), and
- a binary operation \(G\times G\xrightarrow{(a, b)\mapsto ab}G\),
such that the following conditions are satisfied:
- Associativity: For all \(a,b,c\in G\), we have \((ab)c=a(bc)\).
- Identity element: There exists \(1\in G\) such that for all \(a\in G\), we have \(1a=a1=a\). Note that this in particular implies that the identity element is unique.
- Inverses: For all \(a\in G\), there exists \(a^{-1}\in G\) such that \(aa^{-1}=a^{-1}a=1\). Note that this in particular implies that the inverse of \(a\in G\) is unique.
Example 2. Let \(n\) be a positive integer. Then the permutation group \(S_n\) is the group of bijections \([n]\xrightarrow\pi[n]\) of an \(n\)-element set \([n]=\{0,\dotsc,n-1\}\). The group structure is given by composition.
Example 3. Addition of \(n\)-dimensional vectors equips \(\mathbf R^n\) with a group structure. We denote this by \((\mathbf R^n, +)\).
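To make Example 2 concrete, here is a minimal sketch that represents an element of \(S_n\) as a Python tuple `a` with `a[i]` the image of `i`, and checks the three group axioms exhaustively for \(n=3\). The helper names `compose` and `inverse` are illustrative choices, not part of the notes; the product \(ab\) is assumed to mean "apply \(b\) first, then \(a\)".

```python
from itertools import permutations

def compose(a, b):
    """Product ab in S_n: the composition i -> a[b[i]] (apply b first)."""
    return tuple(a[i] for i in b)

def inverse(a):
    """Inverse permutation: the tuple inv with inv[a[i]] == i."""
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)

n = 3
S_n = list(permutations(range(n)))   # all bijections [n] -> [n], as tuples
identity = tuple(range(n))

# Associativity: (ab)c == a(bc) for all triples.
assert all(compose(compose(a, b), c) == compose(a, compose(b, c))
           for a in S_n for b in S_n for c in S_n)

# Identity element: 1a == a == a1.
assert all(compose(identity, a) == a == compose(a, identity) for a in S_n)

# Inverses: a a^{-1} == 1 == a^{-1} a.
assert all(compose(a, inverse(a)) == identity == compose(inverse(a), a)
           for a in S_n)
```

The exhaustive check is only feasible for small \(n\), since \(S_n\) has \(n!\) elements.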
Group Actions
Definition 4. Let \(G\) be a group and let \(X\) be a set. Then a right action of \(G\) on \(X\) is a map \(X\times G\xrightarrow{(x, a)\mapsto xa}X\) such that the following conditions are satisfied:
- Compatibility: For all \(a,b\in G\) and \(x\in X\), we have \((xa)b=x(ab)\).
- Identity: For all \(x\in X\), we have \(x1=x\).
In this case, we also call \(X\) a right \(G\)-set.
Example 5. For any group \(G\) and any set \(X\), the trivial action of \(G\) on \(X\) is the action \(X\times G\xrightarrow{(x, a)\mapsto x}X\).
Example 6. The right translation action of the group \(G\) on itself as a set is the action \(G\times G\to G\) given by multiplication.
Example 7. Let \(m,n\) be positive integers. Then the permutation group \(S_m\) acts on the set \(\mathbf R^{m\times n}\) of \(m\times n\) matrices on the right by permuting the rows.
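The row permutation action of Example 7 can be sketched with NumPy fancy indexing. This is one possible convention, chosen here as an assumption: a permutation is an integer array `a`, the product \(ab\) is the composition \(i\mapsto a(b(i))\), and \(xa\) is the matrix whose \(i\)-th row is row \(a(i)\) of \(x\). The function names `act` and `compose` are hypothetical.

```python
import numpy as np

def act(x, a):
    """Right action x·a: row i of the result is row a[i] of x."""
    return x[np.asarray(a)]

def compose(a, b):
    """Product ab in S_m: the composition i -> a[b[i]]."""
    return np.asarray(a)[np.asarray(b)]

rng = np.random.default_rng(0)
m, n = 4, 3
x = rng.normal(size=(m, n))
a = rng.permutation(m)
b = rng.permutation(m)

# Compatibility with the group product: (xa)b == x(ab).
assert np.array_equal(act(act(x, a), b), act(x, compose(a, b)))

# The identity permutation leaves every matrix unchanged.
assert np.array_equal(act(x, np.arange(m)), x)
```

With the other natural convention (row \(i\) of \(x\) is moved to position \(a(i)\)), the same computation gives \((xa)b=x(ba)\), that is, a left action rather than a right one. Which convention makes the action a right action depends on how the product in \(S_m\) is read.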
Equivariant and Invariant Maps, Quotients
Definition 8. Let \(G\) be a group, and let \(X\xrightarrow fY\) be a map between right \(G\)-sets. Then we say that \(f\) is \(G\)-equivariant, if for each \((x,a)\in X\times G\), we have \(f(xa)=f(x)a\). If \(Y\) is equipped with the trivial \(G\)-action, then we say that \(f\) is \(G\)-invariant.
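As a numerical illustration of Definition 8, the following sketch checks equivariance for the row permutation action of \(S_m\) on matrices. It assumes the convention that \(xa\) is `x[a]` in NumPy. The maps `f` and `g` and the weight matrix `W` are hypothetical examples: `f` applies the same linear map to every row and is equivariant, while `g` depends on the row order and is not.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 5, 3, 2
x = rng.normal(size=(m, n))
W = rng.normal(size=(n, k))      # hypothetical weights, shared by all rows
a = rng.permutation(m)

def f(x):
    """Row-wise linear map R^{m x n} -> R^{m x k}."""
    return x @ W

def g(x):
    """Cumulative sum down the rows; the result depends on the row order."""
    return np.cumsum(x, axis=0)

# Equivariance of f: f(xa) == f(x)a, up to floating-point rounding.
assert np.allclose(f(x[a]), f(x)[a])

# g fails the condition already for the swap of the first two rows.
swap = np.array([1, 0, 2, 3, 4])
assert not np.allclose(g(x[swap]), g(x)[swap])
```

A passing check on random inputs is evidence, not a proof; for `f` the proof is that permuting rows and multiplying each row by `W` commute.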
Theorem 9. Let \(G\) be a group and \(X\) a right \(G\)-set. Then there exists a set \(X/G\) and an invariant map \(X\xrightarrow qX/G\), such that for all sets \(Y\) and invariant maps \(X\xrightarrow{f}Y\), there exists a unique map of sets \(X/G\xrightarrow gY\) such that we have \(f=g\circ q\).
Definition 10. We call the set \(X/G\) in Theorem 9 the quotient and \(q\) the quotient map or universal invariant map.
Remark 11. The property stated in Theorem 9 is a so-called universal property. It implies uniqueness of quotients up to unique isomorphism: if \(X\xrightarrow hZ\) is another quotient, then there exists a unique isomorphism \(X/G\xrightarrow\phi Z\) such that \(h=\phi\circ q\).
Example 12. Let \(\mathbf R^{m\times n}\xrightarrow f\mathbf R^n\) be the map that takes the mean or the maximum of each column. Then it is invariant with respect to the action of \(S_m\) on \(\mathbf R^{m\times n}\) by permuting the rows. One can construct the quotient \(\mathbf R^{m\times n}/S_m\) as the set of \(m\)-element multisets of \(n\)-dimensional vectors. Multisets rather than sets are needed because a matrix may contain the same row more than once.
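The invariance claimed in Example 12, and the factorization \(f=g\circ q\) from Theorem 9, can be checked numerically. The sketch below assumes one particular model of the quotient: each orbit is represented by the matrix with its rows sorted lexicographically, so the hypothetical function `q` plays the role of the quotient map. Other models of the quotient are equally valid.

```python
import numpy as np

def q(x):
    """Canonical representative of the orbit of x: rows sorted lexicographically."""
    # np.lexsort treats the LAST key as the primary one, so the columns
    # are passed in reverse order to sort by column 0 first.
    order = np.lexsort(x.T[::-1])
    return x[order]

def col_mean(x):
    return x.mean(axis=0)

def col_max(x):
    return x.max(axis=0)

rng = np.random.default_rng(0)
m, n = 5, 3
x = rng.normal(size=(m, n))
a = rng.permutation(m)

# Invariance: permuting the rows changes neither the column means nor the maxima.
assert np.allclose(col_mean(x[a]), col_mean(x))
assert np.array_equal(col_max(x[a]), col_max(x))

# q is itself invariant: all matrices in one orbit have the same representative.
assert np.array_equal(q(x[a]), q(x))

# Factorization f = g∘q, with g the column mean evaluated on the representative.
assert np.allclose(col_mean(x), col_mean(q(x)))

# Repeated rows are kept with their multiplicity.
y = np.array([[1.0, 2.0], [1.0, 2.0], [0.0, 5.0]])
assert np.array_equal(q(y), np.array([[0.0, 5.0], [1.0, 2.0], [1.0, 2.0]]))
```

The mean is compared with `np.allclose` because floating-point summation depends on the order of the rows; the maximum is exactly order-independent.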
The Objective of Geometric Deep Learning
Recall that we build models
\[\mathscr X\xrightarrow f\mathscr Y,\]
where \(\mathscr X\) is the feature space, and \(\mathscr Y\) is the target space or the space of distributions of targets. Geometric Deep Learning is concerned with how to make the model respect symmetries. That is, we suppose that a group \(G\) acts on \(\mathscr X\), and we want \(f\) to be invariant, or equivariant if \(G\) also acts on \(\mathscr Y\). For example:
- As introduced in Notebook 0228, a document is a sequence of terms. Previously, we hardcoded how to get feature vectors for documents by either
  - conducting LSA or
  - aggregating pretrained word vectors.

  We can make our text classification models more sophisticated by viewing documents as sequences of terms. As a first step, we disregard term order: the permutation group acts on a document by reordering its terms, and we want the model to be invariant under this action. This situation will be our focus next Wednesday.
- When classifying images, translating an image usually does not change its class. Therefore, the feature space carries an action of the group of translations. Shortly, we'll introduce Convolutional Neural Networks: models that respect translations.
Respecting symmetries brings huge benefits:
- Parameter sharing: If we can systematically exploit that multiple inputs can be viewed as being the same, then we need far fewer parameters in our models.
- Pseudo-augmented training: If the model is designed in a way that respects symmetries, then processing one training sample amounts to processing every sample that is equivalent to it under the symmetry.
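The two benefits above can be illustrated with a small sketch. It assumes one common way of building a permutation-invariant model (apply the same weights to every row, then sum over rows); this is an illustrative pattern, not necessarily the architecture used later in the course. All names (`W_phi`, `w_out`, `W_dense`, `invariant_model`, `dense_model`) are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 4, 3

# Shared-weight model: the same W_phi is applied to every row.
W_phi = rng.normal(size=(n, k))
w_out = rng.normal(size=k)

def invariant_model(x):
    h = np.tanh(x @ W_phi)           # per-row features, shape (m, k)
    return h.sum(axis=0) @ w_out     # summing over rows discards the row order

# Dense baseline on the flattened input: one weight per (row, column, unit).
W_dense = rng.normal(size=(m * n, k))

def dense_model(x):
    return np.tanh(x.reshape(-1) @ W_dense) @ w_out

x = rng.normal(size=(m, n))
a = rng.permutation(m)
swap = np.array([1, 0, 2, 3, 4, 5])

# The shared-weight model gives the same output for every row ordering,
# so one training sample stands in for its whole orbit.
assert np.isclose(invariant_model(x[a]), invariant_model(x))

# The dense model treats reordered inputs as different samples.
assert not np.isclose(dense_model(x[swap]), dense_model(x))

# First-layer parameter counts: n*k versus m*n*k.
print(W_phi.size, W_dense.size)      # 12 72
```

The first-layer parameter count of the shared-weight model does not depend on the number of rows \(m\), whereas the dense baseline grows linearly in \(m\).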
A work-in-progress book and additional resources on Geometric Deep Learning are available at https://geometricdeeplearning.com/