Pedro Domingos ICLR 2014 - Invited Talk - Symmetry-Based Learning

**Concept Graph & Resume using Claude 3 Opus | Chat GPT4 | Gemini Adv | Llama 3:**

```mermaid
graph LR
classDef symmetry fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef learning fill:#d4f9d4, font-weight:bold, font-size:14px;
classDef applications fill:#d4d4f9, font-weight:bold, font-size:14px;
classDef benefits fill:#f9f9d4, font-weight:bold, font-size:14px;
classDef next fill:#f9d4f9, font-weight:bold, font-size:14px;
A[Pedro Domingos ICLR 2014] --> B[Symmetry group theory: foundation for machine learning. 1]
A --> E[Function symmetry: input change, same output. 4]
A --> F[Symmetry underused in ML, powerful in math/physics. 5]
A --> I[Benefits: sample efficiency, generalization, deep learning, etc. 8]
A --> U[Symmetry foundational alongside probability, logic, optimization. 20]
A --> Z[Symmetry group theory powerful for representation learning. 25]
B --> C[Object symmetry: transformation mapping to itself. 2]
C --> D[Continuous symmetries: Lie groups, e.g. Euclidean space. 3]
E --> G[Classifier symmetry: input change, same class. 6]
G --> H[Learn target function's symmetries to trivialize it. 7]
F --> J[ConvNets: limited case of feature map translation. 9]
J --> K[Deep Affine Nets: generalize ConvNets with affine group. 10]
K --> L[Deep Affine Net layer: apply affine transforms, pool. 11]
K --> M[Efficient interpolation, nearest neighbor search for feasibility. 12]
K --> N[Deep Affine Nets outperform ConvNets on rotated MNIST. 13]
K --> O[Next steps: more layers, richer distortions, real images. 14]
F --> P[Semantic parsing: sentences to logical formulas. 15]
P --> Q[Sentence orbits under transformations correspond to meanings. 16]
Q --> R[Semantic parser finds most probable sentence orbit. 17]
P --> S[Learning discovers symmetries from sentence pairs. 18]
P --> T[Logical inference rules: symmetries of knowledge bases. 19]
P --> V[Machine translation could help with paraphrase data. 21]
U --> W[Symmetry algebra general, relates to network depth. 22]
U --> X[Strong symmetries make data points equivalent, abstract. 23]
U --> Y[Fuzzy invariance preserves discrimination, via probabilistic orbits. 24]
class B,C,D,E,F,G,H,U,W,X,Y,Z symmetry;
class I,V learning;
class J,K,L,M,N,O,P,Q,R,S,T applications;
class I benefits;
class O,V next;
```

**Resume:**

**1.-**The talk presents work in progress on using symmetry group theory as a foundation for machine learning, especially representation learning.

**2.-**In geometry, an object's symmetry is a transformation that maps the object onto itself. Symmetries compose with one another, and the set of an object's symmetries satisfies the group axioms (closure, associativity, identity, inverses).
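
As a concrete illustration (not from the talk), the rotations of the plane are such a group: they map any circle centered at the origin to itself, and the axioms can be checked numerically. Rotations are also a continuous group, anticipating point 3:

```python
import numpy as np

def rotation(theta):
    """Rotation of the plane by angle theta, a symmetry of any circle
    centered at the origin."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

a, b, c = rotation(0.3), rotation(1.1), rotation(-0.7)
e = rotation(0.0)                                # the identity element

assert np.allclose(a @ b, rotation(0.3 + 1.1))   # closure: still a rotation
assert np.allclose((a @ b) @ c, a @ (b @ c))     # associativity
assert np.allclose(a @ e, a)                     # identity
assert np.allclose(a @ rotation(-0.3), e)        # inverses
```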

**3.-**Continuous symmetry groups are called Lie groups. An example is the group of symmetries of Euclidean space, which preserve distances between points.

**4.-**A symmetry of a function is an input change that doesn't change the output. This is the key notion for learning representations.
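
A minimal sketch of this definition: the squared Euclidean norm is unchanged by rotating its input, so rotation is one of its symmetries. The names `f` and `g` are illustrative choices, not from the talk:

```python
import numpy as np

def f(v):
    return float(v @ v)        # squared Euclidean norm of a 2-D point

def g(v):
    """A candidate symmetry of f: rotation by a fixed angle."""
    theta = 0.9
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]]) @ v

v = np.array([2.0, -1.0])
assert np.isclose(f(g(v)), f(v))   # the input changed, the output did not
```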

**5.-**Symmetry is powerful in mathematics, physics, search and optimization, model tracking in vision, and lifted probabilistic inference, but it has been underused so far in machine learning.

**6.-**A symmetry of a classifier is a change of input representation that preserves the class. The important variations in the input are the learning targets; the unimportant ones are the classifier's symmetries.

**7.-**The goal is to learn a target function's symmetries and compose them until the function becomes trivial to learn. This enables learning good representations with less data.

**8.-**Benefits: reduces sample complexity, generalizes algorithms, leads to formal results, enables deep learning via composition, applies across learning paradigms.

**9.-**ConvNets are a limited special case that translates a feature map over an image. Replacing translation with arbitrary symmetry groups generalizes them significantly.
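
The limited symmetry ConvNets exploit is translation equivariance: shifting the input shifts the feature map. A small NumPy check of this property (an illustrative sketch, not the talk's code; `scipy.signal.correlate2d` stands in for one ConvNet channel):

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
kernel = rng.standard_normal((3, 3))

def feature_map(img):
    # Cross-correlation with a 3x3 filter, as in one ConvNet channel.
    return correlate2d(img, kernel, mode="same")

shifted = np.roll(image, 2, axis=1)                 # translate the input
a = np.roll(feature_map(image), 2, axis=1)          # filter, then shift
b = feature_map(shifted)                            # shift, then filter
# Equal away from the border, where circular shift and zero padding differ.
assert np.allclose(a[4:-4, 4:-4], b[4:-4, 4:-4])
print("translation equivariance holds in the interior")
```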

**10.-**The affine group of image transformations is a natural next step after translation, adding rotations, reflections, scalings, and shears. This yields Deep Affine Networks.

**11.-**One Deep Affine Network layer applies every affine transform to the image, computes features on each, and pools over local affine neighborhoods.
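
A rough sketch of such a layer under simplifying assumptions of mine: the "feature" is a single linear template response, pooling is a max over a small discrete sample of rotations and scalings rather than the full group, and `scipy.ndimage.affine_transform` does the resampling:

```python
import numpy as np
from scipy.ndimage import affine_transform

def rot_scale(theta, s):
    """Linear part of an affine map: rotate by theta, scale by s."""
    c, si = np.cos(theta), np.sin(theta)
    return s * np.array([[c, -si], [si, c]])

def deep_affine_layer(image, template, transforms):
    """Apply each sampled affine transform, compute a feature on the
    result, and pool over the sampled affine neighborhood."""
    center = np.array(image.shape) / 2.0
    feats = []
    for A in transforms:
        offset = center - A @ center          # transform about the center
        warped = affine_transform(image, A, offset=offset, order=1)
        feats.append(float(np.sum(warped * template)))  # linear feature
    return max(feats)                         # max-pooling over transforms

rng = np.random.default_rng(0)
image, template = rng.random((28, 28)), rng.random((28, 28))
transforms = [rot_scale(t, s)
              for t in np.linspace(0.0, np.pi, 8)
              for s in (0.9, 1.0, 1.1)]
print(deep_affine_layer(image, template, transforms))
```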

**12.-**Computational feasibility requires computing features only at control points, interpolating between them, and using nearest-neighbor search; ball trees make this efficient.
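
A minimal sketch of that trick, with hypothetical choices throughout: features are precomputed at random control points in a two-parameter transform space, and a query is answered by inverse-distance interpolation over the nearest control points, found with scikit-learn's `BallTree` (the talk's own data structures may differ):

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.default_rng(0)
# Control points: (angle, log-scale) parameters where features were computed.
control = rng.uniform([-np.pi, -0.3], [np.pi, 0.3], size=(500, 2))
features = rng.standard_normal(500)   # stand-in for the precomputed features

tree = BallTree(control)              # ball tree over the control points
query = np.array([[0.4, 0.1]])        # a transform we did not precompute
dist, idx = tree.query(query, k=3)    # nearest control points

# Inverse-distance interpolation between the nearest precomputed features.
w = 1.0 / (dist[0] + 1e-9)
print(np.sum(w * features[idx[0]]) / np.sum(w))
```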

**13.-**On rotated MNIST digits, one-layer Deep Affine Nets greatly outperform one-layer ConvNets with less training data, because they handle rotations directly rather than approximately.

**14.-**Next steps: more layers, richer distortions, real-world images; develop sum-product nets; combine with capsule theory for part-whole composition.

**15.-**Semantic parsing maps sentences to logical formulas. Symmetries (synonyms, paraphrases, active/passive voice, etc.) preserve a sentence's meaning.

**16.-**Sentence orbits under syntactic transformations correspond one-to-one with meanings, which avoids representing meanings explicitly and allows easier learning from sentence pairs.
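
A toy sketch of the orbit idea (the rewrites below are hypothetical stand-ins for learned symmetries): start from one sentence and close under a set of meaning-preserving transformations; two sentences mean the same thing exactly when they land in the same orbit:

```python
def orbit(sentence, rewrites):
    """Close {sentence} under the given symmetry transformations."""
    seen, frontier = {sentence}, [sentence]
    while frontier:
        s = frontier.pop()
        for rewrite in rewrites:
            t = rewrite(s)
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

rewrites = [
    lambda s: s.replace("bought", "purchased"),       # synonym substitution
    lambda s: s.replace("purchased", "bought"),       # and its inverse
    lambda s: ("the book was bought by Ann"           # one active-to-passive
               if s == "Ann bought the book" else s), # rule, for this toy
]
print(orbit("Ann bought the book", rewrites))         # 4 equivalent sentences
```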

**17.-**A semantic parser finds the most probable sentence orbit (meaning). New meanings create new orbits. This can be done efficiently by exploiting the compositional structure of orbits.
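
A toy sketch of parsing as orbit identification, with a deliberately crude scoring model of my own (uniform probability over an orbit's sentences) standing in for the talk's probabilistic model:

```python
def parse(sentence, orbits):
    """Return the meaning whose orbit best explains the sentence, under a
    toy uniform model: probability 1/|orbit| if the sentence is in it."""
    def score(meaning):
        orb = orbits[meaning]
        return 1.0 / len(orb) if sentence in orb else 0.0
    return max(orbits, key=score)

orbits = {
    "buy(Ann, book)":  {"Ann bought the book", "the book was bought by Ann",
                        "Ann purchased the book"},
    "read(Ann, book)": {"Ann read the book"},
}
print(parse("the book was bought by Ann", orbits))    # -> buy(Ann, book)
```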

**18.-**Learning discovers the symmetries (that is, the semantic parser) from pairs of sentences with the same meaning. The goal is a minimal generating set whose compositions efficiently cover complex symmetries.
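
A toy sketch of symmetry discovery, far simpler than anything in the talk: propose single-word substitutions that explain same-meaning sentence pairs, deduplicating so each generator is kept only once:

```python
def candidate_generators(pairs):
    """Propose single-word substitutions from same-meaning sentence pairs,
    keeping each distinct generator once (a crude step toward minimality)."""
    generators = set()
    for a, b in pairs:
        wa, wb = a.split(), b.split()
        if len(wa) == len(wb):
            diffs = [(x, y) for x, y in zip(wa, wb) if x != y]
            if len(diffs) == 1:              # exactly one word differs
                generators.add(diffs[0])     # propose (x -> y) as a generator
    return generators

pairs = [("Ann bought the book", "Ann purchased the book"),
         ("Bob bought a car", "Bob purchased a car")]
print(candidate_generators(pairs))           # {('bought', 'purchased')}
```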

**19.-**Logical inference rules are symmetries of knowledge bases, so reasoning may also fit into this symmetry-based semantic parsing framework.

**20.-**Symmetry group theory is foundational for machine learning alongside probability, logic, optimization. Combining them unleashes its potential, as shown in these initial examples.

**21.-**Lack of paraphrase data is an issue, but machine translation corpora could help. Symmetry learning also applies naturally to vision-language connections.

**22.-**Symmetry algebra is general, not requiring perfect symmetries in the domain. It relates to recent work comparing deep vs shallow networks.

**23.-**Strong symmetries make data points equivalent, abstracting away distinguishing information permanently, which is often desirable for clearly separating target classes.

**24.-**Real-world problems often require "fuzzy" invariance, preserving discrimination for the end task. This can be handled via probabilistic orbits or discrimination optimization.
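
A hedged sketch of one reading of fuzzy invariance (my own toy construction, not the talk's formulation): instead of pooling a feature hard over an orbit, average it under a probability distribution over transformations, so transformations far from the identity contribute little and some discriminative information survives:

```python
import numpy as np

def soft_invariant_feature(x, transforms, probs, feature):
    """Average a feature under a distribution over transformations rather
    than pooling hard over the whole orbit."""
    return sum(p * feature(t(x)) for t, p in zip(transforms, probs))

x = np.array([1.0, 4.0, 2.0, 0.0])
shifts = [lambda v, k=k: np.roll(v, k) for k in range(4)]   # a toy orbit
probs = [0.7, 0.1, 0.1, 0.1]          # mass concentrated near the identity
print(soft_invariant_feature(x, shifts, probs, feature=lambda v: float(v[0])))
```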

**25.-**In summary, symmetry group theory provides a powerful foundation for representation learning and extends naturally to many areas, with rich potential.

Knowledge Vault built by David Vivancos 2024