Joan Bruna; Wojciech Zaremba; Arthur Szlam; Yann LeCun ICLR 2014 - Spectral Networks and Locally Connected Networks on Graphs

**Concept Graph & Resume using Claude 3 Opus | Chat GPT4 | Gemini Adv | Llama 3:**

```mermaid
graph LR
classDef limitations fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef goal fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef similarity fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef locally fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef challenge fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef preliminary fill:#f9d4d4, font-weight:bold, font-size:14px;
A["Joan Bruna et al"] --> B["Convolutional networks successful for grid-structured data. 1"]
A --> C["Limitations exist for non-grid data. 2"]
A --> D["Goal: input size-independent parameters. 3"]
A --> E["Graph convolution via Laplacian eigenvectors. 6"]
A --> F["Preliminary results validate the approach. 12"]
A --> G["Challenge: relating frequencies, defining smoothness. 11"]
B --> H["Similarity from sensing or data statistics. 4"]
B --> I["Locally connected networks learn neighborhoods. 5"]
E --> J["Linear operator commuting with Laplacian. 7"]
E --> K["Spectral learning needs input-size parameters. 8"]
E --> L["Localization, transforms suggest smooth filters. 9"]
E --> M["Smooth filters achieve constant parameters. 10"]
F --> N["Spectral CNN reduces parameters, maintains performance. 13"]
N --> O["Learned maps complementary, input-region-specific. 14"]
G --> P["1D frequency ordering works, similarity open. 11"]
G --> Q["Fourier transform expensive vs FFT. 16"]
G --> R["Irregular graph handling needed beyond examples. 17"]
D --> S["First step: exploit geometry for parameters. 15"]
D --> T["Open question: optimal frequency arrangement. 18"]
D --> U["Symmetries, structure should inform parameters. 19"]
D --> V["More work needed on optimal parameters. 20"]
class C limitations
class D goal
class H similarity
class I locally
class G,P,Q,R challenge
class F,N,O preliminary
```


**Resume:**

**1.-**Convolutional networks are successful for images and sounds due to grid structure, local statistics, and parameter efficiency.

**2.-**Limitations arise for non-grid data such as 3D meshes, spectrograms, and social networks, and even across feature channels, which standard architectures treat as unstructured.

**3.-**Goal is to learn layers where parameter count is independent of input size by treating signals as functions on graphs.

**4.-**Similarity between features can come from sensing process (e.g. 3D mesh distances) or be estimated from data statistics.
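
To make point 4 concrete, here is a minimal sketch (an illustrative construction, not the paper's exact recipe) of estimating a feature-similarity graph from data statistics: correlate features across samples and keep each feature's k strongest neighbors. The names `X` and `k` are hypothetical.

```python
# Sketch: estimate a similarity graph from data statistics (point 4).
# Assumption: X is a (num_samples, num_features) data matrix; k is an
# illustrative neighborhood size, not a value from the paper.
import numpy as np

def knn_similarity_graph(X, k=8):
    C = np.corrcoef(X, rowvar=False)           # feature-feature correlation
    np.fill_diagonal(C, 0.0)                   # drop self-similarity
    A = np.zeros_like(C)
    for i in range(C.shape[0]):
        nbrs = np.argsort(-np.abs(C[i]))[:k]   # k most correlated features
        A[i, nbrs] = np.abs(C[i, nbrs])
    return np.maximum(A, A.T)                  # symmetrize into an adjacency
```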

**5.-**Locally connected networks learn neighborhoods to capture local correlation, then reduce graph resolution and repeat, but their parameter count still scales with input size.
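
A minimal sketch of point 5, assuming a precomputed adjacency `A`: the layer's weights are masked to graph neighborhoods, so each output unit sees only a local receptive field. The masking scheme and nonlinearity are illustrative choices, not the paper's exact layer.

```python
# Sketch: one locally connected layer on a graph (point 5).
# Assumption: A is a symmetric adjacency matrix; W is a dense weight
# matrix that we mask down to the graph's edges plus self-loops.
import numpy as np

def locally_connected_forward(x, W, A):
    mask = (A > 0) | np.eye(A.shape[0], dtype=bool)
    return np.tanh((W * mask) @ x)             # local receptive fields

# Note: the number of free parameters equals the number of edges, which
# still grows with graph size -- the limitation point 5 mentions.
```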

**6.-**Convolution on graphs can be defined through the Laplacian eigenvectors, which generalize the Fourier basis (see the sketch after point 7).

**7.-**Convolution is defined as any linear operator that commutes with the Laplacian, i.e. diagonal in the Laplacian eigenbasis.
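
Points 6 and 7 admit a compact sketch. Assuming a symmetric adjacency `W`, the graph Fourier basis is the eigenbasis of the combinatorial Laplacian, and a spectral "convolution" multiplies a signal's coefficients in that basis; any such operator commutes with the Laplacian by construction. The function names here are ours.

```python
# Sketch: graph Fourier basis and spectral convolution (points 6-7).
import numpy as np

def graph_fourier_basis(W):
    """Eigendecompose the combinatorial Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    eigvals, U = np.linalg.eigh(L)             # columns of U: Fourier modes
    return eigvals, U

def spectral_conv(x, g, U):
    """y = U diag(g) U^T x: diagonal in the eigenbasis, hence it
    commutes with the Laplacian."""
    return U @ (g * (U.T @ x))
```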

**8.-**Learning filter coefficients directly in this spectral domain still requires parameter count proportional to input size.

**9.-**Analogy between spatial localization of signals and smoothness of their Fourier transforms suggests learning smooth spectral filters.

**10.-**Smooth spectral filters with finite spatial support require parameters proportional only to filter size, so the parameter count is constant in the input size.
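
Points 9 and 10 can be sketched as follows: parametrize the spectral multipliers by a fixed number of control values and interpolate smoothly across all `n` frequencies. The paper's smooth-kernel idea is in this spirit; the specific cubic-spline scheme below is an assumption.

```python
# Sketch: constant-parameter smooth spectral filters (points 9-10).
import numpy as np
from scipy.interpolate import CubicSpline

def smooth_spectral_filter(theta, n):
    """Expand q = len(theta) learnable control values into n multipliers."""
    q = len(theta)
    knots = np.linspace(0.0, 1.0, q)           # fixed control locations
    freqs = np.linspace(0.0, 1.0, n)           # one slot per eigenvalue
    return CubicSpline(knots, theta)(freqs)    # smooth in frequency =>
                                               # localized in space

g = smooth_spectral_filter(np.random.randn(8), n=1024)  # 8 params, 1024 modes
```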

**11.-**Challenge is relating frequencies to one another to define spectral smoothness; even a simple 1D ordering of frequencies works in practice, but the optimal similarity between frequencies is an open problem.
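
One way to read point 11, as a sketch: once eigenvalues are sorted into a 1D ordering, spectral smoothness can be scored by a squared-difference penalty on consecutive multipliers. This penalty form is our illustrative assumption; the paper leaves the right frequency similarity open.

```python
# Sketch: spectral smoothness under a 1D frequency ordering (point 11).
import numpy as np

def spectral_smoothness_penalty(g, eigvals):
    order = np.argsort(eigvals)                # sort modes by eigenvalue
    return np.sum(np.diff(g[order]) ** 2)      # penalize jumps between
                                               # adjacent frequencies
```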

**12.-**Preliminary results on subsampled MNIST and MNIST projected onto a 3D sphere validate the approach.

**13.-**Spectral CNN reduces parameters by 1-2 orders of magnitude vs fully connected nets without sacrificing performance.

**14.-**Learned spectral feature maps are complementary and concentrate energy in different input regions.

**15.-**First step in exploiting input geometry to learn with parameter count independent of input size.

**16.-**Computing the graph Fourier transform requires a dense matrix multiply, which is expensive compared to the FFT available on regular grids.
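
Point 16 in numbers: the graph Fourier transform is a dense matrix-vector product costing O(n^2), while regular grids enjoy the O(n log n) FFT. A rough sketch with a stand-in orthogonal basis:

```python
# Sketch: dense graph Fourier transform vs. grid FFT cost (point 16).
import numpy as np

n = 4096
U = np.linalg.qr(np.random.randn(n, n))[0]     # stand-in orthogonal basis
x = np.random.randn(n)

y_graph = U.T @ x                              # dense transform: ~n^2 flops
y_grid = np.fft.fft(x)                         # FFT: ~n log n flops
# At n = 4096: n^2 / (n * log2(n)) is roughly 341x more work for the
# dense transform.
```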

**17.-**Need to handle highly irregular graphs beyond simple examples shown.

**18.-**Open question on how to optimally arrange frequencies in dual domain for effective spatial localization.

**19.-**Symmetries and structure in original data should inform parameter allocation in neural networks.

**20.-**Much more work needed on optimally allocating network parameters by exploiting input structure and symmetries.

Knowledge Vault built by David Vivancos 2024