Knowledge Vault 2/8 - ICLR 2014-2023
Joan Bruna; Wojciech Zaremba; Arthur Szlam; Yann LeCun ICLR 2014 - Spectral Networks and Locally Connected Networks on Graphs

Concept Graph & Resume using Claude 3 Opus | Chat GPT4 | Gemini Adv | Llama 3:

graph LR
  classDef limitations fill:#f9d4d4, font-weight:bold, font-size:14px;
  classDef goal fill:#f9d4d4, font-weight:bold, font-size:14px;
  classDef similarity fill:#f9d4d4, font-weight:bold, font-size:14px;
  classDef locally fill:#f9d4d4, font-weight:bold, font-size:14px;
  classDef challenge fill:#f9d4d4, font-weight:bold, font-size:14px;
  classDef preliminary fill:#f9d4d4, font-weight:bold, font-size:14px;
  A[Joan Bruna et al] --> B[Convolutional networks successful for grid-structured data. 1]
  A --> C[Limitations exist for non-grid data. 2]
  A --> D[Goal: input size-independent parameters. 3]
  A --> E[Graph convolution via Laplacian eigenvectors. 6]
  A --> F[Preliminary results validate the approach. 12]
  A --> G[Challenge: relating frequencies, defining smoothness. 11]
  B --> H[Similarity from sensing or data statistics. 4]
  B --> I[Locally connected networks learn neighborhoods. 5]
  E --> J[Linear operator commuting with Laplacian. 7]
  E --> K[Spectral learning needs input-size parameters. 8]
  E --> L[Localization, transforms suggest smooth filters. 9]
  E --> M[Smooth filters achieve constant parameters. 10]
  F --> N[Spectral CNN reduces parameters, maintains performance. 13]
  N --> O[Learned maps complementary, input-region-specific. 14]
  G --> P[1D frequency ordering works, similarity open. 11]
  G --> Q[Fourier transform expensive vs FFT. 16]
  G --> R[Irregular graph handling needed beyond examples. 17]
  D --> S[First step: exploit geometry for parameters. 15]
  D --> T[Open question: optimal frequency arrangement. 18]
  D --> U[Symmetries, structure should inform parameters. 19]
  D --> V[More work needed on optimal parameters. 20]
  class C limitations;
  class D goal;
  class H similarity;
  class I locally;
  class G,P,Q,R challenge;
  class F,N,O preliminary;

Resume:

1.-Convolutional networks are successful for images and sounds due to grid structure, local statistics, and parameter efficiency.

2.-Limitations exist for non-grid data such as 3D meshes, spectrograms, and social networks; standard architectures also fail to exploit structure across feature channels.

3.-Goal is to learn layers where parameter count is independent of input size by treating signals as functions on graphs.

4.-Similarity between features can come from sensing process (e.g. 3D mesh distances) or be estimated from data statistics.
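
To make point 4 concrete, here is a minimal sketch of the second option (estimating feature similarity from data statistics), assuming NumPy and an illustrative data matrix; the thresholded-correlation construction is one simple choice, not the paper's exact recipe:

```python
import numpy as np

def similarity_graph(X, threshold=0.6):
    """Build a feature-similarity graph from data statistics.

    X: (n_samples, n_features) data matrix (illustrative input).
    Returns a symmetric adjacency W with W[i, j] = 1 when the absolute
    correlation between features i and j exceeds the threshold.
    """
    corr = np.corrcoef(X, rowvar=False)          # (n_features, n_features)
    W = (np.abs(corr) > threshold).astype(float)
    np.fill_diagonal(W, 0.0)                     # no self-loops
    return W
```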

5.-Locally connected networks learn neighborhoods to capture local correlation, then reduce graph resolution and repeat, but still scale with size.
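
A rough sketch of the locally connected idea in point 5, under the assumption that each output node gets its own weight vector over a precomputed neighborhood list (the class and argument names are illustrative, not from the paper's code):

```python
import numpy as np

class LocallyConnectedLayer:
    """Per-node weights restricted to local graph neighborhoods.

    neighbors[i] lists the input nodes connected to output node i
    (e.g. the strongest entries of a similarity graph). Weights are
    not shared across nodes, so the parameter count still grows with
    the graph size -- the limitation point 5 mentions.
    """
    def __init__(self, neighbors, seed=0):
        rng = np.random.default_rng(seed)
        self.neighbors = neighbors
        self.weights = [0.1 * rng.standard_normal(len(nb)) for nb in neighbors]

    def forward(self, x):
        # x: (n_nodes,) signal on the graph -> (n_outputs,) activations
        return np.array([np.tanh(w @ x[nb])
                         for w, nb in zip(self.weights, self.neighbors)])
```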

6.-Convolution on graphs can be defined through Laplacian eigenvectors that generalize the Fourier basis.

7.-Convolution is defined as any linear operator that commutes with the Laplacian, i.e. diagonal in the Laplacian eigenbasis.
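
Points 6-7 can be sketched directly: build the graph Laplacian from the adjacency matrix, take its eigenvectors as the generalized Fourier basis, and filter a signal by pointwise multiplication in that basis. A minimal NumPy version, assuming a dense symmetric adjacency matrix:

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W of a symmetric adjacency W."""
    return np.diag(W.sum(axis=1)) - W

def spectral_conv(x, W, g):
    """Filter a graph signal x with spectral multipliers g.

    x: (n,) signal on the nodes; g: (n,) one multiplier per eigenvalue.
    The operator U diag(g) U^T is diagonal in the Laplacian eigenbasis
    and therefore commutes with L -- the definition of convolution in point 7.
    """
    eigvals, U = np.linalg.eigh(graph_laplacian(W))  # columns of U = graph Fourier basis
    x_hat = U.T @ x                                  # graph Fourier transform
    return U @ (g * x_hat)                           # filter, then transform back
```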

8.-Learning filter coefficients directly in this spectral domain still requires parameter count proportional to input size.

9.-Analogy between spatial localization of signals and smoothness of their Fourier transforms suggests learning smooth spectral filters.

10.-Smooth spectral filters with finite spatial support require parameters proportional only to filter size, achieving constant parameter count.
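
Points 9-10 amount to parameterizing those spectral multipliers with a small, fixed number of coefficients and interpolating them smoothly along the eigenvalue axis, so the learnable parameter count no longer depends on the input size. A hedged sketch using cubic-spline interpolation (the paper uses a fixed cubic-spline kernel; the uniform knot placement here is an assumption):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def smooth_spectral_filter(alpha, eigvals):
    """Expand k learnable coefficients into n smooth spectral multipliers.

    alpha: (k,) learnable parameters, with k fixed and independent of n.
    eigvals: (n,) sorted Laplacian eigenvalues (a 1D ordering of 'frequencies').
    Returns g: (n,) smooth multipliers usable with spectral_conv above.
    """
    knots = np.linspace(eigvals.min(), eigvals.max(), len(alpha))
    return CubicSpline(knots, alpha)(eigvals)
```

Because the multipliers vary smoothly with the eigenvalue, the corresponding spatial filter decays quickly, mirroring the localization-smoothness duality of point 9.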

11.-A key challenge is how to relate frequencies when defining spectral smoothness; even a simple 1D ordering of the eigenvalues works, but the best similarity measure between frequencies remains an open problem.

12.-Preliminary results on subsampled MNIST and MNIST projected on 3D sphere validate the approach.

13.-Spectral CNN reduces parameters by 1-2 orders of magnitude vs fully connected nets without sacrificing performance.

14.-Learned spectral feature maps are complementary and concentrate energy in different input regions.

15.-First step in exploiting input geometry to learn with parameter count independent of input size.

16.-Computing the graph Fourier transform is still expensive, a dense O(n²) matrix multiplication with no FFT-style fast algorithm on general graphs.
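
The cost gap in point 16 is between a dense matrix-vector product with the eigenvector matrix (O(n²) on a general graph) and the O(n log n) FFT available on regular grids; a tiny illustration, with a stand-in orthonormal basis in place of actual Laplacian eigenvectors:

```python
import numpy as np

n = 1024
x = np.random.randn(n)

# Grid case: fast Fourier transform, O(n log n).
x_hat_grid = np.fft.fft(x)

# General graph case: dense multiplication by the (stand-in) eigenvector
# matrix U, which costs O(n^2) work and O(n^2) memory just to store U.
U, _ = np.linalg.qr(np.random.randn(n, n))
x_hat_graph = U.T @ x
```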

17.-Need to handle highly irregular graphs beyond simple examples shown.

18.-Open question on how to optimally arrange frequencies in dual domain for effective spatial localization.

19.-Symmetries and structure in original data should inform parameter allocation in neural networks.

20.-Much more work needed on optimally allocating network parameters by exploiting input structure and symmetries.

Knowledge Vault built by David Vivancos 2024