G-Mixup: Graph Data Augmentation for Graph Classification

Xiaotian Han · Zhimeng Jiang · Ninghao Liu · Xia Hu

**Concept Graph & Resume using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:**

graph LR
classDef augmentation fill:#d4f9d4, font-weight:bold, font-size:14px
classDef graphTheory fill:#f9d4d4, font-weight:bold, font-size:14px
classDef robustness fill:#d4d4f9, font-weight:bold, font-size:14px
classDef classification fill:#f9f9d4, font-weight:bold, font-size:14px
%% neutral gray for uncategorized nodes
classDef others fill:#e8e8e8, font-weight:bold, font-size:14px
A[G-Mixup: Graph Data Augmentation for Graph Classification]
A --> B[G-Mixup: graph augmentation by interpolation. 1]
A --> C[Graphon: continuous function for large graphs. 2]
A --> D[Graph classification: labeling entire graphs. 3]
A --> E[Graph neural networks: deep learning for graphs. 4]
A --> F[Data augmentation: increase training data diversity. 5]
B --> G[Homomorphism density: frequency of subgraph patterns. 6]
G --> H[Discriminative motif: subgraph determining class. 7]
H --> I[Cut norm: structural similarity measure. 8]
H --> J[Step function: approximates graphons. 9]
G --> K[Graph generation: creating synthetic graphs. 10]
C --> L[Manifold intrusion: synthetic examples conflict. 11]
L --> M[Model robustness: performance under perturbations. 12]
M --> N[Node/edge perturbation: modify graph properties. 13]
N --> O[Subgraph sampling: extract subgraphs. 14]
L --> P[Graphon estimation: infer from data. 15]
C --> Q[Weak regularity lemma: approximate graphons. 16]
Q --> R[Stochastic block model: random graphs with communities. 17]
R --> S[Graph pooling: aggregate node features. 18]
R --> T[Mixup: interpolate features and labels. 19]
S --> U[Label corruption: random label changes. 20]
A --> V[Topology corruption: modify graph structure. 21]
V --> W[Open Graph Benchmark: datasets for graph tasks. 22]
W --> X[Molecular property prediction: predict molecule properties. 23]
X --> Y[Graph isomorphism: structural equivalence. 24]
A --> Z[Batch normalization: stabilize training. 25]
Z --> AA[Dropout: regularization by deactivation. 26]
AA --> AB[Adam optimizer: optimization algorithm. 27]
AB --> AC[AUROC: binary classification metric. 28]
AC --> AD[Statistical significance: p-values for results. 29]
A --> AE[Hyperparameter sensitivity: performance with different hyperparameters. 30]
class B,F,G,H,K augmentation
class C,L,M,N,O,P,Q,R,S,T graphTheory
class L,M,N,O robustness
class W,X,Y classification
class V,Z,AA,AB,AC,AD,AE others

**Resume:**

**1.-** G-Mixup: A graph data augmentation method that interpolates graphons (graph generators) of different graph classes to create synthetic graphs for training.
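
A minimal NumPy sketch of the idea, assuming graphs arrive as dense adjacency matrices. The degree-sorted, zero-padded averaging used here as a graphon estimator and all function names are illustrative simplifications, not the paper's implementation (which builds on established step-function graphon estimators):

```python
import numpy as np

def estimate_graphon(adjs, k):
    """Naive step-function estimate: sort nodes by degree, zero-pad
    each adjacency matrix to k x k, and average over the class."""
    aligned = []
    for a in adjs:
        order = np.argsort(-a.sum(axis=1))        # align nodes by degree
        a = a[np.ix_(order, order)]
        pad = np.zeros((k, k))
        n = min(len(a), k)
        pad[:n, :n] = a[:n, :n]
        aligned.append(pad)
    return np.mean(aligned, axis=0)               # entries lie in [0, 1]

def g_mixup(adjs_a, adjs_b, y_a, y_b, lam=0.5, k=50, n_nodes=30, rng=None):
    """Interpolate the two class-level graphons, then sample a synthetic
    graph and a correspondingly mixed soft label."""
    rng = rng or np.random.default_rng()
    w = lam * estimate_graphon(adjs_a, k) + (1 - lam) * estimate_graphon(adjs_b, k)
    u = rng.integers(0, k, size=n_nodes)          # latent position -> step index
    probs = w[np.ix_(u, u)]                       # pairwise edge probabilities
    adj = np.triu(rng.uniform(size=(n_nodes, n_nodes)) < probs, k=1).astype(int)
    return adj + adj.T, lam * y_a + (1 - lam) * y_b
```

Because synthetic graphs are sampled from the interpolated generator rather than built by aligning node sets, graphs of different sizes can be mixed directly, which node-level interpolation cannot do.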

**2.-** Graphon: A symmetric measurable function W(x, y) on [0,1]^2 with values in [0,1], representing the limiting behavior of a sequence of large dense graphs; it can be used as a graph generator.

**3.-** Graph classification: The task of assigning class labels to entire graphs rather than individual nodes.

**4.-** Graph neural networks (GNNs): Deep learning models designed to process graph-structured data.

**5.-** Data augmentation: Techniques to artificially increase training data size and diversity to improve model performance and generalization.

**6.-** Homomorphism density: A measure of the frequency of subgraph patterns in a graph or graphon.
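
For a simple graph F and a graphon W, the standard definition from the graph-limits literature is

$$t(F, W) = \int_{[0,1]^{|V(F)|}} \prod_{(i,j) \in E(F)} W(x_i, x_j) \prod_{v \in V(F)} dx_v,$$

i.e., the probability that a random placement of F's vertices into [0,1] realizes all of F's edges under W.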

**7.-** Discriminative motif: The minimal subgraph structure that can determine a graph's class label.

**8.-** Cut norm: A measure used to quantify the structural similarity between graphons.
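
The usual definition takes a supremum over measurable subsets of [0,1]:

$$\lVert W \rVert_\square = \sup_{S, T \subseteq [0,1]} \left| \int_{S \times T} W(x, y)\, dx\, dy \right|.$$

A small cut distance between two graphons implies that graphs sampled from them have similar densities of all subgraph patterns.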

**9.-** Step function: A piecewise constant function used to approximate graphons in practice.

**10.-** Graph generation: The process of creating synthetic graphs from a graphon or other generative model.
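
A minimal W-random graph sampler for a graphon given as a Python callable; a sketch, not tied to any particular library:

```python
import numpy as np

def sample_w_random_graph(w, n, rng=None):
    """Draw latent positions u_i ~ Uniform[0,1], then connect each pair
    (i, j) independently with probability w(u_i, u_j)."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(size=n)
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            adj[i, j] = adj[j, i] = rng.uniform() < w(u[i], u[j])
    return adj

# Example: the constant graphon w = 0.3 recovers an Erdos-Renyi G(n, 0.3)
g = sample_w_random_graph(lambda x, y: 0.3, n=20)
```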

**11.-** Manifold intrusion: An issue in mixup methods where synthetic examples conflict with original training data labels.

**12.-** Model robustness: The ability of a model to maintain performance under various perturbations or corruptions of input data.

**13.-** Node/edge perturbation: Graph augmentation techniques that modify node or edge properties of existing graphs.

**14.-** Subgraph sampling: A graph augmentation method that extracts subgraphs from larger graph structures.
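
A one-function sketch of node-induced subgraph sampling on an adjacency matrix (names and the keep ratio are illustrative):

```python
import numpy as np

def sample_subgraph(adj, keep_ratio=0.8, rng=None):
    """Keep a random subset of nodes and the edges among them."""
    rng = rng or np.random.default_rng()
    keep = rng.choice(len(adj), size=max(1, int(keep_ratio * len(adj))), replace=False)
    return adj[np.ix_(keep, keep)]
```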

**15.-** Graphon estimation: Techniques to infer the underlying graphon from observed graph data.

**16.-** Weak regularity lemma: A theorem guaranteeing that any graphon can be well-approximated in the cut norm by a step function.
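
One common statement (constants vary across sources): for every graphon W and every k >= 1 there exists a step function W_k with at most k steps such that

$$\lVert W - W_k \rVert_\square \le \frac{C}{\sqrt{\log k}},$$

with C a universal constant. This is what justifies approximating an estimated graphon by a stochastic-block-model-like step function in practice.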

**17.-** Stochastic block model: A probabilistic model for generating random graphs with community structure.
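
A two-community SBM is exactly a 2x2 step-function graphon; a minimal sampler as a sketch:

```python
import numpy as np

def sample_sbm(sizes, p, rng=None):
    """sizes: nodes per community; p: block edge-probability matrix."""
    rng = rng or np.random.default_rng()
    labels = np.repeat(np.arange(len(sizes)), sizes)
    probs = p[np.ix_(labels, labels)]              # pairwise edge probabilities
    adj = np.triu(rng.uniform(size=probs.shape) < probs, k=1).astype(int)
    return adj + adj.T

# Dense within communities (0.8), sparse across (0.1)
adj = sample_sbm([10, 10], np.array([[0.8, 0.1], [0.1, 0.8]]))
```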

**18.-** Graph pooling: Methods to aggregate node-level features into graph-level representations for classification tasks.
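
Mean pooling in NumPy, assuming PyTorch Geometric-style batching where a graph-id vector maps each node to its graph (a sketch; GNN libraries ship this as a built-in):

```python
import numpy as np

def mean_pool(node_feats, graph_ids):
    """(num_nodes, d) node features -> (num_graphs, d) graph embeddings."""
    num_graphs = int(graph_ids.max()) + 1
    out = np.zeros((num_graphs, node_feats.shape[1]))
    np.add.at(out, graph_ids, node_feats)          # sum features per graph
    counts = np.bincount(graph_ids, minlength=num_graphs)
    return out / counts[:, None]                   # assumes no empty graphs
```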

**19.-** Mixup: A data augmentation technique that linearly interpolates features and labels between training examples.
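
The original formulation (Zhang et al., 2018) in a few lines, with the mixing weight drawn from a Beta distribution:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Convex combination of two inputs and their one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```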

**20.-** Label corruption: A robustness test where a portion of training labels are randomly changed.
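
A sketch of the corruption protocol (the exact ratio and reassignment scheme are experiment-specific assumptions):

```python
import numpy as np

def corrupt_labels(y, ratio, num_classes, rng=None):
    """Reassign a random fraction of labels to uniformly random classes."""
    rng = rng or np.random.default_rng()
    y = y.copy()
    idx = rng.choice(len(y), size=int(ratio * len(y)), replace=False)
    y[idx] = rng.integers(0, num_classes, size=len(idx))
    return y
```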

**21.-** Topology corruption: A robustness test where graph structure (edges) is randomly modified.
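
A sketch that flips a random fraction of node pairs, i.e. removes existing edges and adds new ones, keeping the adjacency matrix symmetric:

```python
import numpy as np

def corrupt_topology(adj, ratio, rng=None):
    """Flip a fraction of upper-triangular entries, then mirror them."""
    rng = rng or np.random.default_rng()
    adj = adj.copy()
    iu, ju = np.triu_indices(len(adj), k=1)
    idx = rng.choice(len(iu), size=int(ratio * len(iu)), replace=False)
    adj[iu[idx], ju[idx]] = 1 - adj[iu[idx], ju[idx]]
    adj[ju[idx], iu[idx]] = adj[iu[idx], ju[idx]]
    return adj
```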

**22.-** Open Graph Benchmark (OGB): A collection of benchmark datasets for various graph machine learning tasks.

**23.-** Molecular property prediction: A graph classification task to predict properties of molecules represented as graphs.

**24.-** Graph isomorphism: The concept of structural equivalence between graphs, relevant for designing GNN architectures.

**25.-** Batch normalization: A technique to stabilize neural network training by normalizing layer inputs.

**26.-** Dropout: A regularization technique that randomly deactivates neural network units during training.

**27.-** Adam optimizer: A popular optimization algorithm for training neural networks.

**28.-** Area Under Receiver Operating Characteristic (AUROC): A performance metric for binary classification tasks.
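
It is usually computed directly with scikit-learn:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]        # predicted scores for the positive class
print(roc_auc_score(y_true, y_score))  # 0.75
```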

**29.-** Statistical significance: The use of p-values to assess whether observed results could plausibly have arisen by chance alone.

**30.-** Hyperparameter sensitivity: Analysis of how model performance changes with different hyperparameter values.

Knowledge Vault built by David Vivancos 2024