Concept Graph & Resume using Claude 3 Opus | Chat GPT4 | Gemini Adv | Llama 3:
Resume:
1.-Rosanne, Krystal, and Tom organized the Tiny Papers initiative to provide an impactful alternative format for early-stage researchers to engage with the ICLR community.
2.-Tiny Papers received over 200 submissions, requiring the recruitment of meta-reviewers, area chairs, and emergency reviewers to handle the volume.
3.-The goals were to give junior researchers feedback, archive their work, and build community. Submissions spanned many machine learning topics.
4.-Accepted papers fell into four decision categories: invite to present (notable), invite to present, invite to archive, and invite to revise.
5.-The schedule includes poster sessions, breaks for discussion, flash orals of 3 minutes each, and a dinner for in-person attendees.
6.-Quantum federated learning involves clients with quantum computing capabilities. The paper proposes a post-quantum cryptographic signature scheme and dynamic server selection to address security and failure risks.
7.-Point-to-Sequence Soft Attention adds multi-head cross-attention to combine visual and text representations, improving over concatenation and co-attention in vision-language tasks (see the fusion sketch after this list).
8.-The SIMBA-ML Python framework provides a toolbox for model-informed machine learning, using simulation results from differential equations to generate synthetic data.
9.-Iteratively pruning neural networks with the SparseGPT method finds strong subnetworks faster and without the expensive resampling that magnitude pruning requires (the magnitude-pruning baseline is sketched after this list).
10.-Hippocampal place cells exhibit theta sequences: the decoded position sweeps from behind to ahead of the animal, enabling efficient credit assignment across compressed states.
11.-Multi-channel graph attention uses multiple attention mechanisms, one per graph feature channel, to handle multi-channel graph data, wrapped in an encoder/decoder for efficiency.
12.-SoftEDA applies label smoothing to the augmented examples produced by EDA text data augmentation, improving model performance on text classification tasks (see the sketch after this list).
13.-FitKernel uses parallel sparse convolutional kernels to increase the receptive field and improve the transferability of graph convolutional networks under non-IID node feature distributions.
14.-In lifting-based 3D human pose estimation, minimizing the commonly used mean per-joint position error (MPJPE) metric leads to miscalibrated predictive distributions.
15.-Theta sequences in the hippocampus enable credit assignment for reward learning by compressing experienced states to match short synaptic eligibility traces.
16.-Compound token embedding using cross-attention improves multimodal fusion in vision-language models compared to concatenation and co-attention.
17.-Tiny attention uses the SVD of an asymmetric word co-occurrence matrix to learn contextual word vectors as an alternative to transformer attention (see the sketch after this list).
18.-Decompositions of causality and fairness metrics reveal the impact of variable dependencies and allow diagnosing the sources of unfairness in models.
19.-Unsupervised syntagmatic-paradigmatic word embeddings (SPVec) learn word associations and improve contextual embeddings by selecting associated context words to disambiguate meaning.
20.-Large language models can engage in multi-step diagnostic reasoning via question-answering with patients when prompted with exemplars of the reasoning process.
21.-Geodesic mode connectivity identifies low-loss paths between independently trained narrow neural networks by optimization in the space of output distributions.
22.-Fast Fourier convolutions can improve self-supervised image denoising, especially for images with sharp contrasting edges like Chinese characters.
23.-Object detection models like YOLOv5 can be tuned to handle incomplete annotations in histopathology data, improving performance with less labeled data.
24.-Generative methods can probe differences between comparably performing classifiers by optimizing for input data points that maximize prediction divergence (see the sketch after this list).
25.-Compressing federated learning model updates according to their underlying information structure, using quantization, entropy coding, and run-length encoding, improves communication efficiency (see the sketch after this list).
26.-An ensemble of pre-trained models outperforms single models in classifying breast cancer stage from histopathology images despite domain shift.
27.-Personalized federated learning that uses hypernetworks to generate client-specific model weights improves segmentation performance in a multi-hospital collaboration scenario (see the sketch after this list).
28.-VAEs exhibit a polarized regime in which active latent variables determine the reconstructions while passive variables collapse to the prior; this has implications for disentanglement.
29.-MetaXL enhances cross-lingual transfer by meta-learning from multiple source languages, using multi-armed bandits to sample harder source languages for better generalization.
30.-Main takeaways include the success of the Tiny Papers format, the diversity of the presented work, and the importance of efficient, secure, explainable, and generalizable ML methods.
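Illustrative code sketches (minimal illustrations of the summarized ideas, not the authors' implementations):

Sketch for item 7 (multi-head cross-attention fusion). A minimal PyTorch sketch in which text tokens attend to visual tokens; the dimensions, the query/key/value assignment, and the residual plus layer-norm wrapper are assumptions for illustration, not details from the paper.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse text and visual token sequences with multi-head cross-attention.

    Text tokens act as queries; visual tokens act as keys/values. This is a
    generic sketch of the fusion pattern, not the paper's exact architecture.
    """
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens, visual_tokens):
        # text_tokens:   (batch, n_text, dim)
        # visual_tokens: (batch, n_visual, dim)
        fused, _ = self.attn(query=text_tokens, key=visual_tokens, value=visual_tokens)
        return self.norm(text_tokens + fused)  # residual connection

# Toy usage with random features standing in for real encoders.
fusion = CrossAttentionFusion()
text = torch.randn(2, 10, 256)
image = torch.randn(2, 49, 256)
print(fusion(text, image).shape)  # torch.Size([2, 10, 256])
```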
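Sketch for item 9 (iterative magnitude pruning). This sketches the magnitude-pruning baseline that the summary contrasts with SparseGPT; the sparsity schedule and the optional fine-tuning callback are placeholders, and the retraining between rounds is the expense that one-shot methods avoid.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries until `sparsity` fraction is zero."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def iterative_magnitude_pruning(weights, final_sparsity=0.9, steps=5, finetune=None):
    """Prune in several rounds, optionally fine-tuning between rounds.

    `finetune` is a placeholder callback standing in for the retraining that
    makes iterative pruning expensive compared to one-shot methods.
    """
    w = weights.copy()
    for step in range(1, steps + 1):
        sparsity = final_sparsity * step / steps
        w = magnitude_prune(w, sparsity)
        if finetune is not None:
            w = finetune(w)
    return w

w = np.random.randn(64, 64)
pruned = iterative_magnitude_pruning(w)
print("sparsity:", np.mean(pruned == 0.0))
```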
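Sketch for item 12 (SoftEDA). A toy illustration of pairing one EDA operation (random swap) with a label-smoothed target for the augmented example; the smoothing strength and the use of a single EDA operation are assumptions, not the paper's settings.

```python
import random
import numpy as np

def random_swap(tokens, n_swaps=1):
    """A toy stand-in for one EDA operation: randomly swap two tokens."""
    tokens = tokens.copy()
    for _ in range(n_swaps):
        if len(tokens) < 2:
            break
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def smoothed_label(label: int, num_classes: int, epsilon: float = 0.1):
    """Label-smoothed target: augmented text may no longer perfectly match its label."""
    target = np.full(num_classes, epsilon / (num_classes - 1))
    target[label] = 1.0 - epsilon
    return target

def soft_eda_example(text: str, label: int, num_classes: int):
    """Return (augmented_text, soft_target); original examples keep hard one-hot labels."""
    aug_tokens = random_swap(text.split())
    return " ".join(aug_tokens), smoothed_label(label, num_classes)

print(soft_eda_example("the movie was surprisingly good", label=1, num_classes=2))
```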
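Sketch for item 17 (tiny attention via SVD). A NumPy illustration of building an asymmetric (directed) co-occurrence matrix and factorizing it with a truncated SVD to obtain word vectors; the window size, toy corpus, and the U*S embedding convention are assumptions.

```python
import numpy as np

# Toy corpus; a real setting would use a large corpus and a sliding window.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "a cat and a dog played",
]
tokens = [s.split() for s in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Asymmetric (directed) co-occurrence counts: rows = target word,
# columns = word appearing within `window` positions to its right.
window = 2
C = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(i + 1, min(i + 1 + window, len(sent))):
            C[idx[w], idx[sent[j]]] += 1

# Truncated SVD gives low-dimensional word vectors; using U*S as embeddings
# is one common convention (an assumption here, not necessarily the paper's).
U, S, Vt = np.linalg.svd(C, full_matrices=False)
dim = 4
word_vectors = U[:, :dim] * S[:dim]
print(word_vectors[idx["cat"]])
```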
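Sketch for item 24 (probing classifier differences). A minimal PyTorch loop that optimizes an input to maximize the KL divergence between two classifiers' predictions; the toy models and direct input optimization stand in for the generative search described in the summary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two toy classifiers standing in for "comparably performing" models.
torch.manual_seed(0)
model_a = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))
model_b = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))

def divergence(x):
    """KL(q_b || p_a): divergence of model_a's prediction from model_b's."""
    log_p_a = F.log_softmax(model_a(x), dim=-1)
    q_b = F.softmax(model_b(x), dim=-1)
    return F.kl_div(log_p_a, q_b, reduction="batchmean")

# Optimize the input itself to maximize disagreement; a generative model
# could be used instead of raw feature-space optimization.
x = torch.randn(1, 10, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = -divergence(x)  # maximize divergence = minimize its negative
    loss.backward()
    opt.step()

print("final divergence:", divergence(x).item())
```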
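Sketch for item 25 (compressing federated updates). A NumPy illustration of uniform quantization followed by run-length encoding of a sparse update; the entropy-coding stage mentioned in the summary is only noted in a comment, and all sizes are illustrative.

```python
import numpy as np

def quantize(update: np.ndarray, num_levels: int = 16):
    """Uniformly quantize a model update to a small number of integer levels."""
    lo, hi = update.min(), update.max()
    scale = (hi - lo) / (num_levels - 1) if hi > lo else 1.0
    q = np.round((update - lo) / scale).astype(np.int64)
    return q, lo, scale

def run_length_encode(values):
    """Run-length encode a 1-D integer sequence; quantized updates often contain
    long runs of the zero level after sparsification."""
    runs = []
    prev, count = values[0], 1
    for v in values[1:]:
        if v == prev:
            count += 1
        else:
            runs.append((int(prev), count))
            prev, count = v, 1
    runs.append((int(prev), count))
    return runs

# Toy update: mostly zeros, as after sparsification of a gradient/update.
update = np.zeros(1000)
update[::50] = np.random.randn(20) * 0.1
q, lo, scale = quantize(update)
runs = run_length_encode(q)
print(f"{update.size} floats -> {len(runs)} (value, run-length) pairs")
# A real codec would additionally entropy-code the run symbols
# (e.g. Huffman or arithmetic coding); that step is omitted here.
```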
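Sketch for item 27 (hypernetwork-based personalized federated learning). A minimal PyTorch hypernetwork that maps a learned client embedding to the weights of a small client-specific head; the layer sizes and the shared-encoder setup are assumptions for illustration.

```python
import torch
import torch.nn as nn

class HyperNetwork(nn.Module):
    """Map a learned client embedding to the weights of a small per-client head.

    The server keeps the hypernetwork and client embeddings; each client
    receives only its generated weights. Sizes here are illustrative.
    """
    def __init__(self, num_clients: int, embed_dim: int = 16,
                 in_features: int = 64, out_features: int = 2):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.client_embed = nn.Embedding(num_clients, embed_dim)
        n_params = in_features * out_features + out_features  # weight + bias
        self.generator = nn.Sequential(
            nn.Linear(embed_dim, 128), nn.ReLU(), nn.Linear(128, n_params)
        )

    def forward(self, client_id: torch.Tensor):
        # client_id: shape (1,) holding a single client's index.
        params = self.generator(self.client_embed(client_id))[0]
        w_end = self.in_features * self.out_features
        weight = params[:w_end].reshape(self.out_features, self.in_features)
        bias = params[w_end:]
        return weight, bias

# Generate a client-specific linear head and apply it to shared features.
hyper = HyperNetwork(num_clients=5)
weight, bias = hyper(torch.tensor([3]))
features = torch.randn(8, 64)      # e.g. from a shared encoder
logits = features @ weight.t() + bias
print(logits.shape)                # torch.Size([8, 2])
```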
Knowledge Vault built by David Vivancos 2024