Knowledge Vault 5/46 - CVPR 2019
Meta-Learning With Differentiable Convex Optimization
Kwonjoon Lee; Subhransu Maji; Avinash Ravichandran; Stefano Soatto
< Resume Image >

Concept Graph & Resume using Claude 3 Opus | ChatGPT-4o | Llama 3:

```mermaid
graph LR
    classDef main fill:#f9d4d4, font-weight:bold, font-size:14px
    classDef fewshot fill:#d4f9d4, font-weight:bold, font-size:14px
    classDef metalearning fill:#d4d4f9, font-weight:bold, font-size:14px
    classDef results fill:#f9f9d4, font-weight:bold, font-size:14px
    A[Meta-Learning With Differentiable<br>Convex Optimization] --> B[Few-shot: Generalize from few samples. 1]
    A --> C[Meta-learning: Embeddings generalize across tasks. 3]
    C --> D[Process: Model, meta-loss, backpropagate, meta-learn. 4]
    A --> E[Linear predictors: SVM, regression in network. 5]
    E --> F[SVM beats nearest neighbor: Adaptive, scalable. 6]
    E --> G[SVM gradient: Closed-form, leveraging convexity. 7]
    E --> H[Dual formulation: Computational efficiency, linear combination. 8]
    B --> I[Prototypical nets: Average, classify by prototype. 2]
    A --> J[Results]
    J --> K[miniImageNet, tieredImageNet: MetaOptNet improves accuracy. 9]
    J --> L[CIFAR-FS, FC100: Similar gains, larger gaps. 10]
    J --> M[Meta-training shot: More shots, one-time training. 11]
    class A main
    class B,I fewshot
    class C,D metalearning
    class J,K,L,M results
```

Resume:

1.- Few-shot classification: Computing classification models that generalize to unseen test sets, given few training samples per category.
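A minimal sketch of how such a few-shot task (an "episode") might be assembled, assuming a dataset organized as a class-to-images dict; the helper name `sample_episode` and the layout are illustrative, not from the paper:

```python
import random

def sample_episode(dataset, n_way=5, k_shot=1, q_queries=15):
    """Draw an N-way K-shot episode: a support set with K labeled samples
    per class, plus a held-out query set used to measure generalization.
    `dataset` is assumed to map class name -> list of image tensors."""
    classes = random.sample(list(dataset.keys()), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        images = random.sample(dataset[cls], k_shot + q_queries)
        support += [(img, label) for img in images[:k_shot]]
        query += [(img, label) for img in images[k_shot:]]
    return support, query
```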

2.- Prototypical networks: Embeds training samples, computes class prototypes by averaging, classifies test examples based on nearest prototype.
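A minimal sketch of the prototypical-network decision rule (squared Euclidean distance to class means, as in the standard formulation; the embedding network itself is assumed given):

```python
import torch

def prototype_logits(support_emb, support_labels, query_emb, n_way):
    """support_emb: (N*K, D), support_labels: (N*K,) ints in [0, n_way),
    query_emb: (Q, D). Returns (Q, n_way) logits = negative squared distances."""
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0)  # class mean = prototype
        for c in range(n_way)
    ])                                                # (n_way, D)
    return -torch.cdist(query_emb, prototypes) ** 2   # nearest prototype wins
```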

3.- Meta-learning objective: Learning feature embeddings that generalize well across tasks when used with nearest class prototype classifier.

4.- Meta-learning process: Compute classification model, calculate meta loss measuring generalization error, meta-learn embedding by backpropagating error signal across tasks.
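One meta-training step under these assumptions, reusing the hypothetical `sample_episode` and `prototype_logits` helpers above (the paper swaps the prototype classifier for a convex base learner, but the loop is the same):

```python
import torch
import torch.nn.functional as F

def meta_train_step(embed_net, optimizer, dataset, n_way=5, k_shot=1):
    support, query = sample_episode(dataset, n_way, k_shot)
    xs = torch.stack([x for x, _ in support]); ys = torch.tensor([y for _, y in support])
    xq = torch.stack([x for x, _ in query]);   yq = torch.tensor([y for _, y in query])
    # 1) Fit the task's classification model on the support embeddings.
    logits = prototype_logits(embed_net(xs), ys, embed_net(xq), n_way)
    # 2) Meta-loss: generalization error on the held-out query set.
    loss = F.cross_entropy(logits, yq)
    # 3) Backpropagate through the base learner into the embedding network.
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```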

5.- Linear predictors (SVM, logistic/ridge regression): Proposed for computing the per-task classification model; a convex optimizer is incorporated into the deep network to solve them (ridge sketch below).
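Ridge regression is the easiest of these base learners to sketch, since its solution is closed-form and autograd can differentiate the solve directly; `lam=50.0` is an illustrative value, not the paper's setting:

```python
import torch
import torch.nn.functional as F

def ridge_head(support_emb, support_labels, query_emb, n_way, lam=50.0):
    """Primal ridge regression: W = (X^T X + lam*I)^{-1} X^T Y with one-hot
    targets Y; queries are scored by X_q W. Fully differentiable."""
    X = support_emb                                # (N*K, D)
    Y = F.one_hot(support_labels, n_way).float()   # (N*K, n_way)
    A = X.T @ X + lam * torch.eye(X.shape[1])      # (D, D), positive definite
    W = torch.linalg.solve(A, X.T @ Y)             # differentiable linear solve
    return query_emb @ W                           # (Q, n_way) logits
```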

6.- Advantages of SVM over nearest neighbor: Adaptive (the classifier adapts to each task at inference time) and scalable (overfits less as the embedding dimension grows, outperforming nearest neighbor in high dimensions).

7.- Gradient computation for SVM: A closed-form gradient expression for the embedding network is obtained without differentiating through the optimizer's iterations, by exploiting the convexity of the problem and differentiating its optimality conditions implicitly.
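Schematically, this is the implicit function theorem applied to the KKT conditions of the convex problem; a generic statement of the idea rather than the paper's exact notation:

```latex
% The solver's output alpha* satisfies the KKT conditions g(alpha*, theta) = 0,
% where theta denotes the embedding parameters. Differentiating implicitly:
\[
  g(\alpha^*, \theta) = 0
  \;\Longrightarrow\;
  \frac{\partial \alpha^*}{\partial \theta}
  = -\left(\frac{\partial g}{\partial \alpha}\right)^{-1}
     \frac{\partial g}{\partial \theta},
\]
% so the meta-gradient requires one linear solve at the optimum instead of
% backpropagation through every iteration of the QP solver.
```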

8.- Dual formulation: Addresses the computational cost of large embedding dimensions by solving the dual problem, which expresses the model as a linear combination of the training embeddings.
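For the ridge case the dual is exact via the push-through identity (X^T X + lam*I)^{-1} X^T = X^T (X X^T + lam*I)^{-1}, so the linear solve shrinks from D x D to (N*K) x (N*K); a sketch mirroring `ridge_head` above:

```python
import torch
import torch.nn.functional as F

def ridge_head_dual(support_emb, support_labels, query_emb, n_way, lam=50.0):
    """Dual form: W = X^T (X X^T + lam*I)^{-1} Y, i.e. the model is a linear
    combination of support embeddings. The Gram matrix is only (N*K, N*K),
    cheap when the support set is small and the embedding dimension is large."""
    X = support_emb                                # (N*K, D)
    Y = F.one_hot(support_labels, n_way).float()   # (N*K, n_way)
    G = X @ X.T + lam * torch.eye(X.shape[0])      # Gram matrix + ridge term
    alpha = torch.linalg.solve(G, Y)               # dual coefficients
    return query_emb @ (X.T @ alpha)               # same logits as the primal
```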

9.- Results on miniImageNet, tieredImageNet: MetaOptNet improves prototypical network accuracy by ~3% at a 30-50% increase in inference time. The ridge regression variant performs comparably.

10.- Results on CIFAR-FS, FC100: Similar performance on CIFAR-FS; ~3% improvement over prototypical networks on the harder FC100 dataset, which has a larger gap between train and test classes.

11.- Influence of meta-training shot: Model performance generally increases with more meta-training shots, enabling one-time high-shot training for all meta-test scenarios.

Knowledge Vault built by David Vivancos 2024