Concept Graph & Summary using Claude 3 Opus | ChatGPT-4o | Llama 3:
Summary:
1.- Vision tasks are related, not independent (e.g. depth estimation, surface normals, object detection, room layout)
2.- Quantifying task relationships enables treating tasks in concert rather than in isolation, so their redundancies can be exploited
3.- Reducing the need for labeled data is desirable; this motivates research on self-supervised learning, unsupervised learning, meta-learning, domain adaptation, ImageNet features, and fine-tuning
4.- Task relationships enable transfer learning - using model developed for one task to help solve another related task
5.- Intuitive example: surface normal estimation benefits more from transfer learning from image reshading task than from segmentation task
6.- Quantifying task relationships at scale allows forming complete graph to understand redundancies between tasks
7.- This enables solving a set of tasks in concert while minimizing supervision by leveraging redundancies (e.g., all tasks solved via transfers from only 3 source tasks)
8.- Also enables solving desired novel task without much labeled data by inserting it into the task relationship structure
9.- Taskonomy: fully computational method to quantify task relationships at scale and extract unified transfer learning structure
10.- Defined set of 26 diverse vision tasks (semantic, 3D, 2D) as sample task dictionary
11.- Collected dataset of 4M real indoor images with ground truth for all 26 tasks
12.- Trained a task-specific network for each of the 26 tasks, then froze its weights
13.- Quantify task relationships by using the frozen encoder of one task's network to train a small readout network to solve another task (see the readout sketch after this list)
14.- Readout network performance on test set determines strength of directed task transfer relationship
15.- Computed 26x25 first-order transfer functions to get a complete directed graph of task relationships
16.- Normalize the adjacency matrix of the graph using the analytic hierarchy process (AHP) to account for tasks' different output spaces and numerical properties (see the normalization sketch after this list)
17.- Extract an optimal subgraph from the normalized complete graph to maximize collective task performance while minimizing the number of sources used (see the subgraph-selection sketch after this list)
18.- Subgraph selection also handles transferring to novel tasks not in original dictionary
19.- Higher-order transfers (multiple sources transferring to one target) are also included in the framework (see the higher-order transfer sketch after this list)
20.- Experimental results: 26 tasks, 26 task-specific networks, ~3000 transfer functions, 47,000 GPU hours, transfer training used 8-100x less data
21.- Sample computed taxonomy shows intuitive connections (3D tasks connected, semantic tasks connected) and enables solving some tasks with limited labeled data
22.- Gain metric measures the value gained by transfer learning; quality metric measures how close transfer results come to the task-specific networks (see the metrics sketch after this list)
23.- Live web API to compute taxonomies with custom arguments and compare to ImageNet features baseline
24.- Additional experiments: significance tests, generalization tests, sensitivity analyses, comparisons to self-supervised/unsupervised baselines
25.- Taskonomy is a step towards understanding space of vision tasks and treating tasks as structured space vs isolated concepts
26.- Provides fully computational framework and unified transfer learning model to move towards generalist perception model
27.- Taskonomy outperforms ImageNet feature transfer learning baselines
28.- Includes mechanism to handle novel tasks not in original task dictionary
29.- Can provide guidance for multi-task learning by gauging the similarity between tasks
30.- Optimized subgraph maximizes collective performance on all tasks while minimizing number of source tasks
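
Readout sketch (items 12-14): a minimal sketch, not the authors' code, of a single first-order transfer, assuming PyTorch; SourceEncoder, ReadoutDecoder, the layer sizes, and the data loaders are hypothetical placeholders. The pretrained source encoder is frozen and only the small readout head is trained on the target task; its test-set loss is the raw strength of the directed source-to-target transfer.

import torch
import torch.nn as nn

class SourceEncoder(nn.Module):
    # Stand-in for a task-specific encoder pretrained on the source task (item 12).
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
    def forward(self, x):
        return self.net(x)

class ReadoutDecoder(nn.Module):
    # Small readout head trained to solve the target task from frozen features (item 13).
    def __init__(self, out_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, out_channels, 4, stride=4))
    def forward(self, feats):
        return self.net(feats)

def train_transfer(encoder, readout, train_loader, loss_fn, epochs=1):
    # Freeze the source encoder; only the readout parameters are updated.
    for p in encoder.parameters():
        p.requires_grad = False
    encoder.eval()
    opt = torch.optim.Adam(readout.parameters(), lr=1e-4)
    for _ in range(epochs):
        for images, targets in train_loader:
            with torch.no_grad():
                feats = encoder(images)
            loss = loss_fn(readout(feats), targets)
            opt.zero_grad()
            loss.backward()
            opt.step()

def evaluate_transfer(encoder, readout, test_loader, loss_fn):
    # Average test-set loss of the readout is the raw transfer strength (item 14).
    readout.eval()
    total, batches = 0.0, 0
    with torch.no_grad():
        for images, targets in test_loader:
            total += loss_fn(readout(encoder(images)), targets).item()
            batches += 1
    return total / max(batches, 1)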
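
Higher-order transfer sketch (item 19): under the same hypothetical shapes as the readout sketch, features from several frozen source encoders are concatenated channel-wise and fed to one readout head; HigherOrderReadout and its channel counts are assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class HigherOrderReadout(nn.Module):
    # One readout head consuming concatenated features from several frozen sources.
    def __init__(self, num_sources, feat_channels=128, out_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_sources * feat_channels, 64, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, out_channels, 4, stride=4))
    def forward(self, feature_list):
        # feature_list: per-source frozen features with matching spatial size.
        return self.net(torch.cat(feature_list, dim=1))

def higher_order_features(encoders, images):
    # Run every frozen source encoder without tracking gradients.
    with torch.no_grad():
        return [enc(images) for enc in encoders]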
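
Normalization sketch (items 15-16): a simplified version of the ordinal, analytic-hierarchy-process-style normalization, assuming per_image_losses is a hypothetical (num_sources x num_test_images) array of readout losses for one target task. Sources are compared pairwise by how often one beats the other on held-out images, and the principal eigenvector of the resulting ratio matrix gives scale-free affinities; repeating this per target fills the normalized adjacency matrix of the transfer graph.

import numpy as np

def affinities_for_target(per_image_losses, eps=1e-6):
    num_sources = per_image_losses.shape[0]
    ratio = np.ones((num_sources, num_sources))
    for i in range(num_sources):
        for j in range(num_sources):
            if i == j:
                continue
            # Fraction of test images where source i's readout beats source j's.
            wins_i = np.mean(per_image_losses[i] < per_image_losses[j])
            ratio[i, j] = (wins_i + eps) / (1.0 - wins_i + eps)
    # Principal eigenvector of the pairwise ratio matrix (AHP-style weighting).
    vals, vecs = np.linalg.eig(ratio)
    w = np.abs(np.real(vecs[:, np.argmax(np.real(vals))]))
    return w / w.sum()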
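
Subgraph-selection sketch (items 17 and 30): the paper formulates this step as a Boolean Integer Program; here a brute-force search over small source subsets stands in for that solver. affinity is assumed to be a (num_sources x num_targets) matrix of normalized transfer affinities, and each target is served by its best source inside the chosen subset.

import itertools
import numpy as np

def select_sources(affinity, max_sources):
    num_sources, _ = affinity.shape
    best_subset, best_score = None, -np.inf
    for k in range(1, max_sources + 1):
        for subset in itertools.combinations(range(num_sources), k):
            # Collective performance: each target uses its best source in the subset.
            score = affinity[list(subset), :].max(axis=0).sum()
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score

With a budget of 3 sources this mirrors the example in item 7 of solving all tasks via transfers from only a handful of source tasks.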
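
Metrics sketch (item 22): one plausible reading of the gain and quality metrics as win rates over test images, assuming hypothetical per-image loss arrays for the transfer network, a baseline trained from scratch on the same limited data, and the fully supervised task-specific network.

import numpy as np

def win_rate(losses_a, losses_b):
    # Fraction of test images on which model A beats model B (lower loss wins).
    return float(np.mean(np.asarray(losses_a) < np.asarray(losses_b)))

def gain(transfer_losses, scratch_losses):
    # Value added by transfer learning relative to training from scratch.
    return win_rate(transfer_losses, scratch_losses)

def quality(transfer_losses, task_specific_losses):
    # How close the transfer comes to the fully supervised task-specific network.
    return win_rate(transfer_losses, task_specific_losses)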
Knowledge Vault built by David Vivancos 2024