Concept Graph & Resume using Claude 3.5 Sonnet | Chat GPT4o | Llama 3:
Resume:
1.- V-usable information: Framework for measuring dataset difficulty based on how much information a model family V can extract about labels from inputs.
2.- Pointwise V-information (PVI): Measure of difficulty for individual instances within a dataset, based on the V-usable information framework; higher PVI means an easier instance (formulas follow the list).
3.- Dataset comparisons: V-usable information allows comparing the difficulty of different datasets with respect to the same model family.
4.- Model comparisons: V-usable information allows comparing how much information different model families can extract from the same dataset.
5.- Input transformations: Technique of applying transformations (e.g., hypothesis-only or shuffled-token inputs) to isolate input attributes and measure their information content about labels; see the PVI sketch after this list.
6.- Dataset slicing: Analyzing average PVI across different slices/subsets of data to understand difficulty patterns; a slicing sketch follows the list.
7.- Token-level artefacts: Identifying individual tokens that contribute most to model predictions by measuring the change in PVI when each token is removed; see the leave-one-out sketch after this list.
8.- Annotation artefacts: Using V-information framework to uncover spurious correlations and biases in datasets that models exploit.
9.- Cross-model consistency: PVI estimates tend to be highly correlated across different model architectures, especially for higher V-information datasets.
10.- Stability across training: PVI estimates remain relatively stable across training epochs and random initializations.
11.- Correlation with human difficulty: Examples humans find easier (higher annotator agreement) tend to have higher PVI.
12.- Mislabeled examples: Instances with very low or negative PVI are often mislabeled.
13.- SNLI dataset analysis: Revealed that token identity alone provides most of the usable information, and that hypothesis-only baselines extract substantial information.
14.- CoLA dataset analysis: Showed less usable information overall compared to SNLI, with certain word classes indicative of ungrammaticality.
15.- Hate speech detection bias: Analysis of DWMW17 dataset revealed potential racial bias in labeling of offensive language.
16.- Information isolation: Technique to measure information content of specific attributes beyond what is captured by other variables.
17.- Dataset cartography comparison: PVI shows correlation with confidence measure from dataset cartography, offering complementary dataset analysis.
18.- Easy-to-learn instances: Correspond to highest average PVI, indicating most usable information for model.
19.- Hard-to-learn instances: Correspond to lowest average PVI, often indicative of mislabeled examples.
20.- Ambiguous instances: Show intermediate PVI values, containing some but not maximum usable information.
21.- Training data sufficiency: Plateauing of V-information estimate with increasing training data indicates sufficient data for estimation.
22.- Model capacity: Larger models tend to extract more V-usable information from datasets.
23.- Overfit detection: V-information more sensitive to overfitting than held-out accuracy.
24.- Dataset difficulty threshold: Models start making incorrect predictions below a similar PVI threshold across datasets.
25.- Interpretability advantage: V-information framework offers more principled, interpretable difficulty estimates compared to standard performance metrics.
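For reference, the quantities behind items 1 and 2 can be written out. This is a sketch of the standard definitions, where f ranges over the model family V, \varnothing is a null (empty) input, g is a model finetuned on (input, label) pairs, and g' is finetuned on (null input, label) pairs:

H_V(Y) = \inf_{f \in V} \mathbb{E}\left[-\log_2 f[\varnothing](Y)\right]
H_V(Y \mid X) = \inf_{f \in V} \mathbb{E}\left[-\log_2 f[X](Y)\right]
I_V(X \to Y) = H_V(Y) - H_V(Y \mid X)
\mathrm{PVI}(x \to y) = -\log_2 g'[\varnothing](y) + \log_2 g[x](y)

In practice, I_V(X \to Y) is estimated as the mean PVI over a held-out set.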
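A minimal sketch of how PVI (item 2) might be computed, assuming two already-finetuned classifiers from the same family V are exposed as functions returning log2-probabilities over labels: model_xy, finetuned on (input, label) pairs, and model_null, finetuned on (null input, label) pairs. The names pvi, v_information, model_xy, and model_null are illustrative stand-ins, not the paper's released code.

from typing import Callable, Iterable, Sequence, Tuple

# A classifier is modeled as a function from an input string to log2-probabilities over labels.
LogProbFn = Callable[[str], Sequence[float]]

def pvi(x: str, y: int, model_xy: LogProbFn, model_null: LogProbFn, null_input: str = "") -> float:
    """Pointwise V-information of instance (x, y): log2 g[x](y) - log2 g'[null](y).

    Higher PVI = easier instance; very low or negative PVI often flags mislabels (item 12).
    """
    return model_xy(x)[y] - model_null(null_input)[y]

def v_information(data: Iterable[Tuple[str, int]], model_xy: LogProbFn, model_null: LogProbFn) -> float:
    """Estimate I_V(X -> Y) as the mean PVI over a held-out set."""
    scores = [pvi(x, y, model_xy, model_null) for x, y in data]
    return sum(scores) / len(scores)

# Isolating an attribute (item 5): pass transformed inputs and a model finetuned on them, e.g.
#   v_information([(hypothesis_only(x), y) for x, y in data], model_tau, model_null)
# where hypothesis_only and model_tau are hypothetical stand-ins for the transformation
# and for a classifier finetuned on (transformed input, label) pairs.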
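Item 6 then reduces to averaging the per-instance scores within each slice. A small sketch, assuming PVI values have already been computed and each record carries a slice name (e.g., label class or genre); pvi_by_slice is an illustrative name:

from collections import defaultdict
from typing import Dict, Iterable, Tuple

def pvi_by_slice(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Mean PVI per slice; consistently low-PVI slices point at harder or noisier regions."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for slice_name, value in records:
        totals[slice_name] += value
        counts[slice_name] += 1
    return {name: totals[name] / counts[name] for name in totals}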
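For item 7, a leave-one-out sketch that reuses the pvi helper and LogProbFn type above: the drop in PVI when a token is removed is taken as that token's contribution, which is one way dataset-level annotation artefacts (item 8) can be surfaced. The token_artefacts name and the whitespace re-joining of tokens are assumptions for illustration.

from typing import List, Sequence, Tuple

def token_artefacts(tokens: Sequence[str], y: int, model_xy: LogProbFn, model_null: LogProbFn) -> List[Tuple[str, float]]:
    """Drop in PVI when each token is removed, sorted largest first.

    Tokens whose removal drops PVI the most carry the most usable information about the
    label and are candidates for spurious correlations / annotation artefacts.
    """
    full = pvi(" ".join(tokens), y, model_xy, model_null)
    deltas: List[Tuple[str, float]] = []
    for i, tok in enumerate(tokens):
        reduced = list(tokens[:i]) + list(tokens[i + 1:])
        deltas.append((tok, full - pvi(" ".join(reduced), y, model_xy, model_null)))
    return sorted(deltas, key=lambda pair: pair[1], reverse=True)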
Knowledge Vault built by David Vivancos 2024