Concept Graph & Resume using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:
Resume:
1.- Introduction to natural language understanding (NLU): Defining NLU, relating it to the Turing test and AI/intelligence.
2.- IBM Watson and real-world NLU: Watson winning at Jeopardy in 2011, ensemble of techniques. NLU creeping into daily life.
3.- Potential techniques/representations for NLU: Word/phrase vectors, dependency trees, frames, logical forms. Interconnections will be explored.
4.- NLP & machine learning - transfer of ideas: HMMs, CRFs, LDA developed for NLP problems. Sequences/LSTMs driven by machine translation.
5.- Goals of tutorial: Provide NL intuitions, describe state-of-the-art, understand challenges & opportunities. Conceptual, not algorithmic.
6.- Syntax, semantics, pragmatics: Levels of linguistic analysis. Analogies with programming languages.
7.- Syntax - constituency & dependency: Dependency parse trees, parts of speech, relations. Capturing meaning requires more than just syntax. (Code sketch after the list.)
8.- Semantics - lexical & compositional: Figuring out word meanings and composing them. Word sense ambiguity.
9.- Lexical semantics - synonymy: Semantically similar words. A distance metric more than equivalence classes. Other relations like hyponymy, meronymy.
10.- Compositional semantics - model theory & compositionality: Sentences refer to the world. Meaning of the whole composed from parts. Quantifiers, modals, beliefs. (Code sketch after the list.)
11.- Pragmatics - conversational implicature, presupposition: What the speaker is trying to convey based on context. Background assumptions.
12.- Language processing challenges - vagueness, ambiguity, uncertainty: Definitions and examples of each. Uncertainty due to lack of knowledge.
13.- Coreference and anaphora: Resolving pronouns like "it" depending on context. The Winograd Schema Challenge.
14.- Recap - syntax, semantics, pragmatics: Definitions, key ideas, and challenges in each.
15.- Distributional semantics motivation: Context reveals a lot about word meaning. Historical linguistic basis.
16.- General recipe for distributional semantics: Form a word-context matrix of counts, then apply dimensionality reduction. The choice of context and of reduction method varies across models. (Code sketch after the list.)
17.- Latent Semantic Analysis (LSA): Early distributional semantics method using documents as context and SVD for dimensionality reduction. (Code sketch after the list.)
18.- Part-of-speech induction: Unsupervised learning of parts of speech using surrounding word contexts and SVD. (Code sketch after the list.)
19.- SGNS/Word2vec model: Predicts word-context pairs with logistic regression, implicitly factorizing a shifted PMI matrix. Produces dense word embeddings. (Code sketch after the list.)
20.- Probing word vectors: Captures notions of synonymy, co-hyponymy, even antonymy. Implications for downstream tasks such as sentiment analysis versus instruction following.
21.- Word vector analogies: Vector differences capture relations like gender and plurality. Explanation via context differences. (Code sketch after the list.)
22.- Other distributional models: HMMs, LDA, neural nets, recursive neural nets for phrase embeddings.
23.- Hearst patterns: Lexico-syntactic patterns like "X such as Y" that reveal hypernymy. Mining them from large corpora. (Code sketch after the list.)
24.- Summary of distributional semantics: Context as lossless semantics, general recipe, captures nuanced usage information, useful for downstream tasks.
25.- Context selection considerations: No purely unsupervised learning, context is part of model. Global vs local, patterns, non-textual context.
26.- Limitations of current distributional models: Highly parameterized, compression may not be the right view, intention not fully captured.
27.- Examples to ponder: Sentences with same/different meanings, how distributional methods handle or fail on them.
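Code sketch for item 7 (dependency parsing). The tutorial is conceptual and names no tools; spaCy and its small English model are assumed here purely to illustrate dependency relations, parts of speech, and heads.

```python
# Minimal dependency-parse sketch (spaCy is an assumed tool choice).
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The dog chased the cat into the garden.")

# Each token carries a part of speech, a dependency relation, and a head.
for token in doc:
    print(f"{token.text:<8} {token.pos_:<6} {token.dep_:<10} head={token.head.text}")
```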
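Code sketch for item 10 (model theory and compositionality). A toy "world" is assumed: predicates denote sets of entities, and quantified sentences are evaluated against that model, so the meaning of the whole is computed from the meanings of its parts.

```python
# Toy model-theoretic evaluation over a hypothetical mini-world.
entities = {"alice", "bob", "carol"}
student = {"alice", "bob"}     # denotation of "student"
sleeps = {"alice", "carol"}    # denotation of "sleeps"

# "Every student sleeps": universal quantification over the model.
every_student_sleeps = all(x in sleeps for x in student)

# "Some student sleeps": existential quantification.
some_student_sleeps = any(x in sleeps for x in student)

print(every_student_sleeps, some_student_sleeps)  # False True
```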
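Code sketch for item 16 (the general recipe). The tiny corpus, window size, and output dimension are illustrative assumptions; the recipe itself is just co-occurrence counts followed by a truncated SVD.

```python
# General recipe: word-context co-occurrence counts, then truncated SVD.
import numpy as np

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a dog chased a cat".split(),
]
window = 2

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

# Count each word's neighbours within the window.
for sent in corpus:
    for i, w in enumerate(sent):
        lo, hi = max(0, i - window), min(len(sent), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[idx[w], idx[sent[j]]] += 1

# Dimensionality reduction: keep the top-k singular directions as embeddings.
U, S, _ = np.linalg.svd(counts, full_matrices=False)
k = 3
embeddings = U[:, :k] * S[:k]
print(embeddings.shape)  # (vocabulary size, k)
```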
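Code sketch for item 17 (LSA). scikit-learn is an assumed tool; LSA here is just a document-term count matrix followed by a truncated SVD.

```python
# LSA sketch: documents as context, SVD for dimensionality reduction.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "cats chase mice",
    "dogs chase cats",
    "stocks rose sharply today",
    "markets and stocks fell",
]

X = CountVectorizer().fit_transform(docs)   # document-term count matrix
lsa = TruncatedSVD(n_components=2).fit(X)   # 2 latent dimensions
doc_vectors = lsa.transform(X)              # documents in the latent space
word_vectors = lsa.components_.T            # terms in the latent space
print(doc_vectors.shape, word_vectors.shape)
```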
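Code sketch for item 18 (POS induction). Representing each word by its immediate left/right neighbours, reducing with SVD, and clustering is one simple unsupervised recipe; the corpus and the number of clusters are assumptions, not the tutorial's method.

```python
# POS induction sketch: neighbour-count vectors -> SVD -> k-means clusters.
import numpy as np
from sklearn.cluster import KMeans

corpus = [
    "the cat sat on the mat".split(),
    "a dog slept on a rug".split(),
    "the dog chased a cat".split(),
]
vocab = sorted({w for s in corpus for w in s})
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Context features: identity of the immediate left and right neighbour.
M = np.zeros((V, 2 * V))
for s in corpus:
    for i, w in enumerate(s):
        if i > 0:
            M[idx[w], idx[s[i - 1]]] += 1        # left-neighbour block
        if i < len(s) - 1:
            M[idx[w], V + idx[s[i + 1]]] += 1    # right-neighbour block

U, S, _ = np.linalg.svd(M, full_matrices=False)
reduced = U[:, :3] * S[:3]
labels = KMeans(n_clusters=3, n_init=10).fit_predict(reduced)
for cluster, word in sorted(zip(labels, vocab)):
    print(cluster, word)
```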
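Code sketch for item 19 (SGNS). gensim (version 4 or later) is an assumed library choice; sg=1 selects skip-gram and negative=5 turns on negative sampling, which together give the SGNS objective. The toy corpus and hyperparameters are illustrative only.

```python
# SGNS sketch with gensim >= 4 (assumed tool; the tutorial only describes the model).
from gensim.models import Word2Vec

sentences = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a dog chased a cat".split(),
]

model = Word2Vec(
    sentences,
    vector_size=50,   # embedding dimension
    window=2,         # context window size
    sg=1,             # skip-gram rather than CBOW
    negative=5,       # negative samples per positive (word, context) pair
    min_count=1,
    epochs=50,
)
print(model.wv["cat"].shape)           # (50,)
print(model.wv.most_similar("cat", topn=3))
```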
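Code sketch for item 21 (analogies). Pretrained GloVe vectors loaded through gensim's downloader are an assumption (any pretrained embeddings would do); the first call downloads the vectors, roughly tens of MB.

```python
# Analogy sketch: vector arithmetic king - man + woman ~ queen.
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-50")   # pretrained 50-d GloVe vectors
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```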
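Code sketch for item 23 (Hearst patterns). A single regex over plain text is assumed; real systems use many patterns, parsing, and large corpora.

```python
# Hearst-pattern sketch: "X such as Y" proposes (hyponym, hypernym) pairs.
import re

text = (
    "He studies languages such as French and Spanish. "
    "Instruments such as the violin are hard to master."
)

pattern = re.compile(r"(\w+) such as (?:the )?(\w+)")
for hypernym, hyponym in pattern.findall(text):
    print(f"{hyponym} IS-A {hypernym}")
# -> French IS-A languages, violin IS-A Instruments
```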
Knowledge Vault built by David Vivancos 2024