Concept Graph & Summary using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:
Summary:
1.- LLMs are primarily System 1 reactive rather than System 2 deliberative processors
2.- Humans can switch between System 1 and 2; LLMs cannot
3.- Claims about LLM reasoning abilities often rest on human validation that gets misattributed to the model
4.- LLMs are impressive as System 1 processors but lack deliberative capabilities
5.- Simple block-stacking problems reveal LLMs' inability to reason logically
6.- LLMs always attempt solutions, even for unsolvable problems
7.- LLMs are essentially n-gram models performing approximate retrieval, hallucinating constantly
8.- Re-prompting until success means the reasoning is being done by the human, not the LLM
9.- LLMs excel at style but struggle with factual correctness
10.- Systematic evaluation shows poor performance on basic planning tasks
11.- Obfuscated versions of problems severely degrade LLM performance
12.- Chain-of-thought prompting doesn't scale beyond the training examples
13.- Last-letter concatenation experiments show limited advice-taking capability (see the task sketch after this list)
14.- ReAct prompting faces the same limitations as chain of thought
15.- LLMs cannot effectively self-critique their solutions
16.- External verifiers perform better than LLM self-criticism
17.- A problem's computational complexity doesn't affect LLM response time, as it would for genuine deliberation
18.- Fine-tuning shows poor generalization beyond training problem size
19.- In- vs. out-of-distribution metrics aren't useful for LLMs, whose training data is unknown
20.- LLMs lack deductive closure (they don't reliably derive the logical consequences of facts they state)
21.- Planning requires understanding action interactions, which LLMs struggle with
22.- Removing preconditions improves LLM performance but eliminates the need for planning
23.- LLMs excel at providing approximate domain knowledge
24.- Modular LLM frameworks combine LLM generation with external verification (see the generate-and-verify sketch after this list)
25.- Human intervention should be minimized in LLM systems
26.- Back-prompting can improve accuracy within a reasonable number of iterations
27.- LLMs can effectively critique style, but not correctness
28.- AlphaProof and similar systems rely on external verifiers
29.- LLMs are better as generators than reasoners
30.- LLMs remain valuable tools when used appropriately within their limitations
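Task sketch for items 12-13: the last-letter concatenation task is trivially specified in code, which is why LLM failures on longer inputs, even when the procedure is given as advice, suggest pattern retrieval rather than procedure execution. The function below is a minimal, assumed reproduction of the task's ground truth, not code from the talk.

```python
# Ground truth for the last-letter concatenation task: take each word's final
# letter and join them. Chain-of-thought and "advice" prompts solve short
# instances but degrade as the word list grows past the demonstrated sizes.

def last_letter_concatenation(words: list[str]) -> str:
    """Concatenate the last letter of each word."""
    return "".join(word[-1] for word in words)

if __name__ == "__main__":
    print(last_letter_concatenation(["Claude", "Sonnet"]))                     # -> "et"
    print(last_letter_concatenation(["large", "language", "models", "plan"]))  # -> "eesn"
```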
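Generate-and-verify sketch for items 24-28: an LLM proposes candidate solutions, a sound external verifier checks them, and failed checks are fed back as back-prompts instead of relying on self-critique. The names below (`generate_and_verify`, `generate_plan`, `verify_plan`) are placeholders for illustration, not APIs from the talk; the verifier could be any external checker, such as a plan validator.

```python
# Generate-and-verify loop: the LLM is used only as a candidate generator;
# correctness comes from the external verifier, and its critiques drive the
# back-prompting (items 16, 26, 29).

from typing import Callable, Optional, Tuple

def generate_and_verify(
    problem: str,
    generate_plan: Callable[[str], str],                  # LLM call: prompt -> candidate plan
    verify_plan: Callable[[str, str], Tuple[bool, str]],  # (problem, plan) -> (ok, critique)
    max_iterations: int = 5,
) -> Optional[str]:
    """Back-prompt the LLM with verifier critiques until a candidate passes or we give up."""
    prompt = problem
    for _ in range(max_iterations):
        candidate = generate_plan(prompt)
        ok, critique = verify_plan(problem, candidate)
        if ok:
            return candidate  # externally verified, not self-certified
        # Back-prompting: append the verifier's critique rather than asking the LLM to self-critique.
        prompt = f"{problem}\n\nPrevious attempt failed verification:\n{critique}\nRevise the plan."
    return None  # no verified plan within the iteration budget
```

The division of labor mirrors item 29: the LLM contributes broad generation and approximate domain knowledge, while soundness rests entirely on the verifier.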
Knowledge Vault built by David Vivancos 2024