Concept Graph & Summary using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:
Summary:
1.- Imperfect information games: Games where players have partial information about the current game state.
2.- Perfect recall: Players remember their own past moves and observations, so each player's information sets form a tree structure.
3.- ε-optimal strategies: Strategies that are within ε of the optimal strategy in terms of expected value.
4.- Regret minimization: Minimizing the difference between the cumulative gain achieved and that of the best fixed strategy in hindsight.
5.- Information sets: Sets of game states indistinguishable to a player.
6.- Realization plan: Encoding of a strategy as the probability, contributed by the player's own actions, of reaching each information set-action pair.
7.- Counterfactual regret: Regret defined on information sets rather than full game states.
8.- Implicit exploration (IX): Technique that biases importance-weighted loss estimates slightly downward to reduce their variance.
9.- Balanced policy: Sampling actions proportionally to the size of the associated subtrees.
10.- Follow the Regularized Leader (FTRL): Algorithm that plays the strategy minimizing cumulative (estimated) loss plus a regularization term.
11.- Tsallis entropy: Regularizer used in FTRL algorithms.
12.- Shannon entropy: Alternative regularizer used in FTRL algorithms.
13.- Dilated entropy: Entropy regularizer applied across the game tree.
14.- Balanced transitions: Transition kernel defined to balance exploration across the game tree.
15.- Adaptive learning rates: Learning rates that adapt based on observed game structure.
16.- High probability bounds: Bounds that hold with high probability rather than just in expectation.
17.- Lower bounds: Theoretical lower limits on regret or sample complexity.
18.- Sample complexity: Number of game realizations needed to learn an ε-optimal strategy.
19.- Trajectory feedback: Learning from observed game trajectories rather than full game information.
20.- Balanced FTRL: Algorithm using balanced transitions to achieve optimal rates with known structure.
21.- Adaptive FTRL: Algorithm adapting to unknown structure while maintaining near-optimal rates.
22.- BIAS terms: Components of regret related to estimation bias.
23.- REG term: Component of regret related to regularization.
24.- VAR term: Component of regret related to variance of estimates.
25.- Azuma-Hoeffding inequality: Concentration inequality for bounded martingale difference sequences.
26.- Freedman's inequality: Concentration inequality for martingales with bounded conditional variance.
27.- Time complexity: Computational cost of algorithm updates.
28.- Kuhn poker: Simplified three-card poker game used as a benchmark.
29.- Leduc poker: More complex poker variant used as a benchmark.
30.- Liar's dice: Dice game used as a benchmark for imperfect-information algorithms.
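The regret-minimization idea of items 4 and 7 can be illustrated with regret matching on rock-paper-scissors against a fixed opponent; the opponent mix, random seed, and iteration count below are illustrative assumptions, not from the source:

```python
import random

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a, b):
    # Payoff for playing action a against action b: +1 win, -1 loss, 0 tie.
    if a == b:
        return 0
    return 1 if (a - b) % 3 == 1 else -1

def regret_matching(regrets):
    # Play each action proportionally to its positive cumulative regret.
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1.0 / ACTIONS] * ACTIONS

random.seed(0)
regrets = [0.0] * ACTIONS
strategy_sum = [0.0] * ACTIONS
opponent = [0.4, 0.3, 0.3]  # fixed opponent mix, chosen for illustration

for _ in range(20000):
    strategy = regret_matching(regrets)
    for i in range(ACTIONS):
        strategy_sum[i] += strategy[i]
    a = random.choices(range(ACTIONS), weights=strategy)[0]
    b = random.choices(range(ACTIONS), weights=opponent)[0]
    realized = payoff(a, b)
    for i in range(ACTIONS):
        regrets[i] += payoff(i, b) - realized

total = sum(strategy_sum)
avg = [s / total for s in strategy_sum]  # average strategy favors paper,
# the best response to this particular opponent mix
```

The average strategy (not the last iterate) is what converges; counterfactual regret minimization applies the same update per information set instead of per full game state.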
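FTRL with a Shannon-entropy regularizer (items 10 and 12) has a closed form as exponential weights; the loss sequence and learning rate below are made up for the sketch:

```python
import math

def ftrl_entropy(cum_loss, eta):
    # FTRL with a Shannon-entropy regularizer reduces to exponential
    # weights: p_i is proportional to exp(-eta * cumulative loss of i).
    w = [math.exp(-eta * l) for l in cum_loss]
    z = sum(w)
    return [x / z for x in w]

eta = 0.3                         # learning rate, illustrative
losses = [[0.9, 0.1, 0.5]] * 100  # toy loss sequence over 3 actions
cum = [0.0, 0.0, 0.0]
for loss in losses:
    p = ftrl_entropy(cum, eta)    # strategy chosen before seeing this loss
    cum = [c + l for c, l in zip(cum, loss)]

final = ftrl_entropy(cum, eta)    # concentrates on the lowest-loss action
```

Swapping in a Tsallis-entropy or dilated-entropy regularizer changes this closed form but not the follow-the-regularized-leader template.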
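The implicit-exploration estimator behind item 8 can be sketched directly; `gamma` and the probability values here are hypothetical:

```python
def ix_estimate(loss, chosen, probs, gamma):
    # IX loss estimator: the observed loss of the chosen action is divided
    # by p_i + gamma instead of p_i, shrinking the estimate (a small
    # downward bias) in exchange for bounded variance.
    return [loss * (i == chosen) / (probs[i] + gamma)
            for i in range(len(probs))]

# Expectation check: E[estimate_i] = p_i * loss_i / (p_i + gamma), which is
# at most loss_i, so the estimator never over-estimates on average.
est = ix_estimate(1.0, 0, [0.5, 0.5], 0.1)  # gamma = 0.1 is illustrative
```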
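The high-probability bounds of items 16 and 25 rest on concentration inequalities such as Azuma-Hoeffding; a minimal sketch of the resulting deviation bound, with illustrative parameters:

```python
import math

def azuma_bound(c, n, delta):
    # Azuma-Hoeffding: for a martingale with increments in [-c, c],
    # |S_n| <= c * sqrt(2 * n * log(2 / delta)) holds with probability
    # at least 1 - delta.
    return c * math.sqrt(2.0 * n * math.log(2.0 / delta))

bound = azuma_bound(1.0, 100, 0.05)  # deviation bound after 100 steps
```

Freedman's inequality tightens this when the conditional variance is small, which is what the VAR term in the regret decomposition exploits.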
Knowledge Vault built by David Vivancos 2024