Concept Graph & Summary using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:
Summary:
1.- Time series data is everywhere, from audio features to human motion to stock indices. Modeling high-dimensional time series poses many challenges.
2.- Key concepts reviewed: multivariate Gaussians, hidden Markov models (HMMs), vector autoregressive (VAR) processes, state-space models.
3.- HMMs assume an underlying discrete state sequence that is Markov. Observations are conditionally independent given the state, which allows efficient inference (see the forward-recursion sketch after this list).
4.- VAR(p) process: the observation vector is a linear combination of its p previous values plus noise. Stable iff all companion-matrix eigenvalues have modulus < 1 (see the stability-check sketch after this list).
5.- State-space model: continuous latent Markov state with linear Gaussian dynamics. Observations are conditionally independent given the state (see the state-space simulation sketch after this list).
6.- Latent factor models for IID data: covariance decomposes as low-rank plus diagonal, Σ = ΛΛᵀ + Ψ. Assumes uncertainty lies in a lower-dimensional subspace (see the factor-covariance sketch after this list).
7.- Dynamic latent factor models extend this to time series: Markov latent factors project to high-dimensional observations. A subclass of state-space models (see the state-space simulation sketch after this list).
8.- Evolving the factor loadings over time captures time-varying covariance structure. Factor structure on the loadings enables scalability (see the time-varying loadings sketch after this list).
9.- MEG experiment: classifying brain responses to word categories. Time-varying embeddings outperform the alternative models, likely because they capture semantic processing.
10.- Modeling sparsely observed house prices by clustering correlated neighborhoods. Combines state-space models, Dirichlet processes for unknown number of clusters.
11.- Clustering on factor model innovations allows correlated but different latent price series. Improves predictions, especially for sparse series.
12.- Bayesian nonparametric methods like Dirichlet processes allow model complexity to adapt to the data: the number of clusters grows with the number of observations (see the CRP sketch after this list).
13.- Industry housing indices very noisy at local level due to sparsity. Proposed method enables census-tract-level indices by sharing information.
14.- Gaussian graphical models capture conditional independence via sparsity in the inverse covariance (precision) matrix, more flexible than marginal independence (see the graphical lasso sketch after this list).
15.- Enforcing identifiability in latent variable models avoids exploring equivalent parameterizations. Trade-off with computational complexity.
16.- Common spatial patterns mentioned as an alternative dimensionality-reduction technique for brain data, but the tutorial focuses on general time series methods.
17.- Break taken at 3:47 to ensure the remaining material could be covered.
18.- Tutorial covers three main parts: capturing relationships in high-dimensional time series, scalable modeling, and computationally efficient inference.
19.- Gaussian processes provide a flexible prior over latent factor evolution. A squared exponential kernel is used, though it may not capture expected brain dynamics (see the kernel sketch after this list).
20.- Classification performance assessed by holding out a subset of words and testing category prediction. Time-varying model outperforms others and chance.
21.- Correlation maps at different time points reveal emergence of differential structure during semantic processing window, aiding classification.
22.- Changing how observations are embedded over time lets a low-dimensional representation efficiently capture an evolving high-dimensional covariance.
23.- Clustering of time series allows sharing information when individual series are sparse, e.g. in housing price data.
24.- Seattle housing data analyzed with proposed method. Downtown identified as most volatile. Largest improvement in sparse census tracts.
25.- Case-Shiller housing index very noisy at zip code level, can't be computed at census tract level. Proposed method enables this.
26.- Bayesian approach taken, placing priors on all parameters. Gaussian processes used for latent factor and factor loading dynamics.
27.- Speaker used Microsoft Surface for presentation, causing some technical difficulties with video playback.
28.- Zillow, a Seattle-based housing company, provided data and motivation for the local house price index application.
29.- Various modeling extensions possible, e.g. alternative spatial kernels for brain data, enforcing identifiability, non-stationary latent dynamics.
30.- Q&A covered topics like dictionary choice for factor loadings, stationarity assumptions, identifiability, and alternative methods like common spatial patterns.
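The sketches below are illustrative only: they are not code from the tutorial, and all parameter values, names, and library choices are assumptions made for the examples. First, the forward recursion behind efficient HMM inference (point 3), for a discrete-observation HMM:

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Log p(obs) via the forward recursion: pi is the initial state
    distribution, A the state transition matrix, B the (states x symbols)
    emission matrix, obs a sequence of symbol indices."""
    alpha = pi * B[:, obs[0]]              # p(state, first observation)
    log_prob = 0.0
    for t in range(1, len(obs)):
        c = alpha.sum()                    # rescale to avoid underflow
        log_prob += np.log(c)
        alpha = (alpha / c) @ A * B[:, obs[t]]
    return log_prob + np.log(alpha.sum())

pi = np.array([0.6, 0.4])                  # made-up 2-state example
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.3], [0.1, 0.9]])
print(forward_log_likelihood(pi, A, B, obs=[0, 1, 1, 0]))
```

The cost is O(T K²) instead of the O(K^T) of enumerating all state sequences.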
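A check of the VAR(p) stability condition from point 4: stack the lag matrices into companion form and verify that every eigenvalue has modulus below 1. The coefficient values are arbitrary.

```python
import numpy as np

def is_stable(lag_matrices):
    """lag_matrices: [A1, ..., Ap], each (d x d), for
    x_t = A1 x_{t-1} + ... + Ap x_{t-p} + noise."""
    p = len(lag_matrices)
    d = lag_matrices[0].shape[0]
    F = np.zeros((d * p, d * p))           # companion matrix
    F[:d, :] = np.hstack(lag_matrices)     # top block row [A1 ... Ap]
    F[d:, :-d] = np.eye(d * (p - 1))       # shifted identities below
    return bool(np.all(np.abs(np.linalg.eigvals(F)) < 1.0))

A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.2]])
print(is_stable([A1, A2]))                 # True for these values
```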
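A toy simulation of the linear Gaussian state-space / dynamic latent factor setup from points 5 and 7: a low-dimensional latent factor evolves as a stable VAR(1) and is projected to high-dimensional observations. Dimensions and noise scales are assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
T, k, d = 200, 2, 30                  # time steps, latent dim, observed dim

A = 0.95 * np.eye(k)                  # stable latent dynamics
Lam = rng.normal(size=(d, k))         # factor loadings (emission matrix)
sigma_w, sigma_v = 0.1, 0.5           # process / observation noise scales

x = np.zeros((T, k))                  # latent factor trajectory
y = np.zeros((T, d))                  # high-dimensional observations
for t in range(1, T):
    x[t] = A @ x[t - 1] + sigma_w * rng.normal(size=k)
    y[t] = Lam @ x[t] + sigma_v * rng.normal(size=d)

# The signal part of the d x d observation covariance has rank k, not d:
print(np.linalg.matrix_rank(Lam @ Lam.T))
```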
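The low-rank-plus-diagonal covariance of the IID factor model in point 6, Σ = ΛΛᵀ + Ψ, constructed directly; note it needs only d·k + d parameters instead of the d(d+1)/2 of a full covariance.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 50, 3
Lam = rng.normal(size=(d, k))            # loadings span a k-dim subspace
Psi = np.diag(rng.uniform(0.1, 0.5, d))  # diagonal idiosyncratic noise
Sigma = Lam @ Lam.T + Psi                # full d x d covariance

print(Sigma.shape, np.linalg.matrix_rank(Lam @ Lam.T))   # (50, 50) 3
```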
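Points 8 and 22 rest on loadings that drift over time. A random-walk drift is assumed here purely for illustration (per point 26 the tutorial places Gaussian process priors on the loadings); even this simple drift makes the implied covariance Λ_tΛ_tᵀ + σ²I time-varying:

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, sigma, T = 20, 2, 0.3, 100
Lam0 = rng.normal(size=(d, k))               # loadings at t = 0
LamT = Lam0.copy()
for t in range(T):
    LamT += 0.05 * rng.normal(size=(d, k))   # random-walk drift

cov0 = Lam0 @ Lam0.T + sigma**2 * np.eye(d)
covT = LamT @ LamT.T + sigma**2 * np.eye(d)
print(np.linalg.norm(covT - cov0))           # nonzero: covariance has evolved
```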
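The "clusters grow with observations" behavior of the Dirichlet process (point 12), via its Chinese restaurant process representation; the concentration alpha = 1.0 is an arbitrary choice.

```python
import numpy as np

def crp_cluster_sizes(n, alpha, rng):
    counts = []                              # current cluster sizes
    for i in range(n):
        probs = np.array(counts + [alpha]) / (i + alpha)
        j = rng.choice(len(probs), p=probs)  # join existing or new cluster
        if j == len(counts):
            counts.append(1)                 # open a new cluster
        else:
            counts[j] += 1
    return counts

rng = np.random.default_rng(3)
for n in (100, 1000, 10000):                 # count grows roughly as alpha*log(n)
    print(n, len(crp_cluster_sizes(n, alpha=1.0, rng=rng)))
```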
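Sparse inverse covariance estimation for the Gaussian graphical models of point 14, using scikit-learn's graphical lasso on synthetic data (the regularization alpha = 0.1 is an assumption). Zeros in the estimated precision matrix encode conditional independencies.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 5))
X[:, 1] += 0.8 * X[:, 0]                 # plant one strong dependency

model = GraphicalLasso(alpha=0.1).fit(X)
print(np.round(model.precision_, 2))     # off-diagonal zeros = cond. indep.
```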
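Finally, the squared exponential kernel from point 19, used to draw one smooth GP prior sample over a latent factor trajectory; the variance, length-scale, and jitter values are illustrative.

```python
import numpy as np

def sq_exp_kernel(t, variance=1.0, length_scale=10.0):
    """k(t_i, t_j) = variance * exp(-(t_i - t_j)^2 / (2 * length_scale^2))"""
    diff = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

t = np.arange(100.0)
K = sq_exp_kernel(t) + 1e-8 * np.eye(len(t))   # jitter for numerical stability
rng = np.random.default_rng(5)
sample = rng.multivariate_normal(np.zeros(len(t)), K)
print(sample[:5])                              # one smooth draw from the GP prior
```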
Knowledge Vault built by David Vivancos 2024