Bayesian Time Series Modeling: Structured Representations for Scalability

Emily Fox

**Concept Graph & Resume using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:**

```mermaid
graph LR
  classDef main fill:#f9d4d4, font-weight:bold, font-size:14px
  classDef concepts fill:#d4f9d4, font-weight:bold, font-size:14px
  classDef models fill:#d4d4f9, font-weight:bold, font-size:14px
  classDef applications fill:#f9f9d4, font-weight:bold, font-size:14px
  classDef methods fill:#f9d4f9, font-weight:bold, font-size:14px
  classDef misc fill:#d4f9f9, font-weight:bold, font-size:14px
  Main["Bayesian Time Series Modeling:<br>Structured Representations for Scalability"]
  Main --> A[Time Series Concepts]
  A --> A1[Time series data: ubiquitous, high-dimensional challenges 1]
  A --> A2[Key concepts: Gaussians, HMMs, VAR, state-space 2]
  A --> A3[HMMs: discrete Markov states, efficient inference 3]
  A --> A4[VARp: linear combination of lags, noise 4]
  A --> A5[State-space: continuous Markov state, linear Gaussian 5]
  Main --> B[Latent Factor Models]
  B --> B1[Latent factor models: low-rank covariance decomposition 6]
  B --> B2[Dynamic factors: Markov latent, high-dimensional projections 7]
  B --> B3[Evolving loadings: time-varying covariance, scalable factors 8]
  Main --> C[Applications]
  C --> C1[MEG experiment: time-varying embeddings outperform others 9]
  C --> C2[House prices: state-space, Dirichlet process clustering 10]
  C --> C3[Clustering innovations: correlated prices, improved predictions 11]
  C --> C4[Local indices: proposed method enables tract-level 13]
  C --> C5[Seattle analysis: downtown volatile, sparse improvement 24]
  Main --> D[Methods and Approaches]
  D --> D1[Bayesian nonparametrics: adaptive complexity, growing clusters 12]
  D --> D2[Gaussian graphical models: conditional independence, inverse 14]
  D --> D3[Identifiability: avoids equivalent parameterizations, computational trade-off 15]
  D --> D4[Common spatial patterns: alternative dimensionality reduction 16]
  D --> D5[Gaussian processes: flexible prior, squared exponential 19]
  Main --> E[Results and Analysis]
  E --> E1[Classification: held-out words, category prediction 20]
  E --> E2[Correlation maps: reveal semantic processing structure 21]
  E --> E3[Changing embeddings: efficient low-dimensional evolving covariance 22]
  E --> E4[Time series clustering: sharing sparse information 23]
  E --> E5[Case-Shiller index: proposed method enables tract-level 25]
  Main --> F[Miscellaneous]
  F --> F1[Break ensures covering remaining material 17]
  F --> F2[Tutorial parts: relationships, scalability, efficient inference 18]
  F --> F3[Bayesian approach: priors, Gaussian process dynamics 26]
  F --> F4[Technical difficulties: Microsoft Surface video playback 27]
  F --> F5[Zillow provided data, local index motivation 28]
  F --> F6[Extensions: spatial kernels, identifiability, non-stationary dynamics 29]
  F --> F7["Q&A: dictionary, stationarity, identifiability, alternative methods 30"]
  class Main main
  class A,A1,A2,A3,A4,A5 concepts
  class B,B1,B2,B3 models
  class C,C1,C2,C3,C4,C5 applications
  class D,D1,D2,D3,D4,D5,E,E1,E2,E3,E4,E5 methods
  class F,F1,F2,F3,F4,F5,F6,F7 misc
```

**Resume:**

**1.-** Time series data is everywhere, from audio features to human motion to stock indices. Modeling high-dimensional time series poses many challenges.

**2.-** Key concepts reviewed: multivariate Gaussians, hidden Markov models (HMMs), vector autoregressive (VAR) processes, state-space models.

**3.-** HMMs assume an underlying discrete state sequence that is Markov. Observations are conditionally independent given the state, and this structure allows efficient inference.
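
As a minimal sketch of where that efficiency comes from, the scaled forward recursion below computes the log-likelihood of an observation sequence in O(TK²) time; the transition, emission, and initial distributions are illustrative, not taken from the tutorial.

```python
import numpy as np

def forward_loglik(pi, A, B, obs):
    """Scaled forward algorithm: log p(x_1:T) for a discrete-emission HMM.
    pi: (K,) initial distribution; A: (K,K) transitions; B: (K,M) emissions."""
    alpha = pi * B[:, obs[0]]           # alpha_1(k) = p(z_1 = k, x_1)
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()                # rescale to avoid numerical underflow
    for x in obs[1:]:
        alpha = (alpha @ A) * B[:, x]   # O(K^2) recursion per time step
        c = alpha.sum()
        loglik += np.log(c)
        alpha /= c
    return loglik

# Illustrative 2-state, 2-symbol HMM
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.3], [0.1, 0.9]])
print(forward_loglik(pi, A, B, [0, 0, 1, 1]))
```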

**4.-** VAR(p) process: each observation vector is a linear combination of its p most recent lags plus noise. Stable if all eigenvalues of the companion matrix have modulus less than 1.
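
To make the stability condition concrete, here is a small sketch that stacks hypothetical VAR(2) coefficient matrices into companion form and checks that every eigenvalue lies inside the unit circle:

```python
import numpy as np

def companion(As):
    """Companion matrix of a VAR(p) with coefficient matrices As = [A1, ..., Ap]."""
    d, p = As[0].shape[0], len(As)
    C = np.zeros((d * p, d * p))
    C[:d, :] = np.hstack(As)             # top block row: [A1 A2 ... Ap]
    C[d:, :-d] = np.eye(d * (p - 1))     # shifted identity blocks below
    return C

# Hypothetical 2-dimensional VAR(2)
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.3]])
moduli = np.abs(np.linalg.eigvals(companion([A1, A2])))
print("stable:", bool(np.all(moduli < 1)))
```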

**5.-** State-space model: continuous latent Markov state with linear Gaussian dynamics. Observations are conditionally independent given the state.
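
The classic inference routine for this model is the Kalman filter; below is a single predict/update step, with all model matrices as placeholders:

```python
import numpy as np

def kalman_step(m, P, y, A, C, Q, R):
    """One Kalman filter step for x_t = A x_{t-1} + w_t, y_t = C x_t + v_t,
    with w_t ~ N(0, Q), v_t ~ N(0, R); (m, P) is the previous filtered mean/cov."""
    m_pred, P_pred = A @ m, A @ P @ A.T + Q   # predict the latent state forward
    S = C @ P_pred @ C.T + R                  # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)       # Kalman gain
    m_new = m_pred + K @ (y - C @ m_pred)     # condition on the observation y
    P_new = P_pred - K @ S @ K.T
    return m_new, P_new
```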

**6.-** Latent factor models for IID data: the covariance has a low-rank plus diagonal decomposition, assuming the uncertainty lies in a lower-dimensional subspace.
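
In symbols, x = Λη + ε gives cov(x) = ΛΛᵀ + diag(ψ); a small numeric sketch with an arbitrary loading matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 50, 3                              # observation dim d >> number of factors k
Lambda = rng.normal(size=(d, k))          # factor loadings (arbitrary illustration)
psi = rng.uniform(0.5, 1.5, size=d)       # idiosyncratic noise variances

Sigma = Lambda @ Lambda.T + np.diag(psi)  # low-rank + diagonal covariance
# Generative view: x = Lambda @ eta + eps, eta ~ N(0, I_k), eps ~ N(0, diag(psi))
x = Lambda @ rng.normal(size=k) + rng.normal(size=d) * np.sqrt(psi)
```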

**7.-** Dynamic latent factor models extend this idea to time series: Markov latent factors are projected to high-dimensional observations. They form a subclass of state-space models.

**8.-** Evolving the factor loadings over time allows capturing time-varying covariance structure. Imposing factor structure on the loadings themselves enables scalability.
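
A toy sketch of the idea: if the loadings Λ_t drift over time, the implied covariance Σ_t = Λ_tΛ_tᵀ + diag(ψ) evolves with them. The tutorial places Gaussian process priors on these dynamics (see item 26); a random walk is used here only as the simplest stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, T = 20, 2, 100
Lambda = rng.normal(size=(d, k))          # initial loadings
psi = np.full(d, 0.5)

covariances = []
for t in range(T):
    Lambda = Lambda + 0.05 * rng.normal(size=(d, k))      # slow drift (stand-in dynamics)
    covariances.append(Lambda @ Lambda.T + np.diag(psi))  # time-varying Sigma_t
```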

**9.-** MEG experiment: classifying brain responses to word categories. Time-varying embeddings outperform the alternatives, likely because they capture semantic processing.

**10.-** Modeling sparsely observed house prices by clustering correlated neighborhoods. Combines state-space models with Dirichlet processes to handle an unknown number of clusters.

**11.-** Clustering on the factor model innovations allows the latent price series to be correlated yet distinct. This improves predictions, especially for sparse series.

**12.-** Bayesian nonparametric methods like Dirichlet processes allow model complexity to adapt to the data: the number of clusters grows with the number of observations.
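
A minimal Chinese restaurant process draw (the predictive form of the Dirichlet process) illustrates how the cluster count grows with sample size; the concentration parameter alpha = 1.0 is arbitrary:

```python
import numpy as np

def crp(n, alpha, rng):
    """Sample cluster assignments from a Chinese restaurant process."""
    counts, z = [], []
    for i in range(n):
        probs = np.array(counts + [alpha]) / (i + alpha)  # existing tables + new table
        c = rng.choice(len(probs), p=probs)
        if c == len(counts):
            counts.append(1)          # open a new cluster
        else:
            counts[c] += 1
        z.append(c)
    return z

rng = np.random.default_rng(0)
for n in (10, 100, 1000):
    print(n, "observations ->", max(crp(n, 1.0, rng)) + 1, "clusters")
```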

**13.-** Industry housing indices are very noisy at the local level due to data sparsity. The proposed method enables census-tract-level indices by sharing information across tracts.

**14.-** Gaussian graphical models capture conditional independence via sparsity in the inverse covariance matrix. This is more flexible than assuming marginal independence.
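
A three-variable chain illustrates the distinction: a zero in the precision matrix Ω = Σ⁻¹ encodes conditional independence, even though the corresponding marginal covariance entry is nonzero.

```python
import numpy as np

# Chain X1 - X2 - X3: Omega[0, 2] = 0 means X1 is independent of X3 given X2.
Omega = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])
Sigma = np.linalg.inv(Omega)
print(Sigma[0, 2])   # 0.25: X1 and X3 are still marginally dependent
```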

**15.-** Enforcing identifiability in latent variable models avoids exploring equivalent parameterizations, at the cost of greater computational complexity.

**16.-** Common spatial patterns were mentioned as an alternative dimensionality-reduction approach for brain data, but the tutorial focuses on general time series.

**17.-** Break taken at 3:47 to ensure coverage of the remaining material.

**18.-** Tutorial covers three main parts: capturing relationships in high-dimensional time series, scalable modeling, and computationally efficient inference.

**19.-** Gaussian processes provide a flexible prior over the latent factor evolution. A squared exponential kernel is used, though it may not capture the expected brain dynamics.
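
A sketch of drawing one latent trajectory from a GP prior under the squared exponential kernel; the length-scale and variance are arbitrary:

```python
import numpy as np

def se_kernel(t, lengthscale=1.0, variance=1.0):
    """Squared exponential (RBF) kernel matrix over time points t."""
    diff = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

t = np.linspace(0, 10, 200)
K = se_kernel(t, lengthscale=1.5) + 1e-8 * np.eye(len(t))  # jitter for stability
rng = np.random.default_rng(0)
sample = rng.multivariate_normal(np.zeros(len(t)), K)      # one smooth trajectory
```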

**20.-** Classification performance was assessed by holding out a subset of words and testing category prediction. The time-varying model outperforms the alternatives and chance.

**21.-** Correlation maps at different time points reveal the emergence of differential structure during the semantic processing window, aiding classification.

**22.-** Changing how observations are embedded over time allows a low-dimensional representation to efficiently capture evolving high-dimensional covariance.

**23.-** Clustering of time series allows sharing information when individual series are sparse, e.g. in housing price data.

**24.-** Seattle housing data analyzed with proposed method. Downtown identified as most volatile. Largest improvement in sparse census tracts.

**25.-** The Case-Shiller housing index is very noisy at the zip-code level and cannot be computed at the census-tract level; the proposed method enables this.

**26.-** Bayesian approach taken, placing priors on all parameters. Gaussian processes used for latent factor and factor loading dynamics.

**27.-** Speaker used Microsoft Surface for presentation, causing some technical difficulties with video playback.

**28.-** Zillow, a Seattle-based housing company, provided data and motivation for the local house price index application.

**29.-** Various modeling extensions possible, e.g. alternative spatial kernels for brain data, enforcing identifiability, non-stationary latent dynamics.

**30.-** Q&A covered topics like dictionary choice for factor loadings, stationarity assumptions, identifiability, and alternative methods like common spatial patterns.

Knowledge Vault built by David Vivancos 2024