Online Time Series Prediction with Missing Data

Oren Anava, Elad Hazan, Assaf Zeevi

**Concept Graph & Summary using Claude 3.5 Sonnet | ChatGPT-4o | Llama 3:**

```mermaid
graph LR
classDef main fill:#f9d4d4, font-weight:bold, font-size:14px
classDef timeseries fill:#d4f9d4, font-weight:bold, font-size:14px
classDef model fill:#d4d4f9, font-weight:bold, font-size:14px
classDef approach fill:#f9f9d4, font-weight:bold, font-size:14px
classDef results fill:#f9d4f9, font-weight:bold, font-size:14px
Main[Online Time Series Prediction with Missing Data]
Main --> A[Time Series and Models]
A --> A1[Time series: uniform interval signal observations 1]
A --> A2[AR model: noisy linear combination of previous observations 2]
A --> A3[Missing data common in real-life series 3]
A --> A4[Existing models: strict statistical assumptions 4]
Main --> B[Approach and Challenges]
B --> B1[Goal: relax assumptions on missing data 5]
B --> B2[Setting: adversary chooses signal, reveals selectively 6]
B --> B3[Regret: player vs best fixed predictor 7]
B --> B4[Challenge: AR undefined for missing data 8]
Main --> C[Solution and Implementation]
C --> C1[Solution: replace missing with recursive predictions 9]
C --> C2[AR1 example: observe or predict recursively 10]
C --> C3[Non-linearity: recursive prediction complicates coefficient learning 11]
C --> C4[Non-proper learning: richer model, compete against AR 12]
Main --> D[Prediction and Optimization]
D --> D1[Expanded prediction: weight vector, feature vector 13]
D --> D2[Advantage: enables online convex optimization 14]
D --> D3[Result: OsqrtT regret bound achieved 15]
D --> D4[Issues: exponential factors in D 16]
Main --> E[Computational Aspects]
E --> E1[Partial solution: efficient inner product computation 17]
E --> E2[Inner product: common good paths coefficients 18]
E --> E3[Efficient counting: common missing points exponent 19]
Main --> F[Results and Conclusions]
F --> F1[Experiments: synthetic data, model comparisons 20]
F --> F2[Stochastic: similar performance, online more robust 21]
F --> F3[Heteroscedastic: online approach shows robustness 22]
F --> F4[Conclusions: prediction without stochastic assumptions 23]
F --> F5[Open questions: reducing exponential, multivariate generalization 24]
F --> F6[Key aspects: online learning, recursive prediction 25]
class Main main
class A,A1,A2,A3,A4 timeseries
class B,B1,B2,B3,B4 model
class C,C1,C2,C3,C4,D,D1,D2,D3,D4,E,E1,E2,E3 approach
class F,F1,F2,F3,F4,F5,F6 results
```


**Summary:**

**1.-** Time series definition: sequence of signal observations measured at uniform intervals, with examples from finance, weather, and medicine.

**2.-** Autoregressive (AR) model: each observation is a noisy linear combination of previous observations, with Gaussian noise. The focus is on the AR(p) model.
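
A minimal simulation of the AR(p) model described above; the coefficients `alpha` and the noise scale are illustrative choices, not values from the talk.

```python
import numpy as np

def simulate_ar(alpha, T, noise_std=0.1, seed=0):
    """Simulate an AR(p) series: x_t = sum_i alpha[i] * x_{t-1-i} + Gaussian noise."""
    rng = np.random.default_rng(seed)
    p = len(alpha)
    x = np.zeros(T + p)  # zero-padded warm-up values
    for t in range(p, T + p):
        # x[t-p:t][::-1] lists the p most recent values, newest first
        x[t] = np.dot(alpha, x[t - p:t][::-1]) + rng.normal(0.0, noise_std)
    return x[p:]

series = simulate_ar(alpha=[0.6, 0.3], T=500)
```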

**3.-** Motivation: real-life time series often have missing data due to mistakes or equipment issues. Existing models make strict statistical assumptions.

**4.-** Limitations of existing approaches: assume data follows AR model with Gaussian noise and missing data is random. Misspecification leads to suboptimal results.

**5.-** Goal: relax or remove statistical assumptions on time series with missing data.

**6.-** Setting: adversary chooses signal value arbitrarily at each time point and decides whether to reveal it. Player predicts and suffers loss.

**7.-** Regret criterion: compares player's cumulative loss to loss of best fixed predictor in hindsight. Standard in online learning.

**8.-** Challenge: AR prediction not well-defined for missing data since previous missing values are needed for prediction.

**9.-** Solution: replace missing data with predictions of missing data, using recursive AR predictor up to finite number of steps back (D).

**10.-** Example: for AR(1) with D=2, the prediction uses the observed value when it is available; otherwise it recursively substitutes the prediction of the previous value.
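
The recursive substitution in points 9-10 can be sketched for the AR(1) case; the zero fallback once the recursion budget D is exhausted is an illustrative choice, not the talk's exact convention.

```python
def recursive_ar1_predict(a, x, observed, t, depth):
    """Predict x[t] under an AR(1) coefficient `a`, recursing through
    missing values for at most `depth` steps back (the D of the talk)."""
    if depth == 0 or t == 0:
        return 0.0  # fallback once the recursion budget is spent
    if observed[t - 1]:
        return a * x[t - 1]
    # previous value missing: substitute its own recursive prediction
    return a * recursive_ar1_predict(a, x, observed, t - 1, depth - 1)
```

With `a = 0.5`, `x[0] = 1.0` observed and `x[1]` missing, the prediction for `x[2]` with depth 2 is `a * (a * x[0]) = 0.25`.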

**11.-** Non-linearity issue: recursive prediction is non-linear in AR coefficients, preventing learning of optimal coefficients using standard online convex optimization techniques.

**12.-** Non-proper learning: generate predictions from a richer model and compete against best AR predictor, without directly learning AR coefficients.

**13.-** Expanded prediction: use weight vector W of dimension 2^D and feature vector phi, giving a much richer linear-in-W prediction.

**14.-** Advantage of expanded prediction: problem becomes linear in W, enabling online convex optimization. Prediction is richer than original AR.
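
To illustrate the lifting in points 13-14 in the simplest case: for AR(1), the recursive prediction is `a^k * x[t-k]` where k is the gap back to the last observed value, so giving each power `a^k` its own independent weight yields a prediction that is linear in the weight vector. This sketch is only the AR(1) special case (D coordinates); in general the feature vector is exponential in D, as point 16 notes.

```python
def expanded_features(x, observed, t, D):
    """Feature vector phi for the AR(1) case: phi[k-1] holds x[t-k] when
    x[t-k] is observed and the k-1 values in between are all missing,
    so <w, phi> strictly generalizes the recursive prediction a^k * x[t-k]."""
    phi = [0.0] * D
    for k in range(1, D + 1):
        if t - k < 0:
            break
        gap_missing = all(not observed[t - j] for j in range(1, k))
        if gap_missing and observed[t - k]:
            phi[k - 1] = x[t - k]
    return phi
```

At most one coordinate of `phi` is nonzero (the most recent observed value within D steps), so the prediction `<w, phi>` recovers the recursive AR(1) predictor when `w = (a, a^2, ..., a^D)`.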

**15.-** Main result: regret bound of O(sqrt(T)) w.r.t. best fixed recursive AR predictor in hindsight, using standard online learning algorithms.
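
Once the prediction is linear in W, a standard online learner applies. Below is a generic online gradient descent step on the squared loss with projection onto an L2 ball; the learning-rate schedule and radius are illustrative, not the talk's exact tuning.

```python
import numpy as np

def ogd_step(w, phi, y, eta, radius=1.0):
    """One online gradient descent update on the squared loss
    (<w, phi> - y)^2, projected onto an L2 ball of the given radius.
    Step sizes eta_t proportional to 1/sqrt(t) yield O(sqrt(T)) regret."""
    grad = 2.0 * (np.dot(w, phi) - y) * phi
    w = w - eta * grad
    norm = np.linalg.norm(w)
    if norm > radius:
        w = w * (radius / norm)  # projection keeps w in the feasible set
    return w
```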

**16.-** Issues: constant factor exponential in D, and time/space complexity also exponential in D due to maintaining/updating high-dimensional W.

**17.-** Partial solution: efficient inner product computation between feature vectors phi enables avoiding explicit maintenance of W.

**18.-** Inner product computation: coefficients depend on the number of "common good paths", i.e., paths through the missing data points.

**19.-** Efficient path counting: number of common good paths of length k equals 2^(num. common missing points), enabling efficient computation.
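
A small sketch of the counting identity in point 19, using the stated formula; the representation of the two missingness patterns as boolean lists is an assumption for illustration.

```python
def common_good_paths(missing_a, missing_b, k):
    """Count 'common good paths' of length k via the identity from the talk:
    the count equals 2 ** (number of positions missing in both patterns),
    which avoids enumerating the exponentially many paths explicitly."""
    common = sum(1 for i in range(k) if missing_a[i] and missing_b[i])
    return 2 ** common
```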

**20.-** Experimental results: comparisons on synthetic data capture merits of different models. Real data experiments in paper.

**21.-** Stochastic data: all approaches perform similarly when data and missingness are stochastic. Online methods more robust to AR coefficient changes.

**22.-** Heteroscedastic noise: baselines not robust to non-Gaussian noise, but online approach is robust. Online algorithms faster and simpler to implement.

**23.-** Conclusions: new approach for time series prediction without requiring stochastic assumptions. Theoretical guarantees and good practical performance.

**24.-** Open questions: reducing exponential constant to polynomial in D (some progress in follow-up work), and generalizing to multivariate case.

**25.-** Key aspects: online learning framework, recursive prediction for missing data, non-proper learning, efficient path counting, robustness to model misspecification.

Knowledge Vault built by David Vivancos, 2024.