You can now use BigQuery ML's multivariate time series ARIMA_PLUS_XREG models for anomaly detection. This feature detects anomalies in both historical and new data that contain multiple feature columns, and it is generally available (GA). To try it, see the "Perform anomaly detection with a multivariate time-series forecasting model" tutorial.
ML.DETECT_ANOMALIES Function
The ML.DETECT_ANOMALIES function in BigQuery ML performs anomaly detection. For time series data, it uses ARIMA_PLUS and ARIMA_PLUS_XREG models. For independent and identically distributed (IID) data, it uses autoencoder, k-means, or principal component analysis (PCA) models.
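As a rough illustration of how such a call might be put together, the following Python sketch builds the SQL text for an ML.DETECT_ANOMALIES query. The helper name `build_detect_anomalies_sql` and the model and table names (`mydataset.my_xreg_model`, `mydataset.new_observations`) are hypothetical placeholders. The sketch only produces a query string; it does not connect to BigQuery, and the exact arguments should be checked against the function reference.

```python
from typing import Optional


def build_detect_anomalies_sql(
    model: str,
    anomaly_prob_threshold: float = 0.95,
    new_data_table: Optional[str] = None,
) -> str:
    """Return SQL text that calls ML.DETECT_ANOMALIES (hypothetical helper).

    With no table argument the query targets the historical (training) data;
    passing a table targets new data instead.
    """
    if not 0.0 <= anomaly_prob_threshold < 1.0:
        raise ValueError("anomaly_prob_threshold must be in [0, 1)")
    args = [
        f"MODEL `{model}`",
        f"STRUCT({anomaly_prob_threshold} AS anomaly_prob_threshold)",
    ]
    if new_data_table is not None:
        args.append(f"TABLE `{new_data_table}`")
    joined = ",\n  ".join(args)
    return f"SELECT *\nFROM ML.DETECT_ANOMALIES(\n  {joined}\n)"


# Placeholder names, not real resources.
historical_sql = build_detect_anomalies_sql("mydataset.my_xreg_model")
new_data_sql = build_detect_anomalies_sql(
    "mydataset.my_xreg_model", 0.9, "mydataset.new_observations"
)
print(historical_sql)
print(new_data_sql)
```

The generated text could then be pasted into the BigQuery console or submitted through a client library of your choice.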
CREATE MODEL Statement for ARIMA_PLUS_XREG Models
The CREATE MODEL statement creates multivariate time series models in BigQuery. Forecasting is performed when the model is created; you can then retrieve the forecast values and compute prediction intervals with the ML.FORECAST and ML.EXPLAIN_FORECAST functions.
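The following Python sketch shows one way the corresponding statements might look, again as generated SQL text only. The helper names, the dataset, table, and column names (`mydataset.training_data`, `date`, `pm25`, `temperature`, `wind_speed`), and the chosen horizon and confidence level are all assumptions made for illustration. The future-features table is assumed to hold future values of the feature columns.

```python
from typing import Sequence


def build_create_model_sql(
    model: str,
    training_table: str,
    timestamp_col: str,
    target_col: str,
    feature_cols: Sequence[str],
) -> str:
    """Return SQL text for a CREATE MODEL statement (hypothetical helper)."""
    select_cols = ", ".join([timestamp_col, target_col, *feature_cols])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        "OPTIONS(\n"
        "  model_type = 'ARIMA_PLUS_XREG',\n"
        f"  time_series_timestamp_col = '{timestamp_col}',\n"
        f"  time_series_data_col = '{target_col}'\n"
        ") AS\n"
        f"SELECT {select_cols}\n"
        f"FROM `{training_table}`"
    )


def build_forecast_sql(
    model: str,
    future_features_table: str,
    horizon: int = 30,
    confidence_level: float = 0.8,
) -> str:
    """Return SQL text that calls ML.FORECAST (hypothetical helper)."""
    return (
        "SELECT *\n"
        "FROM ML.FORECAST(\n"
        f"  MODEL `{model}`,\n"
        f"  STRUCT({horizon} AS horizon, {confidence_level} AS confidence_level),\n"
        f"  TABLE `{future_features_table}`\n"
        ")"
    )


# Placeholder names, not real resources.
create_sql = build_create_model_sql(
    "mydataset.my_xreg_model",
    "mydataset.training_data",
    "date",
    "pm25",
    ["temperature", "wind_speed"],
)
forecast_sql = build_forecast_sql("mydataset.my_xreg_model", "mydataset.future_features")
print(create_sql)
print(forecast_sql)
```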
Time Series Modeling Pipeline
The multivariate ARIMA_PLUS_XREG time series model extends ARIMA_PLUS with linear external regressors. The modeling pipeline for ARIMA_PLUS time series models performs the following steps:
- Infer data frequency
- Handle irregular time intervals
- Handle duplicated timestamps by averaging
- Interpolate missing data using local linear interpolation
- Detect and clean spike and dip outliers
- Detect and adjust abrupt step changes
- Detect and adjust holiday effects
- Detect multiple seasonal patterns using Seasonal and Trend decomposition using Loess (STL)
- Extrapolate seasonality using double exponential smoothing (ETS)
- Detect and model trends using the ARIMA model and auto.ARIMA algorithm for automatic hyperparameter tuning, selecting the best model based on the lowest Akaike information criterion (AIC).
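To make a few of the steps above concrete, here is a minimal Python sketch of three of them: averaging duplicated timestamps, filling gaps by linear interpolation between the nearest known neighbors, and choosing the candidate model with the lowest AIC. It is an illustration of the general techniques, not BigQuery ML's internal implementation. The integer timestamps, the sample values, and the candidate log-likelihoods are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def average_duplicates(points: Sequence[Tuple[int, float]]) -> Dict[int, float]:
    """Collapse rows that share a timestamp into their mean value."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for timestamp, value in points:
        buckets[timestamp].append(value)
    return {t: sum(vs) / len(vs) for t, vs in buckets.items()}


def interpolate_missing(series: Dict[int, float], step: int) -> List[Tuple[int, float]]:
    """Fill missing timestamps on a regular grid by linear interpolation."""
    times = sorted(series)
    if not times:
        return []
    filled: List[Tuple[int, float]] = []
    for left, right in zip(times, times[1:]):
        filled.append((left, series[left]))
        t = left + step
        while t < right:
            weight = (t - left) / (right - left)
            value = series[left] + weight * (series[right] - series[left])
            filled.append((t, value))
            t += step
    filled.append((times[-1], series[times[-1]]))
    return filled


def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood


# Invented sample data: timestamp 1 is duplicated, timestamp 2 is missing.
raw = [(0, 10.0), (1, 12.0), (1, 14.0), (3, 17.0)]
clean = interpolate_missing(average_duplicates(raw), step=1)
print(clean)

# Invented candidates: (ARIMA order, log-likelihood, number of parameters).
candidates = [((1, 1, 1), -120.0, 3), ((2, 1, 2), -118.5, 5), ((0, 1, 1), -125.0, 2)]
best_order = min(candidates, key=lambda c: aic(c[1], c[2]))[0]
print(best_order)
```

In this toy run the more complex (2, 1, 2) candidate fits slightly better, but its extra parameters are penalized, so the simpler candidate has the lower AIC.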