Google has announced the general availability (GA) of time series and range functions in BigQuery, enhancing its capabilities for time series analysis and data manipulation.
Time series analysis involves sequences of data points, each consisting of a time and an associated value. In relational databases, time series data is typically modeled as a table with a time column, optional partitioning columns (such as zip code), and one or more value columns, or a STRUCT
type combining multiple values.
1. Time Series Functions:
DATE_BUCKET
: Determines the lower bound of a date bucket.DATETIME_BUCKET
: Finds the lower bound of a datetime bucket.GAP_FILL
: Identifies and fills gaps in time series data.TIMESTAMP_BUCKET
: Locates the lower bound of a timestamp bucket.
2. Range Functions:
GENERATE_RANGE_ARRAY
: Splits a range into an array of subranges.RANGE
: Constructs date, datetime, or timestamp ranges.RANGE_CONTAINS
: Checks for range inclusion.RANGE_END
: Retrieves the upper bound of a range.RANGE_INTERSECT
: Finds intersecting segments of two ranges.RANGE_OVERLAPS
: Checks if two ranges overlap.RANGE_SESSIONIZE
: Produces sessionized ranges.RANGE_START
: Gets the lower bound of a range.
These new functions are designed to support complex time series analysis in relational databases, where time series data typically includes time columns, optional partitioning columns, and value columns or structs.
This update significantly expands BigQuery's analytical capabilities, allowing users to perform more sophisticated time-based and range-based operations on their data. It's particularly valuable for applications involving temporal data analysis, such as financial modeling, user behavior tracking, and sensor data processing.