Google has announced the general availability (GA) of JavaScript User-Defined Aggregate Functions (UDAF) in BigQuery, allowing users to create custom aggregate functions using JavaScript.
The JavaScript UDAF creation process requires several mandatory components:
Core Function Requirements
- JavaScript code must be provided as a quoted string literal
- Implementation of four essential JavaScript functions:
- initialState: Initializes the results
- aggregate: Aggregates the data
- merge: Merges results
- finalize: Finalizes the results
Technical Specifications
- Only specific type encodings are permitted
- All values returned by initialState or remaining in state arguments must be serializable
- Non-serializable aggregation data requires use of included serialize and deserialize functions
Persistent UDAF Features
The platform now supports persistent UDAFs with the following characteristics:
- Reusable across multiple queries
- Secure sharing between owners
- No data mutation capabilities
- No external system communication
- No logging to Google Cloud Observability
To create a persistent UDAF, users must:
- Use the CREATE AGGREGATE FUNCTION statement
- Omit TEMP or TEMPORARY keywords
- Include dataset in function path
These functions maintain data integrity and security while providing enhanced aggregation capabilities within BigQuery's environment.
CREATE OR REPLACE AGGREGATE FUNCTION my_project.my_dataset.SumPositive(x FLOAT64)
RETURNS FLOAT64
LANGUAGE js
AS r'''
export function initialState() {
return {sum: 0}
}
export function aggregate(state, x) {
if (x > 0) {
state.sum += x;
}
}
export function merge(state, partialState) {
state.sum += partialState.sum;
}
export function finalize(state) {
return state.sum;
}
''';