Google BigQuery has rolled out an update to its SQL functionality. Users can now use GROUP BY and SELECT DISTINCT operations with arrays and structs, offering more flexible data analysis capabilities.
Key Features of the Update:
- GROUP BY with Structs: Users can group by entire struct fields in a single operation.
- SELECT DISTINCT with Arrays and Structs: Allows for more nuanced data deduplication across complex data types.
- Improved Efficiency: These operations can reduce query complexity for certain types of analyses.
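To make the deduplication point concrete, here is a minimal sketch of SELECT DISTINCT over a struct column and an array column. The inline rows and the column names traffic and tags are made up for illustration and do not come from any real table.

```sql
-- Hypothetical inline data: two identical rows and one different row.
WITH events AS (
  SELECT
    STRUCT('google' AS source, 'cpc' AS medium) AS traffic,
    ['a', 'b'] AS tags
  UNION ALL
  SELECT STRUCT('google' AS source, 'cpc' AS medium), ['a', 'b']
  UNION ALL
  SELECT STRUCT('newsletter' AS source, 'email' AS medium), ['a']
)
-- DISTINCT compares the whole struct and the whole array,
-- so rows match only when every field and every element agree.
SELECT DISTINCT
  traffic,
  tags
FROM
  events;
```

The two identical rows should collapse into one, leaving one row per distinct combination of struct and array values.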
Practical Application:
Users can now group by all session traffic source fields by naming the struct once in the GROUP BY clause. This streamlines queries that previously had to list each nested field separately as its own grouping column.
Example Query:
SELECT
  session_traffic_source_last_click.manual_campaign,
  COUNTIF(event_name = "session_start") AS session_starts
FROM
  `project.dataset.events_table`
GROUP BY 1
In this example, we group by manual_campaign, a field of session_traffic_source_last_click that is itself a struct in the GA4 export schema, and count session starts. Because the whole struct is the grouping key, each distinct combination of its nested fields forms one group, and none of those fields has to be listed individually.
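For comparison, here is a sketch of how a similar grouping could be written by listing nested fields one at a time. It assumes manual_campaign contains source, medium, and campaign_name, as in the GA4 export schema; a struct with more fields would need every one of them spelled out to match the struct-level grouping.

```sql
-- Assumption: manual_campaign has the nested fields source, medium, and campaign_name.
-- The table name is the same placeholder used in the example query.
SELECT
  session_traffic_source_last_click.manual_campaign.source AS source,
  session_traffic_source_last_click.manual_campaign.medium AS medium,
  session_traffic_source_last_click.manual_campaign.campaign_name AS campaign_name,
  COUNTIF(event_name = "session_start") AS session_starts
FROM
  `project.dataset.events_table`
-- One ordinal per nested field, instead of a single ordinal for the whole struct.
GROUP BY 1, 2, 3
```

This version returns flat columns and only covers the three fields named, whereas grouping by the struct keeps the result nested and picks up every field in it automatically.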
This update is particularly useful for working with nested and repeated data structures, common in web analytics and event logging. It aligns with BigQuery's goal of providing powerful tools for large-scale data analysis.