Google has announced a preview of fine-grained DML (Data Manipulation Language) for BigQuery, aiming to enhance the performance of UPDATE, DELETE, and MERGE statements.
Key Features:
- Granular Approach: Optimizes mutations at a more detailed level than the previous file-group approach
- Reduced Data Rewrites: Aims to minimize the amount of data that needs to be rewritten during operations
- Lower Slot Consumption: Designed to reduce overall slot usage for DML operations
Availability:
- Available for projects assigned to reservations using annual or three-year committed slots
How to Enable:
- Enroll in the preview via the BigQuery fine-grained DML enrollment form
- Use
OPTIONS(enable_fine_grained_mutations = TRUE)
in CREATE TABLE or ALTER TABLE statements
Considerations:
- May incur additional storage costs for mutation metadata
- Processes deleted data offline, with different handling for projects with and without BACKGROUND assignments
- Some limitations apply, including restrictions on table snapshots and replication
Best Practices:
- Group DML operations when possible
- Consider table partitioning for updates on older data
- Use clustered tables for frequent updates on specific column value ranges
This preview aims to provide BigQuery users with more efficient data manipulation capabilities, potentially leading to improved performance for complex data operations.