BigQuery Introduces AI-Augmented Data Preparation with Gemini

October 25, 2024 at 5:58:58 AM

TL;DR: BigQuery data preparation uses Gemini to suggest ways to clean, transform, and enrich data, reducing manual effort. Dataform orchestrates the preparations and supports CI/CD processes. Users and Dataform service accounts need specific IAM roles. Data preparations are managed in BigQuery Studio, where Gemini provides context-aware suggestions. The editor offers data, graph, and schema views. Write modes: full refresh, append, and incremental. Supported steps: source, transformation, filter, validation, join, destination, and delete columns.

AI-augmented data preparation in BigQuery, powered by Gemini, offers intelligent suggestions for cleaning, transforming, and enriching data, significantly reducing manual effort. Dataform orchestrates these preparations, supporting CI/CD processes for collaboration.

Benefits

  • Time Reduction: Context-aware, Gemini-generated transformation suggestions.
  • Data Quality: Automated schema mapping and data quality cleanup.
  • Collaboration: CI/CD support for code reviews and source control.

Users and Dataform service accounts need specific IAM roles. Data preparations are managed in BigQuery Studio. Opening a table triggers a BigQuery job that samples data for Gemini to generate suggestions.

Views in the Data Preparation Editor

  • Data View: Displays a sample of the table, where you can interact with the data and apply Gemini suggestions.
  • Graph View: Visual overview of the data preparation pipeline.
  • Schema View: Displays and allows operations on the current schema.

Gemini offers context-aware suggestions for transformations, data quality rules, standardization, enrichment, and schema mapping. Each suggestion includes a high-level category, description, and corresponding SQL expression.
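The three parts of a suggestion (category, description, SQL expression) can be pictured as a small record. The sketch below is illustrative only: the `Suggestion` class, the `group_by_category` helper, and the sample values are hypothetical and are not part of any BigQuery or Gemini API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """Hypothetical container for one suggestion; not a BigQuery type."""
    category: str        # high-level category, e.g. "Transformation"
    description: str     # human-readable summary of the change
    sql_expression: str  # SQL expression that would implement it


def group_by_category(suggestions):
    """Group suggestions by category, keeping input order within each group."""
    grouped = defaultdict(list)
    for suggestion in suggestions:
        grouped[suggestion.category].append(suggestion)
    return dict(grouped)


# Made-up examples for illustration only.
suggestions = [
    Suggestion("Transformation", "Trim whitespace from name", "TRIM(name)"),
    Suggestion("Data quality", "Require a non-null email", "email IS NOT NULL"),
    Suggestion("Transformation", "Parse order_date as a date",
               "SAFE_CAST(order_date AS DATE)"),
]

grouped = group_by_category(suggestions)
for category, items in grouped.items():
    print(category, [s.sql_expression for s in items])
```

Grouping by category mirrors how a reviewer might scan suggestions by kind before deciding which SQL expressions to apply.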

BigQuery previews a data preparation on a sample of the data. Samples are not refreshed automatically. To reduce cost and processing time, change the write mode settings so that runs process only new data. Supported write modes are Full refresh, Append, and Incremental.
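As a rough mental model of how the three write modes differ, the sketch below applies each one to in-memory rows. It is a simplification under stated assumptions, not BigQuery's implementation: the `write` function is hypothetical, and "new data" is assumed to mean rows whose `id` exceeds the largest `id` already in the destination.

```python
def write(destination, source_rows, mode, key="id"):
    """Return the destination contents after one run in the given mode."""
    if mode == "full_refresh":
        # Replace everything in the destination with the source rows.
        return list(source_rows)
    if mode == "append":
        # Add every source row, even ones written by an earlier run.
        return destination + list(source_rows)
    if mode == "incremental":
        # Assumption: only rows beyond the current high-water mark are new.
        high_water = max((row[key] for row in destination), default=None)
        new_rows = [
            row for row in source_rows
            if high_water is None or row[key] > high_water
        ]
        return destination + new_rows
    raise ValueError(f"unknown write mode: {mode!r}")


destination = [{"id": 1}, {"id": 2}]
source = [{"id": 1}, {"id": 2}, {"id": 3}]

full = write(destination, source, "full_refresh")
appended = write(destination, source, "append")
incremental = write(destination, source, "incremental")
print(len(full), len(appended), len(incremental))  # 3 5 3
```

In this toy model the incremental run touches one row where the other modes touch three, which is the cost and time saving the write mode setting is meant to provide.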

Supported Data Preparation Steps

  • Source: Adds a source table or join step.
  • Transformation: Cleans and transforms data using SQL expressions.
  • Filter: Removes rows using WHERE clause syntax.
  • Validation: Sends rows meeting validation criteria to an error table.
  • Join: Joins values from two sources with various join operations.
  • Destination: Defines where the output of the data preparation steps is written.
  • Delete Columns: Removes columns from the schema.
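
To make the step types above concrete, here is a minimal plain-Python sketch that chains them over a list of dictionaries. Every function name and sample row is hypothetical, and real data preparations express these steps as SQL in BigQuery. The validation step follows the wording above: rows matching the given criteria are routed to an error list.

```python
def transform(rows, column, fn):
    """Transformation: rewrite one column with fn."""
    return [{**row, column: fn(row[column])} for row in rows]


def validate(rows, error_criteria):
    """Validation: route rows matching error_criteria to an error list."""
    kept = [row for row in rows if not error_criteria(row)]
    errors = [row for row in rows if error_criteria(row)]
    return kept, errors


def filter_rows(rows, predicate):
    """Filter: keep rows where predicate holds, like a WHERE clause."""
    return [row for row in rows if predicate(row)]


def inner_join(left, right, key):
    """Join: inner join two sources on a shared key."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def delete_columns(rows, columns):
    """Delete Columns: drop the named columns from every row."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


# Source step: two made-up source tables.
people = [
    {"id": 1, "name": " Ada ", "age": 36},
    {"id": 2, "name": "Bob", "age": -5},
    {"id": 3, "name": "Cy", "age": 17},
]
countries = [{"id": 1, "country": "UK"}, {"id": 2, "country": "US"}]

rows = transform(people, "name", str.strip)
rows, error_table = validate(rows, lambda r: r["age"] < 0)
rows = filter_rows(rows, lambda r: r["age"] >= 18)
rows = inner_join(rows, countries, "id")
destination = delete_columns(rows, {"age"})  # Destination step: final output

print(destination)
print(error_table)
```

Keeping each step a pure function from rows to rows makes the chain easy to reorder, which loosely corresponds to rearranging nodes in a pipeline graph.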

Schedule one-time or recurring data preparation runs from the data preparation editor or manage them from the BigQuery Orchestration page. BigQuery data preparation does not have its own API. Contact bq-datapreparation-feedback@google.com for more information.

Limitations

  • Source and destination datasets must be in the same location.
  • Data and interactions are processed in a US data center during pipeline editing.
  • No support for natural language SQL query generation or viewing/comparing/restoring data preparation versions.
  • Gemini responses are based on a sample of the dataset.

For more detailed steps and configurations, refer to the BigQuery documentation.
