Google has rolled out several significant enhancements to BigQuery, its enterprise data warehouse solution. These updates include new data preparation features, expanded machine learning capabilities, and performance improvements for materialized views.
BigQuery Data Preparation Now Generally Available
BigQuery data preparation has reached general availability (GA). The feature uses AI-powered suggestions from Gemini to assist with data cleansing, transformation, and enrichment.
The update also introduces visual data preparation pipelines, allowing users to build and manage their data workflows through an intuitive interface. Additionally, users can now implement pipeline scheduling with Dataform, enabling automated execution of data preparation tasks.
Remote Models Support for Llama and Mistral AI
Google has expanded BigQuery ML's capabilities by enabling users to create remote models based on Llama and Mistral AI models in Vertex AI. This feature, which is now generally available, enhances the platform's natural language processing capabilities.
Users can use the ML.GENERATE_TEXT function with these remote models to perform generative natural language tasks on text stored in BigQuery tables. Google encourages users to explore this functionality through the "Generate text by using the ML.GENERATE_TEXT function" tutorial.
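To give a rough idea of what such a call looks like, the sketch below assembles a GoogleSQL statement that passes a text column to ML.GENERATE_TEXT through a remote model. The helper name, the model and table identifiers, and the column name are all hypothetical placeholders, and the two options shown (max_output_tokens, temperature) are assumed to be accepted by the chosen model; check the function reference for the options your model supports. The source column is aliased to prompt on the assumption that the function reads its input from a column with that name.

```python
def build_generate_text_query(
    model: str,
    source_table: str,
    text_column: str,
    max_output_tokens: int = 256,
    temperature: float = 0.2,
) -> str:
    """Return a GoogleSQL string that calls ML.GENERATE_TEXT on one text column.

    All identifiers are supplied by the caller; nothing here contacts BigQuery.
    """
    return (
        "SELECT *\n"
        "FROM ML.GENERATE_TEXT(\n"
        f"  MODEL `{model}`,\n"
        # Alias the text column to `prompt` (assumed input column name).
        f"  (SELECT {text_column} AS prompt FROM `{source_table}`),\n"
        f"  STRUCT({max_output_tokens} AS max_output_tokens, "
        f"{temperature} AS temperature)\n"
        ")"
    )


# Hypothetical names, for illustration only.
query = build_generate_text_query(
    model="my_project.my_dataset.llama_remote",
    source_table="my_project.my_dataset.reviews",
    text_column="review_text",
)
print(query)
```

The resulting string could then be pasted into the BigQuery console or submitted through a client library. Building the statement as plain text keeps the example runnable without credentials.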
Smart-tuning for Materialized Views
The third major update involves smart-tuning support for materialized views. This feature is now generally available and works when materialized views are located in the same project as one of their base tables, or when they are in the project running the query.
With smart-tuning, BigQuery can automatically reroute a query against a base table to a materialized view when part of the query can be answered from that view, which can deliver faster results and lower resource usage for analytics workloads.
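The location rule described above can be written as a small predicate. This is only an illustration of the project condition as stated in this article, with hypothetical function and project names; it does not model any other requirement BigQuery may apply before using a materialized view.

```python
def meets_project_rule(
    view_project: str,
    base_table_projects: set[str],
    querying_project: str,
) -> bool:
    """Check the project condition for smart-tuning as described in the text.

    True when the materialized view lives in the same project as at least one
    of its base tables, or in the project that is running the query.
    """
    return view_project in base_table_projects or view_project == querying_project


# Hypothetical project IDs, for illustration only.
same_as_base = meets_project_rule("analytics", {"analytics", "raw"}, "reporting")
same_as_query = meets_project_rule("reporting", {"raw"}, "reporting")
neither = meets_project_rule("sandbox", {"raw"}, "reporting")
print(same_as_base, same_as_query, neither)
```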
These updates collectively strengthen BigQuery's position as a comprehensive analytics platform, offering advanced data preparation, machine learning integration, and performance optimization features for enterprise data needs.