Google has announced updates to BigQuery ML, enhancing its machine learning capabilities with new embedding support features and AI functionalities. These updates aim to provide users with more powerful tools for data analysis and processing across various modalities.
Expanded Embedding Support
BigQuery ML now offers the following new embedding support features:
- Multimodal Embeddings: Users can create embeddings for text, image, and video in the same semantic space using the ML.GENERATE_EMBEDDING function with Vertex AI's multimodal embedding large language models (LLMs).
- Structured Data Embeddings: The ML.GENERATE_EMBEDDING function now supports embeddings for structured independent and identically distributed (IID) data using principal component analysis (PCA) or autoencoder models.
- User/Item Embeddings: Matrix factorization models can be used with the ML.GENERATE_EMBEDDING function to create embeddings for user or item data.
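As a rough illustration of how the same ML.GENERATE_EMBEDDING call covers these model types, the sketch below composes the query text in Python. The dataset, model, and table names are hypothetical, and the optional `flatten_json_output` argument is shown only for the Vertex AI backed model; check the function reference for the options your model type accepts.

```python
from typing import Optional


def generate_embedding_sql(model: str, table: str,
                           flatten_json_output: Optional[bool] = None) -> str:
    """Build a SELECT over ML.GENERATE_EMBEDDING for a model and an input table."""
    args = [f"MODEL `{model}`", f"TABLE `{table}`"]
    # The settings STRUCT is only added when explicitly requested, since
    # PCA, autoencoder, and matrix factorization models are called without it here.
    if flatten_json_output is not None:
        flag = "TRUE" if flatten_json_output else "FALSE"
        args.append(f"STRUCT({flag} AS flatten_json_output)")
    joined = ",\n  ".join(args)
    return f"SELECT *\nFROM ML.GENERATE_EMBEDDING(\n  {joined}\n)"


# Hypothetical names: a remote multimodal embedding model over an object table
# of images, and a PCA model over a table of structured features.
multimodal_sql = generate_embedding_sql(
    "my_dataset.multimodal_embedding_model",
    "my_dataset.product_images",
    flatten_json_output=True,
)
pca_sql = generate_embedding_sql(
    "my_dataset.pca_model",
    "my_dataset.customer_features",
)

print(multimodal_sql)
print(pca_sql)
```

The resulting strings can be pasted into the BigQuery console or submitted through a client library; nothing here contacts BigQuery.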
To help users leverage these new capabilities, BigQuery has provided several tutorials:
- Generating image embeddings
- Generating video embeddings
- Generating text embeddings
- Generating and searching multimodal embeddings
New AI Features
BigQuery ML has introduced two major AI features:
Document Processing:
- Users can create a remote model based on the Document AI API, specifying a document processor.
- The ML.PROCESS_DOCUMENT function can be used with this remote model to process documents from BigQuery object tables.
Audio Transcription:
- A remote model based on the Speech-to-Text API can be created, specifying a speech recognizer.
- The new ML.TRANSCRIBE function works with this remote model to transcribe audio files from BigQuery object tables.
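Both features follow the same two-step pattern: create a remote model that points at a processor or recognizer, then call the function over an object table. The Python sketch below composes the corresponding SQL text. All project, dataset, connection, and resource names are hypothetical placeholders, and the option names (`REMOTE_SERVICE_TYPE`, `DOCUMENT_PROCESSOR`, `SPEECH_RECOGNIZER`) and service type values reflect the documented syntax as best understood here, so verify them against the current CREATE MODEL reference before use.

```python
def create_remote_model_sql(model: str, connection: str, service_type: str,
                            resource_option: str, resource_name: str) -> str:
    """Build a CREATE MODEL statement for a remote model over a Cloud AI service."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"REMOTE WITH CONNECTION `{connection}`\n"
        f"OPTIONS (\n"
        f"  REMOTE_SERVICE_TYPE = '{service_type}',\n"
        f"  {resource_option} = '{resource_name}'\n"
        f")"
    )


def object_table_function_sql(function: str, model: str, object_table: str) -> str:
    """Build a SELECT that applies a BigQuery ML function to an object table."""
    return (
        f"SELECT *\n"
        f"FROM {function}(\n"
        f"  MODEL `{model}`,\n"
        f"  TABLE `{object_table}`\n"
        f")"
    )


# Document processing (all identifiers below are placeholders).
doc_model_sql = create_remote_model_sql(
    "my_dataset.invoice_parser",
    "my_project.us.my_connection",
    "CLOUD_AI_DOCUMENT_V1",
    "DOCUMENT_PROCESSOR",
    "projects/PROJECT_NUMBER/locations/us/processors/PROCESSOR_ID",
)
doc_query_sql = object_table_function_sql(
    "ML.PROCESS_DOCUMENT", "my_dataset.invoice_parser", "my_dataset.invoice_files"
)

# Audio transcription (all identifiers below are placeholders).
speech_model_sql = create_remote_model_sql(
    "my_dataset.transcriber",
    "my_project.us.my_connection",
    "CLOUD_AI_SPEECH_TO_TEXT_V2",
    "SPEECH_RECOGNIZER",
    "projects/PROJECT_NUMBER/locations/us/recognizers/RECOGNIZER_ID",
)
speech_query_sql = object_table_function_sql(
    "ML.TRANSCRIBE", "my_dataset.transcriber", "my_dataset.audio_files"
)

for statement in (doc_model_sql, doc_query_sql, speech_model_sql, speech_query_sql):
    print(statement, end="\n\n")
```

The object tables are assumed to already exist and to point at the documents or audio files in Cloud Storage; creating them is a separate step not shown here.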
Tutorials are available for both of these features:
- Processing documents with the ML.PROCESS_DOCUMENT function
- Transcribing audio files with the ML.TRANSCRIBE function
Availability
All these new embedding support features and AI capabilities are now generally available (GA) in BigQuery ML.
These updates extend BigQuery ML to a wider range of data types, from structured data to text, images, audio, and video. Embedding generation, document processing, and audio transcription can now be run directly from BigQuery, alongside its existing data warehousing and analytics features.