Google has developed new generative AI techniques that turn 2D product images into immersive 3D experiences for online shopping. The goal is to recover some of the tactile feel of in-store shopping, which is hard to convey digitally. The technology creates high-quality, shoppable 3D visualizations from as few as three product images using Veo, Google’s video generation model.
First Generation: Neural Radiance Fields (NeRFs)
In 2022, Google researchers began using Neural Radiance Fields (NeRF), a technique first published in 2020, to learn 3D representations of products. The method needed multiple images of a product to render novel views, such as 360° spins. Early applications included interactive visualizations of shoes on Google Search. The approach struggled, however, with complex geometry, particularly thin structures such as those found on sandals and heels.
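
For context, a NeRF renders each pixel by compositing density and colour samples taken along a camera ray. The sketch below shows that standard compositing step in plain Python. The function name and the toy inputs are illustrative assumptions, not Google's production code.

```python
import math

def composite_ray(densities, colors, deltas):
    """Composite samples along one ray with the standard NeRF quadrature.

    densities: volume density at each sample (non-negative floats)
    colors:    (r, g, b) tuple at each sample
    deltas:    length of the ray segment each sample represents
    Returns the rendered (r, g, b) colour and the accumulated opacity.
    """
    transmittance = 1.0  # fraction of light not yet absorbed along the ray
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution to the pixel
        for c in range(3):
            pixel[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
    return tuple(pixel), 1.0 - transmittance
```

A thin strap or heel occupies very few samples along any ray, so with only a handful of input photos its density is weakly constrained, which is consistent with the difficulty described above.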
Second Generation: View-Conditioned Diffusion Prior
In 2023, Google launched a second-generation approach that uses a view-conditioned diffusion prior to overcome the limitations of NeRF. Given only a few input viewpoints, the model predicts how a product looks from other angles, so a 3D representation can be generated from fewer images. During training, the 3D model is rendered from random camera views and optimized using score distillation sampling. This significantly improved the quality of visualizations across the footwear categories available on Google Shopping.
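
The optimization loop can be illustrated with a toy version of score distillation sampling. In this sketch the "3D model" is simply a flat list of pixel values for one view, and `toy_noise_predictor` is a hypothetical stand-in for the view-conditioned diffusion model: it assumes it already knows what the view should look like. Both are assumptions made to keep the example self-contained, and the actual system may differ in its weighting and details.

```python
import math
import random

def toy_noise_predictor(noisy, alpha_bar, target):
    """Hypothetical stand-in for a view-conditioned diffusion model.

    It assumes the clean image for this view is `target` and returns the
    noise that would explain `noisy` under that assumption.
    """
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x_t - s * x0) / n for x_t, x0 in zip(noisy, target)]

def sds_step(rendered, target, lr=0.1, rng=random):
    """One score-distillation update on a rendered view (flat list of pixels)."""
    alpha_bar = rng.uniform(0.2, 0.8)  # random noise level for this step
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    eps = [rng.gauss(0.0, 1.0) for _ in rendered]
    noisy = [s * x + n * e for x, e in zip(rendered, eps)]
    eps_hat = toy_noise_predictor(noisy, alpha_bar, target)
    # SDS gradient: weight * (predicted noise - injected noise). The renderer
    # Jacobian is the identity here because the parameters are the pixels.
    weight = 1.0 - alpha_bar
    grad = [weight * (eh - e) for eh, e in zip(eps_hat, eps)]
    return [x - lr * g for x, g in zip(rendered, grad)]
```

Repeated calls pull the rendered view toward what the prior expects. In a real pipeline the gradient would flow back through a differentiable renderer into the 3D representation rather than into pixels directly.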
Third Generation: Generalizing with Veo
The latest approach uses Veo, which excels at generating videos that capture complex interactions between light and materials. Fine-tuned on a dataset of high-quality 3D assets, Veo can produce consistent 360° spins from one or more images. The approach generalizes across diverse product categories, including furniture and electronics, and it generates high-fidelity views without requiring precise camera poses. With as few as three images, Veo can create realistic 3D representations, although it may still need to infer details that none of the input views show.
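
Geometrically, a 360° spin is a sequence of views taken from evenly spaced positions on an orbit around the product. The sketch below computes such positions. The function name, the default radius and elevation, and the y-up coordinate convention are illustrative assumptions; the source does not describe how spin frames are parameterized.

```python
import math

def turntable_cameras(num_frames, radius=2.0, elevation_deg=15.0):
    """Camera positions evenly spaced on a circular orbit around the origin.

    Returns a list of (x, y, z) tuples with y pointing up. Each camera is
    assumed to look at the origin, where the product sits.
    """
    elev = math.radians(elevation_deg)
    cameras = []
    for i in range(num_frames):
        azim = 2.0 * math.pi * i / num_frames  # angle around the vertical axis
        cameras.append((
            radius * math.cos(elev) * math.cos(azim),
            radius * math.sin(elev),
            radius * math.cos(elev) * math.sin(azim),
        ))
    return cameras
```

Earlier pipelines needed poses like these to be known accurately for every input photo. A video model that generates the spin directly only needs the input images themselves.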
Conclusion and Future Outlook
The progression from NeRF to view-conditioned diffusion models and now to Veo marks a significant advance in 3D generative AI for online shopping. Google aims to continue innovating in this area to make online shopping more tangible and engaging for users.