Summary
Microsoft's VASA-1 can create a deepfake video of a person from just one photo and one audio track. The model mimics natural facial expressions and produces convincing lip-sync, thanks to training on more than 6,000 real-life faces. VASA-1 can pump out high-quality video at up to 40 frames per second with negligible delay. But there's a catch: Microsoft isn't releasing this powerful tool to the public yet. The company wants to ensure it will be used safely and ethically before letting it loose.