Google’s Lumiere brings AI-generated video closer to reality than fantasy
Google’s new video-generating AI model, Lumiere, uses a new diffusion model called Space-Time U-Net, or STUNet, which figures out where things are in a video (space) and how they move and change at the same time (time), according to Ars Technica. This method reportedly allows Lumiere to create the video in a single process, rather than stitching smaller still frames together.
Lumiere starts by creating a base frame from the prompt. It then uses the STUNet framework to approximate where objects within that frame will move, generating additional frames that flow into one another to create the appearance of seamless motion. Lumiere also generates 80 frames, compared with just 25 from Stable Video Diffusion.
Granted, I’m more of a text journalist than a video journalist, but the demo reel Google released, along with the preprint scientific paper, shows how AI-powered video generation and editing tools have gone from uncanny valley to nearly realistic in just a few years. It also establishes Google’s technology in a space already occupied by competitors such as Runway, Stable Video Diffusion, and Meta’s Emu. Runway, one of the first mass-market text-to-video platforms, launched Runway Gen-2 last March and began offering more realistic-looking videos, though Runway’s videos still struggle to depict movement.
Google helpfully posted the clips and prompts on the Lumiere site, which allowed me to run the same prompts through Runway for comparison. Here are the results:
Yes, some of the footage looks a bit artificial, especially if you look closely at skin textures or the more atmospheric scenes. But look at that turtle! It moves like a turtle actually would in water! It looks like a real turtle! I sent the Lumiere introduction video to a friend who is a professional video editor. While she pointed out that “you can clearly tell it’s not entirely real,” she found it impressive that, had I not told her it was AI, she would have thought it was CGI. (She also said: “It’s going to take my job, isn’t it?”)
While other models stitch together videos from generated keyframes where the movement has already happened (think of drawings in a flip book), STUNet lets Lumiere focus on the movement itself, based on where the generated content should be at a given time in the video.
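To make that distinction concrete, here is a minimal, hypothetical PyTorch sketch of the core idea behind a space-time U-Net: 3D convolutions that downsample the clip in time as well as space, so the model processes the whole video volume in one pass instead of interpolating between keyframes. This is not Lumiere’s actual architecture (those details are in the paper); the class name, channel counts, and shapes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the space-time idea -- NOT Lumiere's actual
# architecture. A space-time U-Net downsamples the video volume in both
# space (H, W) and time (T), so the network reasons about the whole clip
# at once instead of generating keyframes and filling in the gaps.

class SpaceTimeDownBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # A 3D convolution with stride 2 on every axis halves the frame
        # count and the spatial resolution in a single operation.
        self.conv = nn.Conv3d(in_channels, out_channels,
                              kernel_size=3, stride=2, padding=1)
        self.act = nn.SiLU()

    def forward(self, x):  # x: (batch, channels, frames, height, width)
        return self.act(self.conv(x))

clip = torch.randn(1, 16, 80, 64, 64)   # an 80-frame, 64x64 latent clip
block = SpaceTimeDownBlock(16, 32)
print(block(clip).shape)                 # torch.Size([1, 32, 40, 32, 32])
```

A keyframe-based generator, by contrast, would run a 2D network frame by frame and rely on a separate interpolation or temporal super-resolution step to fill in the motion between frames.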
Google isn’t a big player in text-to-video, but it has slowly released more advanced AI models and is leaning toward a more multimodal focus. Its Gemini large language model will eventually bring image generation to Bard. Lumiere is not yet available for testing, but it shows Google’s ability to develop an AI video platform that is comparable to, or even slightly better than, commonly used AI video generators such as Runway and Pika. And as a reminder, this is where Google was in the AI video space two years ago.
In addition to text-to-video generation, Lumiere will also allow image-to-video generation, stylized generation (letting users make videos in a specific style), cinemagraphs that animate only a portion of a video, and inpainting, which masks out an area of the video to change its color or pattern.
However, Google’s Lumiere paper notes that “leveraging our technology to create false or harmful content carries risks of abuse, and we believe it is critical to develop and apply tools for detecting bias and malicious use cases to ensure safe and fair use.” The paper’s authors do not explain how this can be achieved.