- Meta just launched Movie Gen, an AI video generator to compete with OpenAI’s Sora.
- Movie Gen can create videos with accompanying audio using a text prompt. It can also be edited per prompt.
- Meta entered the video generation race later than OpenAI and Google.
Meta released a new AI video generation tool on Friday, which is also the company’s latest attack in its battle with OpenAI for AI supremacy.
“Today we are excited to premiere Meta Movie Gen, our groundbreaking generative AI research for media, spanning modalities such as image, video and audio,” the company said in a press release. “Movie Gen outperforms comparable models in the industry on these tasks when judged by humans.”
In its press release, Meta Movie called Gen the “most advanced and immersive set of storytelling models,” including video generation, audio generation, personalized video generation and video editing. The models are trained using publicly available data and licensed data, the company said.
Using a text prompt, Movie Gen can create videos of up to 16 seconds at 16 frames per second while reasoning “about object motion, subject-object interactions, and camera movement.” Users can upload a photo of themselves to incorporate into personalized videos, and Movie Gen can edit videos with text instructions from the user.
Meta’s preview video shows an underwater perspective of a baby hippo (Moo Deng reference, anyone?) happily swimming around in a serene water scene.
Another shows a koala on a surfboard and the accompanying prompt: “A fluffy koala is surfing. It has gray and white fur and a round nose. The surfboard is yellow. The koala bear holds the surfboard with its paws. The koala’s facial expression of the bear is focused. The sun is shining.”
Audio generation allows users to “create and extend sound effects, background music or entire soundtracks” up to 45 seconds in length, according to the press release. An example clip of a snake gliding through a wooded area includes the prompt: “Rusting leaves and snapping twigs, with an orchestral musical number.”
Meta is a bit late to the audio and video generation game, as top competitors like OpenAI and Google have already gained a foothold in the space. OpenAi launched Sora, its video generator, in February, and Google followed suit with Veo in May.
However, Meta has given OpenAI a run for its money in the AI arms race. Although OpenAI’s ChatGPT debuted first and brought the company global fame, recent versions of Meta’s Llama model have been well received. Many considered Llama 3.1, which was released in July, to be superior to OpenAI’s GPT-4o, which was released shortly before.
Meta says its new “state-of-the-art models” outperform competitors in human A/B comparisons. For video generation, Metas surveyed preferred Movie Gen over OpenAI Sora, according to the company’s press release. Meta didn’t share an A/B comparison with Google’s Veo, which also offers sound effects and music, but Meta said in a lengthy accompanying research paper that it believes Google’s video-to-audio generation models may be more limited in length than Meta’s . .
Meta, OpenAI and Google did not immediately respond to a request for comment.