Meta AI upgrades: It can see, hear and dub


In the race to make truly useful AI for a mass audience, Meta just jumped forward a few key steps — including AI’s ability to “see” objects and provide live, lip-synched translations.

At the Meta Connect developers’ conference, CEO Mark Zuckerberg unveiled the latest version of Llama. That’s the open-source Large Language Model (LLM) powering the AI chatbot in the company’s main services: Facebook, WhatsApp, Messenger, and Instagram.

Given that reach, Zuckerberg described Meta AI as “the most-used AI assistant in the world, probably,” with about 500 million active users. The service won’t be available in the European Union yet, given that Meta hasn’t joined the EU’s AI pact, but Zuckerberg said he remains “eternally optimistic that we can figure that out.”

He’s also optimistic that the open-source Llama — a contrast to Google’s Gemini and OpenAI’s GPT, both proprietary closed systems — will become the industry standard. “Open source is the most cost-effective and the most customizable,” Zuckerberg said. Llama is “sort of the Linux of AI.”


Meta AI edits photos by text

But what can you do with it? “It can understand images as well as text,” Zuckerberg added — showing how a photo could be manipulated simply by asking the Llama chatbot to make edits. “My family now spends a lot of time taking photos and making them more ridiculous.”

Voice chat is now rolling out to all versions of Meta AI, including voices from celebrities such as Judi Dench, John Cena and Awkwafina. Another user-friendly update: When using Meta AI’s voice assistant with its glasses, you no longer have to use the words “hey Meta” or “look and tell me.”

Zuckerberg and his executives also demonstrated a number of use cases. For example, a user can set up Meta AI to provide pre-recorded responses to frequently asked questions over video. You can use it to remember where you parked. Or you can ask it to suggest items in your room that might help to accessorize a dress.

The most notable, and possibly most useful, feature: live translation. Currently available in Spanish, French, Italian and English, the AI will automatically repeat what the other person said in your chosen language. Zuckerberg, who admitted that he doesn’t really know Spanish, demonstrated this feature by having an awkward conversation live on stage with UFC fighter Brandon Moreno.

Slightly more impressive was the live translation option on Reels and other Meta videos. The AI will synchronize the speakers’ lips so they look like they’re actually speaking the language you’re hearing. Nothing creepy about that at all.




