Gemini Live, Google’s answer to ChatGPT’s Advanced Voice Mode, launches

Gemini Live, Google’s answer to the recently launched (in limited alpha) Advanced Voice Mode for OpenAI’s ChatGPT, is rolling out on Tuesday, months after it was first unveiled at Google’s I/O 2024 developer conference. Google confirmed the launch at its Made by Google 2024 event.
Gemini Live lets users have “in-depth” voice chats with Gemini, Google’s generative AI-powered chatbot, on their smartphones. Thanks to an enhanced speech engine that delivers what Google claims is more consistent, emotionally expressive and realistic multi-turn dialogue, people can interrupt Gemini while the chatbot’s speaking to ask follow-up questions, and it’ll adapt to their speech patterns in real time.
Here’s how Google describes it in a blog post: “With Gemini Live [via the Gemini app], you can talk to Gemini and choose from [10 new] natural-sounding voices it can respond with. You can even speak at your own pace or interrupt mid-response with clarifying questions, just like you would in any conversation.”
Gemini Live is hands-free if you want it to be. You can keep speaking with the Gemini app in the background or when your phone’s locked, and conversations can be paused and resumed at any time.
So how might this be useful? Google gives the example of rehearsing for a job interview — a bit of an ironic scenario, but OK. Gemini Live can practice with you, Google says, giving speaking tips and suggesting skills to highlight when speaking with a hiring manager (or AI, as the case may be).
One advantage Gemini Live might have over ChatGPT’s Advanced Voice Mode is a better memory. The generative AI models underpinning Live, Gemini 1.5 Pro and Gemini 1.5 Flash, have longer-than-average “context windows,” meaning they can take in and reason over a lot of data (theoretically hours of back-and-forth conversation) before crafting a response.
“Live uses our Gemini Advanced models that we have adapted to be more conversational,” a Google spokesperson told TechCrunch via email. “The model’s large context window is utilized when users have long conversations with Live.”
We’ll have to see how well this all works in practice, of course. If OpenAI’s setbacks with Advanced Voice Mode are any indication, rarely do demos translate seamlessly to the real world.
On that subject, Gemini Live doesn’t have one of the capabilities Google showcased at I/O just yet: multimodal input. Back in May, Google released pre-recorded videos showing Gemini Live seeing and responding to users’ surroundings via photos and footage captured by their phones’ cameras — for example, naming a part on a broken bicycle or explaining what a portion of code on a computer screen does.
Multimodal input will arrive “later this year,” Google said, declining to provide specifics. Also later this year, Live will expand to additional languages and to iOS via the Google app; it’s only available in English for the time being.
Gemini Live, like Advanced Voice Mode, isn’t free. It’s exclusive to Gemini Advanced, a more sophisticated version of Gemini that’s gated behind the Google One AI Premium Plan, priced at $20 per month.
Other new Gemini features on the way are free, though.
Android users can soon (in the coming weeks) bring up Gemini’s overlay on top of any app they’re using to ask questions about what’s on the screen (e.g., a YouTube video) by holding their phone’s power button or saying, “Hey Google.” Gemini will be able to generate images (but still not images of people, unfortunately) directly from the overlay — images that can be dragged and dropped into apps like Gmail and Google Messages.
Gemini is also gaining new integrations with Google services (or “extensions,” as the company prefers to call them) both on mobile and the web. In the coming weeks, Gemini will be able to take more actions with Google Calendar, Keep, Tasks, YouTube Music and Utilities, the apps that control on-device features like timers and alarms, media controls, the flashlight, volume, Wi-Fi, Bluetooth and so on.
In a blog post, Google gives a few ideas of how people might take advantage of these integrations. Sounds nifty, assuming it all works reliably.
Lastly, starting later this week, Gemini will be available on Android tablets.