First Impressions of Gemini Live: Superior to Conversing with Siri, Yet Below Expectations

by admin

At its Made by Google event in Mountain View, California, on Tuesday, Google unveiled Gemini Live. The feature lets users hold a semi-natural voice conversation with an AI chatbot powered by Google’s large language model. TechCrunch had the opportunity to try it in person.

Gemini Live is Google’s answer to OpenAI’s Advanced Voice Mode, a strikingly similar ChatGPT feature that is currently in a limited alpha test. Although OpenAI demonstrated the concept first, Google is the first to release a finished version.

In my experience, Gemini Live’s quick spoken responses feel more intuitive than texting with ChatGPT or talking to voice assistants like Siri or Alexa. Gemini Live responds within two seconds and adapts smoothly when the conversation changes direction. It has flaws, but it is the best hands-free way to use a phone that I have encountered.

The Mechanics Behind It

Gemini Live offers a choice of 10 voices, compared with OpenAI’s three. Google worked with voice actors to create them. The variety and human-like quality of the voices added noticeably to the experience.

In one test, a Google product manager asked Gemini Live to find kid-friendly wineries near Mountain View with outdoor seating and a playground nearby. That request is more complex than what people typically ask Siri or Google Search, yet Gemini Live recommended Cooper-Garrod Vineyards in Saratoga, which met the stated criteria.

However, Gemini Live’s performance isn’t flawless. During the demonstration, it suggested a playground called Henry Elementary School Playground and claimed it was “10 minutes away” from the selected vineyard. In reality, the closest playgrounds are in Saratoga, and no Henry Elementary School exists within a reasonable distance. The nearest school with a similar name, Henry Ford Elementary School, is in Redwood City, 30 minutes away.

Google highlighted that users can interrupt Gemini Live mid-sentence, prompting it to redirect the conversation quickly. In practice, this did not always work: Google’s project managers and Gemini Live sometimes talked over each other, and the AI occasionally missed what had been said.

Notably, according to product manager Leland Rechis, Gemini Live is not allowed to sing or to imitate any voices beyond the 10 provided, a choice likely aimed at avoiding copyright disputes. Rechis also noted that Google is not currently focused on enabling Gemini Live to detect emotional nuance in a speaker’s voice, a capability OpenAI showcased in its presentation.

Overall, Gemini Live offers a more natural way to explore a topic in depth than a simple Google Search. According to Google, Gemini Live is a precursor to Project Astra, the comprehensive multimodal AI model introduced at Google I/O. Gemini Live is limited to voice interactions for now, but Google plans to add real-time video understanding in the future.

Compiled by Techarena.au.