In February, Google put a temporary halt on the image-creating capabilities of its AI-driven chatbot, Gemini, due to concerns over it generating historically inaccurate depictions of people. For instance, a command to produce “a Roman legion” would result in images featuring anachronistically diverse groups, while “Zulu warriors” were depicted in a manner that reinforced stereotypes.
Google CEO Sundar Pichai issued an apology, and DeepMind co-founder Demis Hassabis promised a swift fix, anticipated within weeks. The process took considerably longer, however, even with some Google employees enduring 120-hour weeks. Gemini will soon be able to generate images of people again.
This update comes with a caveat, however.
Initially, only users subscribed to Gemini’s paid plans (Gemini Advanced, Business, or Enterprise) will have access to the feature, as part of an English-only early access phase.
Google has not disclosed a timeline for when the feature might become available to users on the free Gemini plan or in additional languages.
“Offering Gemini Advanced users early access to our newest functionalities allows us to collect valuable insights while bringing a highly awaited feature to our premium customers first,” explained a representative to TechCrunch.
What has Google changed to address the concerns? The company’s latest image model, Imagen 3, which is built into Gemini, includes measures intended to produce more equitable representation in the images it generates. These include training on AI-generated captions aimed at improving the variety and fairness of its output, according to a technical paper shared with TechCrunch. The training data also underwent a safety and fairness review, Google says.
However, when asked for specifics about Imagen 3’s training dataset, the spokesperson said only that the model was trained on a large collection of images, text, and annotations.
“Our commitment to minimizing undesirable reactions has led us to conduct thorough testing both internally and externally. Collaboration with independent third parties has been pivotal in our continuous efforts to refine the model,” the spokesperson added. “We have taken a meticulous approach to test people generation prior to its reintegration.”
The Imagen 3 Update and Introduction of Gems
On the upside, all Gemini users will soon receive Imagen 3, although generating images of people remains exclusive to the paid plans for now.
According to Google, Imagen 3 follows text prompts more accurately than its predecessor, Imagen 2, and produces more creative and detailed images with fewer mistakes, which the company says sets a new standard for text-to-image generation.
To address deepfake concerns, Imagen 3 uses SynthID, a technique developed by DeepMind that embeds invisible cryptographic watermarks in AI-generated media. Google had previously announced this integration. Even so, the company's approach to image generation still varies across its products, Pixel Studio among them.

Alongside Imagen 3, Google is introducing a new feature called Gems, available exclusively to Gemini Advanced, Business, and Enterprise subscribers. Much like OpenAI’s GPTs, Gems allow users to create specialized versions of Gemini that act as experts in specific domains, such as cooking or software development.
Google explains in a blog post: “Gems enable the formation of a specialized team of experts, assisting users in brainstorming, project planning, or generating text for social media posts. These AI advisors can also store complex instructions for repetitive or challenging tasks, enhancing productivity and creativity.”
Creating a Gem is as simple as inputting instructions and naming it, according to Google.
Gems are now available on desktop and mobile in 150 countries and in most languages, Google says (though not yet in Gemini Live). At launch, users can explore several Gems, including ones designed for learning, career advice, brainstorming, and programming assistance.

When asked whether it plans to let users share and use one another’s Gems, much as OpenAI’s GPT Store does for GPTs, Google’s answer was essentially no.
“Our current focus is understanding how Gems can foster creativity and improve productivity among users,” said the spokesperson. “We have no further details to share at this moment.”
Compiled by Techarena.au.


