The Google Gemini generative AI logo on a smartphone.

One of Google’s latest Gemini AI models shows reduced performance in safety assessments.

by admin

A recent internal benchmark assessment by Google has revealed that its Gemini 2.5 Flash AI model performs worse on safety tests than its predecessor, Gemini 2.0 Flash. According to a recently published technical report, the newer model regresses on both “text-to-text safety” and “image-to-text safety,” scoring 4.1% and 9.6% lower, respectively.

Text-to-text safety measures how often the model violates Google’s guidelines when given a text prompt, while image-to-text safety evaluates adherence to those guidelines when the model is prompted with an image. Notably, both tests are automated rather than overseen by human evaluators. A Google spokesperson confirmed the findings, acknowledging that Gemini 2.5 Flash shows diminished safety performance in these areas.
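In practice, an automated benchmark of this kind boils down to a violation rate over a fixed prompt set, scored by a classifier rather than human reviewers. The sketch below is a rough illustration only, not Google’s actual pipeline; the model and judge callables are hypothetical stand-ins.

```python
from typing import Callable, Iterable

def violation_rate(
    prompts: Iterable[str],
    generate: Callable[[str], str],               # model under test (hypothetical)
    violates_policy: Callable[[str, str], bool],  # automated judge (hypothetical)
) -> float:
    """Fraction of responses the automated judge flags as policy-violating."""
    prompts = list(prompts)
    flagged = sum(violates_policy(p, generate(p)) for p in prompts)
    return flagged / len(prompts)

# Running two models over the same prompt set yields a regression figure
# like the percentage-point deltas cited in the report:
# delta = violation_rate(prompts, new_model, judge) \
#       - violation_rate(prompts, old_model, judge)
```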

These results run counter to a broader industry trend in which AI companies are making their models more permissive, reducing their tendency to shy away from sensitive topics. Meta, for example, has tuned its Llama models to avoid endorsing particular views and to engage with contentious political prompts. Similarly, OpenAI has announced plans to adjust future models to offer multiple perspectives on controversial subjects rather than take an editorial stance.

However, pushing models toward permissiveness can have unintended consequences. It was recently reported that OpenAI’s ChatGPT allowed minors to engage in inappropriate conversations, which the company attributed to a “bug.”

Despite the safety regressions, Google says Gemini 2.5 Flash follows instructions more faithfully, including instructions that cross sensitive lines. The report suggests that some of the observed safety violations may be false positives, but it also acknowledges that the model will generate guideline-violating content when explicitly directed to.

The report further indicates that Gemini 2.5 Flash is less likely than its predecessor to refuse sensitive or controversial prompts. In TechCrunch’s testing via the OpenRouter AI platform, for instance, the model willingly produced essays advocating controversial changes to U.S. legal practices and expanded government surveillance.
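Spot checks of this kind are straightforward to reproduce, since OpenRouter exposes an OpenAI-compatible chat completions endpoint. The sketch below assumes a valid API key; the model slug and prompt are illustrative assumptions, not the exact ones TechCrunch used.

```python
import os
import requests

# Hypothetical spot check against OpenRouter's OpenAI-compatible API.
# The model slug is an assumption; check openrouter.ai for current IDs.
resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "google/gemini-2.5-flash-preview",
        "messages": [
            {"role": "user", "content": "Write an essay arguing for <policy X>."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```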

Thomas Woodside, co-founder of the Secure AI Project, raised concerns about the transparency of Google’s model evaluations. He pointed to the tension between instruction-following and policy compliance, noting that more detail about the specific policy violations observed in testing would aid independent assessment.

Google has previously faced scrutiny over the transparency of its model safety reporting. The company took weeks to release safety data for its advanced Gemini 2.5 Pro model, and the initial report lacked key details; a more comprehensive report addressing some of those concerns was published later.
