
Cohere Proclaims Its New Aya Vision AI Model as Leading the Industry

by admin

This week, Cohere For AI, Cohere’s nonprofit research lab, released a multimodal “open” AI model called Aya Vision, which the lab claims sets a new standard in the industry.

Aya Vision can perform a variety of tasks, such as writing image captions, answering questions about images, translating text, and generating summaries in 23 widely spoken languages. Cohere has made Aya Vision available for free through WhatsApp, describing the release as “a major advancement in making technological innovations accessible to researchers globally.”

“Despite the advancements in AI, a substantial discrepancy remains in the performance of models across various languages, which becomes even more apparent in multimodal tasks that incorporate both text and images,” stated Cohere in a blog post. “Aya Vision is intended to actively address this gap.”

Aya Vision is available in two versions: Aya Vision 32B and Aya Vision 8B. The more capable of the two, Aya Vision 32B, marks a “new frontier,” according to Cohere, outperforming models twice its size, including Meta’s Llama-3.2 90B Vision, on certain visual comprehension benchmarks. Aya Vision 8B, meanwhile, scores better on some evaluations than models ten times its size, Cohere claims.

Both versions are available on the AI development platform Hugging Face under a Creative Commons 4.0 license, subject to Cohere’s acceptable use policy. They cannot be used for commercial applications.

Cohere said that Aya Vision was trained using a “rich assortment” of English datasets, which the lab translated and used to create synthetic annotations. Annotations, also known as tags or labels, help models understand and interpret data during training. For instance, annotations for training an image recognition model might take the form of markings around objects, or captions that refer to each person, place, or thing depicted in an image.
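To make the idea of an annotation concrete, here is a minimal Python sketch of what one annotated image record might look like, with object markings (bounding boxes) and a caption that is then translated into other languages. This is an illustration only: the article does not describe Cohere's actual data format or pipeline, and every name here (`BoundingBox`, `ImageAnnotation`, `add_translated_captions`, `toy_translate`, the file name, the coordinates) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BoundingBox:
    """A labeled rectangle around one object, in pixel coordinates."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        # Clamp at zero so a malformed box never reports a negative area.
        return max(0, self.x_max - self.x_min) * max(0, self.y_max - self.y_min)


@dataclass
class ImageAnnotation:
    """Annotations for one image: object boxes plus captions keyed by language code."""
    image_path: str
    boxes: List[BoundingBox] = field(default_factory=list)
    captions: Dict[str, str] = field(default_factory=dict)


def add_translated_captions(
    annotation: ImageAnnotation,
    translate: Callable[[str, str], str],
    target_langs: List[str],
    source_lang: str = "en",
) -> ImageAnnotation:
    """Return a copy of the annotation with captions added for each target language."""
    source_caption = annotation.captions[source_lang]
    captions = dict(annotation.captions)
    for lang in target_langs:
        captions[lang] = translate(source_caption, lang)
    return ImageAnnotation(annotation.image_path, list(annotation.boxes), captions)


def toy_translate(text: str, lang: str) -> str:
    # Hypothetical stand-in for a real translation model: it only tags the text.
    return f"[{lang}] {text}"


example = ImageAnnotation(
    image_path="street.jpg",  # hypothetical file name
    boxes=[
        BoundingBox("person", 40, 30, 120, 260),
        BoundingBox("bicycle", 100, 150, 300, 280),
    ],
    captions={"en": "A person standing next to a bicycle."},
)

multilingual = add_translated_captions(example, toy_translate, ["fr", "hi"])
print(sorted(multilingual.captions))
```

In a real pipeline the `translate` callable would be a machine translation model, which is what would make the resulting non-English captions "synthetic" in the sense the article uses.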

Cohere’s Aya Vision model can tackle a variety of visual comprehension tasks. Image Credits: Cohere

Cohere’s use of synthetic annotations (that is, annotations generated by AI) is in line with industry trends. Despite the potential downsides, competitors including OpenAI are increasingly using synthetic data to train models as the supply of real-world data dries up. Research firm Gartner estimates that 60% of the data used for AI and analytics projects last year was synthetically generated.

Cohere remarked that training Aya Vision with synthetic annotations allowed the lab to optimize resource use while achieving competitive performance.

“This demonstrates our essential commitment to efficiency, accomplishing more with less computational power,” Cohere explained in its blog. “This approach also enhances support for the research community, which frequently has limited access to computational resources.”

Alongside Aya Vision, Cohere released a new benchmark suite, AyaVisionBench, designed to probe a model’s skills in “vision-language” tasks, such as identifying differences between two images and converting screenshots into code.

The AI industry is in the midst of what some call an “evaluation crisis,” a consequence of the popularity of benchmarks whose aggregate scores correlate poorly with proficiency on the tasks most AI users care about. Cohere asserts that AyaVisionBench is a step toward fixing this, offering a “comprehensive and rigorous” framework for assessing a model’s understanding across languages and modalities.

With any luck, this proves to be the case.

“[T]his dataset serves as a solid benchmark for assessing vision-language models in multilingual and practical contexts,” Cohere researchers noted in a post on Hugging Face. “We are making this evaluation set accessible to the research community to advance multilingual multimodal evaluations.”

Compiled by Techarena.au.