On Friday, OpenAI unveiled o3-mini, the newest addition to the company’s family of AI “reasoning” models.
OpenAI initially showcased the model in December, alongside its more advanced counterpart, o3. This launch occurs at a crucial time for the organization, as its aspirations and hurdles seem to be escalating daily.
OpenAI is confronting the narrative that it is losing its edge in the AI arena to Chinese firms like DeepSeek, which OpenAI contends may have infringed on its intellectual property. The company has been actively seeking to strengthen its ties with Washington while simultaneously embarking on an ambitious data center initiative and reportedly preparing for one of the largest funding rounds in history.
This brings us to o3-mini, which OpenAI is promoting as both “powerful” and “cost-effective.”
“Today’s launch signifies […] a crucial move towards increasing access to advanced AI, aligning with our mission,” stated a spokesperson from OpenAI to TechCrunch.
Enhanced reasoning capabilities
In contrast to traditional large language models, reasoning models like o3-mini rigorously verify their outputs prior to presenting conclusions. This quality helps them dodge many typical pitfalls associated with AI models. Although these reasoning models may take longer to produce answers, they tend to offer greater reliability—especially in fields such as physics—though they aren’t infallible.
O3-mini has been specifically optimized for STEM-related challenges, focusing on programming, mathematics, and scientific inquiries. OpenAI claims that this model is generally comparable to the o1 series, including o1 and o1-mini, while boasting quicker response times and lower costs.
The company reports that independent testers preferred the responses from o3-mini over those from o1-mini more than half the time. Additionally, o3-mini committed 39% fewer “significant errors” on challenging real-life questions in A/B testing compared to o1-mini, offering “clearer” answers while delivering results around 24% faster.
Starting Friday, o3-mini will be accessible to all users through ChatGPT; however, those who subscribe to OpenAI’s ChatGPT Plus and Team plans will benefit from a higher limit of 150 queries per day. Subscribers of ChatGPT Pro will have unlimited access, and o3-mini is expected to become available for ChatGPT Enterprise and ChatGPT Edu customers within a week, with no updates yet on ChatGPT Gov.
Subscribers with premium plans can opt for o3-mini through a selection menu in ChatGPT. Free users can simply click or tap the new “Reason” button in the chat interface or prompt ChatGPT to “re-generate” a response.
From Friday onward, o3-mini will also be accessible through OpenAI’s API for a select group of developers, although it will initially lack image analysis capabilities. Developers can choose the desired level of “reasoning effort” (low, medium, or high) for o3-mini to allocate more cognitive resources depending on their specific application and required response time.
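As a rough sketch of how a developer might select a reasoning effort level, the snippet below builds a request body for the API. The exact field name (`reasoning_effort`) and request shape are assumptions based on OpenAI’s published Chat Completions format, not something specified in this article:

```python
# Hypothetical request body for OpenAI's Chat Completions endpoint.
# The "reasoning_effort" field name is an assumption; consult the
# official API reference before relying on it.
def build_request(prompt: str, effort: str = "medium") -> dict:
    if effort not in {"low", "medium", "high"}:
        raise ValueError("reasoning effort must be low, medium, or high")
    return {
        "model": "o3-mini",
        "reasoning_effort": effort,  # higher effort: slower but more thorough
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_request("Prove that sqrt(2) is irrational.", effort="high")
```

The tradeoff mirrors what OpenAI describes: low effort favors response time, high effort favors accuracy on harder problems.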
O3-mini is priced at $0.55 per million cached input tokens and $4.40 per million output tokens (a million tokens equates to roughly 750,000 words). That is 63% cheaper than o1-mini and competitive with the pricing of DeepSeek’s R1 reasoning model; DeepSeek charges $0.14 per million cached input tokens and $2.19 per million output tokens through its API for R1.
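To make those rates concrete, here is a back-of-the-envelope cost comparison using the per-million-token prices quoted above. The workload size is a hypothetical example, not from the article:

```python
# Per-million-token rates quoted in the article (cached input / output), in USD.
RATES = {
    "o3-mini": {"cached_input": 0.55, "output": 4.40},
    "deepseek-r1": {"cached_input": 0.14, "output": 2.19},
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a job whose input tokens are all cached."""
    r = RATES[model]
    return (input_tokens * r["cached_input"]
            + output_tokens * r["output"]) / 1_000_000

# Hypothetical workload: 2M cached input tokens, 500k output tokens.
for model in RATES:
    print(f"{model}: ${cost(model, 2_000_000, 500_000):.2f}")
```

At these list prices, R1 remains cheaper per token; o3-mini’s pitch is closing that gap relative to o1-mini while matching or beating R1 on some benchmarks.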
In the ChatGPT environment, o3-mini defaults to a medium reasoning effort setting, which OpenAI states offers “a reasonable compromise between speed and accuracy.” Subscribers can select “o3-mini-high” in the model picker for greater intelligence at the cost of slower responses.
Whichever setting users choose, o3-mini will be able to leverage search to produce up-to-date answers linked to relevant online sources. OpenAI notes that this capability is still a “prototype” as it works to integrate search into its reasoning models.
“While o1 remains our extensive general-knowledge reasoning model, o3-mini offers a specialized alternative for technical fields demanding precision and quickness,” OpenAI detailed in a blog entry on Friday. “The launch of o3-mini represents another stride in OpenAI’s journey towards advancing cost-efficient intelligence.”
Considerations to keep in mind
O3-mini is not the most advanced model that OpenAI has released thus far, nor does it consistently outshine DeepSeek’s R1 reasoning model across all assessments.
O3-mini outperforms R1 on AIME 2024, a benchmark built from challenging competition-level math problems—but only when set to high reasoning effort. It also edges out R1 on programming-related assessments such as SWE-bench Verified (by a narrow margin), again at high reasoning effort. However, at low reasoning effort, o3-mini falls short of R1 on GPQA Diamond, a test of PhD-level questions in physics, biology, and chemistry.
In fairness, o3-mini does handle many inquiries with a competitively low cost and delay. In its announcement, OpenAI draws comparisons between o3-mini’s performance and that of the o1 series:
“With low reasoning effort, o3-mini demonstrates performance levels comparable to o1-mini, whereas under medium effort, it aligns closely with o1’s capabilities,” OpenAI states. “Specifically, o3-mini with medium reasoning effort provides similar performance to o1 in mathematics, coding, and scientific inquiries while generating faster responses. At high reasoning effort, o3-mini outshines both o1-mini and o1.”
It’s important to highlight that o3-mini’s performance edge over o1 is marginal in several domains. For instance, on AIME 2024, it surpasses o1 by just 0.3 percentage points at high reasoning effort, and on GPQA Diamond, o3-mini does not even exceed o1’s score, even at high settings.
OpenAI claims that o3-mini is as “safe” or even safer than the o1 series, thanks to thorough red-teaming strategies and its “deliberative alignment” process, which encourages models to contemplate OpenAI’s safety protocols while responding to queries. The company asserts that o3-mini “greatly exceeds” the performance of one of OpenAI’s leading models, GPT-4o, on rigorous safety and jailbreak tests.