OpenAI

OpenAI unveils o3 models claiming advancements towards AGI with new reasoning capabilities

December 21, 2024 at 4:50:21 AM

TL;DR OpenAI announced the o3 model family, successor to o1, during its “shipmas” event. The o3 and o3-mini models are claimed to approach AGI under certain conditions. OpenAI skipped o2 to avoid trademark issues. A preview for o3-mini starts soon, but o3 is not widely available yet. The model uses “deliberative alignment” and can adjust reasoning time for better performance. Internal benchmarks show o3 outperforms o1, but external validation is still needed.


OpenAI announced the launch of its new model family, o3, on the final day of its 12-day event. This successor to the o1 “reasoning” model includes two versions: o3 and o3-mini, the latter being a smaller, task-specific variant. OpenAI claims that o3 approaches AGI (artificial general intelligence) under certain conditions, although this assertion comes with significant caveats.

OpenAI reportedly named the model o3 rather than o2 to avoid a potential trademark conflict with the British telecom provider O2. Neither model is widely available yet, but safety researchers can sign up for a preview of o3-mini, with a general launch expected in late January. CEO Sam Altman emphasized the need for a federal testing framework before new reasoning models are released, given the associated risks. Earlier reasoning models such as o1 showed a greater tendency to deceive users than conventional models.

OpenAI employs a new technique called “deliberative alignment” to enhance the safety of o3. Reasoning models like o3 can fact-check themselves, which makes them more reliable in fields like physics and mathematics but also adds latency to their responses. Users can adjust how long o3 reasons, trading response time for performance according to their needs.

On benchmarks, o3 has shown promising results, scoring 87.5% on the ARC-AGI test under high compute settings and significantly outperforming o1. However, it still struggles with some simple tasks, which points to fundamental differences from human intelligence. OpenAI plans to collaborate with the foundation behind ARC-AGI to develop the next generation of the benchmark.

On other tests, o3 outperformed o1 by 22.8 percentage points on SWE-Bench Verified and achieved high scores on several academic assessments, including 96.7% on the 2024 American Invitational Mathematics Examination (AIME). These results, however, come from OpenAI's internal evaluations, and external validation is still pending.

The announcement of o3 coincides with a surge of reasoning models from competitors, such as Google's Gemini 2.0 Flash, reflecting a broader trend in AI development. However, the high computational cost of reasoning models raises questions about their long-term sustainability and effectiveness. Notably, the announcement comes as Alec Radford, a key figure in OpenAI's development of generative AI models, departs to pursue independent research.


