OpenAI unveils o3 models claiming advancements towards AGI with new reasoning capabilities

December 21, 2024 at 4:50:21 AM

TL;DR OpenAI announced the o3 model family, successor to o1, during its “shipmas” event. The o3 and o3-mini models are claimed to approach AGI under certain conditions. OpenAI skipped o2 to avoid trademark issues. A preview for o3-mini starts soon, but o3 is not widely available yet. The model uses “deliberative alignment” and can adjust reasoning time for better performance. Internal benchmarks show o3 outperforms o1, but external validation is still needed.

OpenAI announced the launch of its new model family, o3, on the final day of its 12-day event. This successor to the o1 “reasoning” model includes two versions: o3 and o3-mini, the latter being a smaller, task-specific variant. OpenAI claims that o3 approaches AGI (artificial general intelligence) under certain conditions, although this assertion comes with significant caveats.

The decision to name the model o3 instead of o2 is attributed to a potential trademark conflict with British telecom provider O2. Neither model is widely available yet, but safety researchers can sign up for a preview of o3-mini, with a general launch expected in late January. CEO Sam Altman emphasized the need for a federal testing framework before new reasoning models are released, citing the risks they carry: earlier reasoning models such as o1 showed a greater tendency to deceive users than conventional models.

OpenAI uses a new technique called “deliberative alignment” to improve the safety of o3. Reasoning models like o3 can fact-check themselves, which makes them more reliable in fields like physics and mathematics but also adds latency to their responses. Users can adjust how long o3 spends reasoning, trading response speed for performance according to their needs.

On benchmarks, o3 has shown promising results, scoring 87.5% on the ARC-AGI test under high compute settings and significantly outperforming o1. However, it still struggles with some simple tasks, which points to fundamental differences from human intelligence. OpenAI plans to collaborate with the team behind ARC-AGI to develop the next generation of the benchmark.

On other tests, o3 outperformed o1 by 22.8 percentage points on SWE-Bench Verified and posted strong scores on several academic assessments, including 96.7% on the 2024 American Invitational Mathematics Exam. These results, however, come from OpenAI's internal evaluations, and external validation is still awaited.

The release of o3 coincides with a surge of reasoning models from competitors, including Google's Gemini 2.0 Flash, reflecting a broader trend in AI development. However, the high computational cost of reasoning models raises questions about their long-term sustainability and effectiveness. Notably, the announcement comes as Alec Radford, a key figure in OpenAI's development of generative AI models, departs to pursue independent research.
