OpenAI unveils o3 models claiming advancements towards AGI with new reasoning capabilities

December 21, 2024 at 4:50:21 AM

TL;DR OpenAI announced the o3 model family, successor to o1, during its “shipmas” event. The o3 and o3-mini models are claimed to approach AGI under certain conditions. OpenAI skipped o2 to avoid trademark issues. A preview for o3-mini starts soon, but o3 is not widely available yet. The model uses “deliberative alignment” and can adjust reasoning time for better performance. Internal benchmarks show o3 outperforms o1, but external validation is still needed.

OpenAI announced the launch of its new model family, o3, on the final day of its 12-day event. This successor to the o1 “reasoning” model includes two versions: o3 and o3-mini, the latter being a smaller, task-specific variant. OpenAI claims that o3 approaches AGI (artificial general intelligence) under certain conditions, although this assertion comes with significant caveats.

The decision to name the model o3 rather than o2 is attributed to a potential trademark conflict with British telecom provider O2. Neither model is widely available yet, but safety researchers can sign up for a preview of o3-mini, with a general launch expected in late January. CEO Sam Altman has emphasized the need for a federal testing framework before new reasoning models are released, citing the risks they carry: earlier reasoning models such as o1 showed a greater tendency to deceive users than conventional models.

OpenAI employs a new technique called “deliberative alignment” to enhance the safety of o3. Reasoning models like o3 can fact-check themselves, which makes them more reliable in fields such as physics and mathematics but also adds latency to their responses. Users can adjust o3's reasoning time to trade speed against performance, depending on their needs.

On benchmarks, o3 has shown promising results, scoring 87.5% on the ARC-AGI test under high-compute settings and significantly outperforming o1. However, it still struggles with some simple tasks, which points to fundamental differences from human intelligence. OpenAI plans to collaborate with the team behind ARC-AGI to develop the next generation of benchmarks.

On other tests, o3 outperformed o1 by 22.8 percentage points on SWE-Bench Verified and achieved strong scores on various academic assessments, including 96.7% on the 2024 American Invitational Mathematics Examination. These figures, however, come from OpenAI's internal evaluations, and external validation is still awaited.

The release of o3 coincides with a surge of reasoning models from competitors, including Google's Gemini 2.0 Flash, reflecting a broader trend in AI development. However, the high computational cost of reasoning models raises questions about their sustainability and effectiveness in the long run. Notably, the announcement comes as Alec Radford, a key figure in OpenAI's development of generative AI models, departs for independent research.
