OpenAI GPT-5: One unified system
After such a long wait, we get the most factual ChatGPT ever. But is that enough?
Hey Everyone,
Welcome to another article from AI Supremacy, a newsletter about AI at the intersection of tech, business, society and the future. Welcome to our new readers who are joining 170,000 others. Check out the rising publications in our Technology channel here. You're welcome to join our new community.
So I listened to OpenAI's one-hour live stream about GPT-5's launch; watch it here. In this article,
and her team from the Decoding Discontinuity newsletter (skip past the OpenAI summary to read it first if you want) will share some insights about OpenAI's place in the marketplace and its strategic evolution. Some of GPT-5's talking points are wildly exaggerated, but that's the world we live in now, so please take some of these quotes with a grain of salt. This article is fairly long, so click on the title or the button below to read it on the web (it will be easier to see the infographics).
(Like) "Having a team of PhDs in your pocket."
As of the second week of August 2025, GPT-5 is rolling out to Free, Plus, Pro, and Team users, with Enterprise and EDU access coming next week.
It took OpenAI 162 days to get from GPT-4.5 to GPT-5. GPT‑4.5 was officially released on February 27, 2025, but the wait for GPT-5 felt like years! That's also because Anthropic, Google and Qwen caught up, and we haven't even gotten DeepSeek-R2 yet.
TL;DR
OpenAI’s GPT-5 may not be as good as we had hoped, but it’s a product suited for their unique approach to AI.
Unified System with Auto-Switching
Advanced Reasoning Capabilities
Enhanced Multimodal Functionality
Improved Coding Performance
Reduced Hallucinations and Improved Safety
Customizable Personalities and Interface
Expanded Context Window: 256k
Integration with External Tools
Sycophancy Reduction
Accessibility and Pricing: API pricing is now at $1.25/million input tokens and $10/million output tokens, with reasoning tokens counted as output.
“It is a unified system that automatically switches between providing a quick response and taking time to reason through a hard problem to provide the best answer.” - Srinivas Narayanan
Mostly a Quality of Life upgrade 💫
GPT-5 is not a breakthrough but an iterative improvement over GPT-4.5, and some areas, like writing, are reportedly weaker than in its predecessors. It is a solid step forward, yet it's neither a revolutionary leap toward artificial general intelligence (AGI) nor a meaningful improvement outside of some quality-of-life features. That's not to say the slick interface and improved customization of design, voice and personality aren't significant.
OpenAI’s ChatGPT Personas 😂
Preset Personalities: Users can choose from four new preset personalities to customize how ChatGPT interacts:
Cynic (sarcastic, blunt)
Robot (dry, precise)
Listener (calm, supportive)
Nerd (curious, explanatory)
These are opt-in, adjustable anytime, and create different conversational styles without needing custom prompts.
Great Customization and Personalization
I think the memorable part of GPT-5 is not its unified system but rather its moves toward personalization and customization. For me, Voice Mode easily gets the highest marks here:
Voice Mode Customization: The updated Voice mode can adjust tone, pace, and response length based on user instructions. Voice features are more adaptive and available to all users with higher usage limits for paid subscribers. Voice supports custom GPTs but currently retains a standard default voice behavior separate from the personalities available in text.
The one-word answer feature is super engaging for me!
OpenAI seems to have noticed that Health questions are a big deal for ChatGPT users.
Share your take on this note:
“It’s been a great year for health AI, both for performance at the frontier and at cost. We’ve gone from 0% (GPT-4o) to 46% (GPT-5 thinking) on HealthBench Hard, a health benchmark built with 250+ doctors.” - Karan Singhal
Health
GPT‑5 is their best model yet for health-related questions, empowering users to be informed about and advocate for their health. The model scores significantly higher than any previous model on HealthBench.
Evaluations
GPT‑5 is much smarter across the board, as reflected by its performance on academic and human-evaluated benchmarks, particularly in math, coding, visual perception, and health.
It sets a new state of the art across math (94.6% on AIME 2025 without tools), real-world coding (74.9% on SWE-bench Verified, 88% on Aider Polyglot), multimodal understanding (84.2% on MMMU), and health (46.2% on HealthBench Hard)—and those gains show up in everyday use.
For the record, Claude Opus 4.1 got 74.5% on SWE-bench Verified. Barely any difference. Meanwhile Claude dominates coding via Claude Code, Cursor, Lovable and, ironically, GitHub Copilot.
In the GPT-5 coding demos, they even talked about it "vibe coding" for folks. It was super weird that they borrowed this term.
Voice in GPT-5
You can use it with video so it sees what you see. Free users can chat for hours. Subscribers can customize the experience with different settings: e.g. one-word answers, concise or more elaborate responses.
The voice in Voice Mode is audibly more natural, fluent and responsive.
Design
You can customize the colors of your chat and the personalities (Supportive, Sarcastic, Professional, etc.). I like these customization and personality quality-of-life features.
Coding
GPT‑5 shows significant gains in benchmarks that test instruction following and agentic tool use, the kinds of capabilities that let it reliably carry out multi-step requests, coordinate across different tools, and adapt to changes in context.
“Expertise on demand, at PhD depth”
According to the talking points of Siya Raj Purohit: “GPT-3 felt like a bright high school student. GPT-4 like a sharp undergrad. GPT-5 works like a panel of doctoral-level experts from different disciplines debating your problem, challenging each other, and converging on the best solution.”
Reasonable API Cost
$1.25 per million input tokens / $10 per million output tokens. See the full pricing list.
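To make the pricing concrete, here is a quick back-of-the-envelope sketch in Python. The rates are the ones quoted above; the token counts are made-up illustration values, and it assumes reasoning tokens are billed as output tokens, as noted in the TL;DR.

# Back-of-the-envelope GPT-5 API cost estimate (illustrative token counts).
INPUT_RATE = 1.25 / 1_000_000    # dollars per input token
OUTPUT_RATE = 10.00 / 1_000_000  # dollars per output token (reasoning tokens billed here too)

def estimate_cost(input_tokens: int, output_tokens: int, reasoning_tokens: int = 0) -> float:
    """Estimate the dollar cost of a single request."""
    return input_tokens * INPUT_RATE + (output_tokens + reasoning_tokens) * OUTPUT_RATE

# Example: a 5k-token prompt, a 1k-token visible answer, plus 4k hidden reasoning tokens.
print(f"${estimate_cost(5_000, 1_000, 4_000):.4f}")  # roughly $0.056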
Showcases
Full disclosure: these read less like actual reviews and more like promotions (the sad reality of today's world):
GPT-5 Hands-On: Welcome to the Stone Age
GPT-5: It Just Does Stuff
GPT-5 and the arc of progress
An exclusive inside look at GPT-5 - Listen on YouTube.
🎨 Interface Personalization: Users, especially on paid tiers, can select accent colors for UI elements like conversation bubbles and highlighted text, helping organize and personalize the visual chat experience.
A lot of the paid features are simply designed to get you to upgrade, maximizing revenue.
OpenAI: No Longer a Frontier Lab 🗺️
Still, GPT-5 is fun for us writers and creators to speculate upon, in part just because we've been waiting for it for so long. While it is anticlimactic and won't remain a frontier model for long, it marks OpenAI as a full-fledged B2C product company and no longer the frontier lab we once knew.
System Card
Go deeper into the models and Unified GPT-5 system.
Unified System Structure: GPT-5 is a cohesive system comprising:
gpt-5-main: A fast, high-throughput model for general queries.
gpt-5-thinking: A deeper reasoning model for complex problems.
Real-time router: Dynamically selects the appropriate model based on query complexity, user intent (e.g., “think hard”), and conversation type, continuously improving through user feedback and performance metrics.
Mini versions (gpt-5-main-mini, gpt-5-thinking-mini) handle queries after usage limits are reached, and a gpt-5-thinking-nano is available for developers via the API.
gpt-5-thinking-pro: Accessible in ChatGPT with parallel test-time compute for enhanced performance.
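OpenAI has not published how the real-time router actually works, so treat the following as a purely hypothetical Python sketch of what "route by complexity and user intent" could look like; the model names are real, but the routing heuristic is invented for illustration.

# Hypothetical routing heuristic; OpenAI's real router is not public.
def route(query: str, usage_exceeded: bool = False) -> str:
    wants_reasoning = any(
        phrase in query.lower() for phrase in ("think hard", "step by step", "prove", "debug")
    )
    looks_complex = len(query.split()) > 150 or "\n" in query  # long or multi-line prompts

    if wants_reasoning or looks_complex:
        return "gpt-5-thinking-mini" if usage_exceeded else "gpt-5-thinking"
    return "gpt-5-main-mini" if usage_exceeded else "gpt-5-main"

print(route("Think hard: why does my SQL query deadlock?"))  # -> gpt-5-thinking
print(route("What's the capital of France?"))                # -> gpt-5-main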
There’s already been a lot of community commentary around how good or bad GPT-5 Thinking is. The system card focuses on gpt-5-main and gpt-5-thinking, with evaluations for other models in the appendix. For full details, see: https://openai.com/index/gpt-5-system-card/
AI Alignment
GPT-5 supposedly has fewer hallucinations and less of a tendency toward sycophancy. I'm sure they tried, as far as this startup is capable of taking trust, safety and alignment seriously with its present leadership and unique history.
notes that OpenAI also released an insane number of guides on how to use GPT-5. Well, of course they did! "But what it (GPT-5) really brings to the table is the fact that it just does things… The burden of using AI is lessened." - Ethan Mollick
To estimate gpt-5-thinking's autonomous software capabilities, METR used the methodology outlined in their recent paper. It paints GPT-5 in a more favorable light than other evaluations do. It's very hard to believe they are as independent as they claim.
GPT-5 Is Here: There's Only One Feature Worth Writing About
Vibe-Check: GPT-5: Our hands-on review of OpenAI's newest model based on weeks of testing
Artificial Analysis, METR and others were chosen to give praise!
What's clear, of course, is that this is not a frontier model that will last very long in today's dynamic global LLM arena. This is more of an upgrade to ChatGPT's interface. Don't be fooled by the hypers, techno-optimists, preachers and evangelists. For a lot of the voices mentioned, it's literally their job to praise OpenAI (maker of ChatGPT, the dominant product right now).
Many of these organizations also have partnerships with OpenAI or investors with connections to them.
Details about METR’s evaluation of OpenAI GPT-5
Is GPT-5 Frontier in Agentic AI?
GPT-5 hot take: nice work but still no humpback whale 😂
Read the comments of GPT-5 on Hacker news.
What does Reddit say about GPT-5? 🤔
Read some comments on Reddit: GPT-5 is a disaster.
I’m sorry but I’m being reasonable
GPT-5 is a massive downgrade.
GPT-5 will be better in a lot of fields.
GPT-5 thoughts? (later in the day these types of comments began to appear on X).
🔴 YouTube GPT-5 Hot takes
OpenAI’s own YouTube Short promo. (OAI)
Surprising developers with GPT-5. (OAI)
Powering Creativity (writing) with GPT-5 (OAI)
Sam Altman interview with Cleo Abram. (about GPT-5)
So reactions are fairly mixed, and it doesn't feel like a step towards AGI. As a baseline, real human comments on Reddit are more negative, while "AI Creators" on YouTube and Substack are more positive.
If anything, it just feels like a more polished ChatGPT product built to sell. OpenAI still has a very important place in the BigAI pantheon (see the list below).
OpenAI is likely to lose its first place in the new era of BigAI companies and labs in 2026.
Who is BigAI?
OpenAI
Google DeepMind
Anthropic
xAI
Meta Superintelligence Lab
Thinking Machines Lab
DeepSeek
Qwen
BigAI, for me, are the leading labs that will likely build the best consumer and enterprise products around AI. I do not consider Apple, Amazon, or Microsoft to be in this group. BigAI labs are likely disruptive for BigTech companies that aren't well diversified, like potentially Meta; Microsoft and Amazon are fairly well-diversified companies.
“It’s beyond a collaborator, it’s almost like a mentor.” (source)
OpenAI created tons of Videos and Blogs about GPT-5
Some of these are fairly speculative and subjective:
Introducing GPT‑5 for developers
GPT-5 for Scientific Discovery (video)
Empowering a Medical Researcher (video)
“Fundamentally it will change the way we do Science”.
But in reality?
GPT-5 is just good enough Technically
Good enough to seem state of the art (SOTA), but not technically a big leap relative to other frontier labs.
Polymarket: Who will be the best model maker by September 2025?
Many point out that Anthropic’s Claude Opus 4.1 is about as good as GPT-5’s best model at coding.
On Polymarket, people aren't even convinced OpenAI has the best models. They are betting Google will be first by the end of August.
This gives you an idea of the relative lack of confidence that machine learning researchers and the public have in OpenAI's technical capabilities heading into the second half of 2025.
Yet they will be making $20 billion in ARR by the end of 2025. That kind of growth is what's getting Silicon Valley excited and has unleashed a very strong techno-optimist lobbying push (even when the models don't do what the company claims, or there is no realistic path forward to AGI, etc.).
For OpenAI, it's clear that a scaling bottleneck was reached that they weren't able to solve very well. We might see diminishing returns even with better GPUs and bigger datacenters. Unless, of course, you are a Chinese lab that was forced to solve important software, reinforcement learning and hardware-efficiency problems.
Lower Sycophancy Behaviors in GPT-5.
In the model card, OpenAI explains how sycophancy and hallucinations have been tackled and reduced.
If you don’t want Hallucinations, use “GPT-5 Thinking”
Health Performance
GPT-5 is noticeably optimized for health questions from users.
Multilingual Performance
ChatGPT is now a lot better in languages like Arabic, Bengali, Chinese and French.
How much better is GPT-5 at Coding?
Conclusion: While GPT-5 is better than previous OpenAI models, it does not seem to be SOTA in coding.
Compare this to Opus 4.1
Not statistically better according to these figures, which are also not the same as the figures OpenAI marketed more publicly.
An MLE Surprise
Developed by the Preparedness team, MLE-bench evaluates an agent’s ability to solve Kaggle challenges involving the design, building, and training of machine learning models on GPUs.
ChatGPT Agent is better than GPT-5! That GPT-5 Thinking is inferior to ChatGPT Agent here is really peculiar.
Learn more about METR's evaluation on Slide 40.
A significant portion of the GPT-5 system card is just about cybersecurity and AI alignment. Very little useful data to help us truly evaluate GPT-5’s capabilities.
Even on Twitter, there was surprisingly little useful commentary on GPT-5, even the night after it was released. Which feels suspicious.
Artificial Analysis
AA is a more popular place to compare models, capabilities and prices of late.
[Artificial Analysis charts: Intelligence and Price]
Anthropic models don’t feature on their leaderboards.
Google is rethinking how we measure AI intelligence (e.g. the Kaggle Game Arena). Kaggle says the leaderboard is coming soon.
View the LMArena. They rank GPT-5 with an August 8th score of 1481.
Since OpenAI no longer releases new models very often, frontier models now hold the lead for only a few weeks to months, sometimes just days.
Grok 4 Beats GPT-5 on GPQA (Scientific Reasoning)
*This may not have been updated in a timely fashion on their website.
GPT-5 Ranks well on Maths
GPT-5 offers four reasoning effort configurations: high, medium, low, and minimal.
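For developers, the effort level is exposed as an API parameter. Here is a minimal sketch, assuming the OpenAI Python SDK's Responses API and its reasoning.effort setting; double-check the parameter names against the current documentation before relying on it.

# Sketch: selecting reasoning effort via the OpenAI Python SDK (Responses API).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "minimal"},  # one of "minimal", "low", "medium", "high"
    input="Summarize the trade-offs of index-only scans in Postgres.",
)
print(response.output_text)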
GPT‑5 not only outperforms previous models on benchmarks and answers questions more quickly, but—most importantly—is more useful for real-world queries.
GPT‑5 shows significant gains in benchmarks that test instruction following and agentic tool use, and multi-step requests (agentic).
GPT-5 sets a new standard with a score of 68 on the Artificial Analysis Intelligence Index (MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, IFBench & AA-LCR) at high reasoning effort.
Conclusion:
There's a lot to unpack here. GPT-5 is an important model for ChatGPT's evolution, but perhaps less so for the generative AI ecosystem as a whole.
For many of us it did not live up to the hype or the six months of waiting since GPT-4.5. It certainly doesn’t feel noticeably closer to what OpenAI refers to as AGI (not to be confused with historical definitions of the same).
The Deep Dive
Read their last piece with us here.
Decoding Discontinuity is a great newsletter to think about macro topics in Generative AI and among leading frontier labs. Decoding Discontinuity examines the architectural vulnerabilities in business models that conventional metrics fail to capture, revealing the structural shifts that determine survival in the generative AI landscape. Learn more.
GPT-5's Reinforcement Learning Gambit: Paradigm Shift or PR in the Agentic Era?
By
(She wrote this before the launch of GPT-5.) In AI, discontinuities aren't incremental improvements; they're threshold moments that redefine entire industries. As I explored in my previous piece, 'AI's New Frontier: Orchestration as the Source of Asymmetric Returns,' the agentic era is defined by orchestration layers that transform commoditized model intelligence into defensible moats and scalable economic value. With GPT-5's launch imminent (reports indicate a roll-out potentially as soon as next week), the critical question isn't whether OpenAI ships another model, but the potential impact of its reinforcement learning (RL) innovations on the AI industry. Is GPT-5 a discontinuity? Can it deliver the reliability breakthrough needed to reclaim market leadership in the age of agentic AI?
The answer will not only determine OpenAI's competitive position but also the architecture of autonomous systems that enterprises trust with their mission-critical workflows. Anticipation is building, with industry sources confirming OpenAI's preparations for a full rollout, including mini and nano versions alongside the flagship model.
Defining Discontinuity: The MCP Standard.
A discontinuity represents a paradigm-shifting breakthrough that unlocks new capabilities, ecosystems, or economics: not just better benchmarks, but systemic change that enables compound growth in adjacent markets. Building on the orchestration framework from my earlier analysis, where I argued that distribution and efficiency now outpace raw model power, consider the impact of MCP: before this protocol, AI agents were brittle parlor tricks; after, they became the scaffolding for billion-dollar productivity tools like Cursor and Windsurf.
The MCP Standard's success in solving the orchestration problem and standardizing how models invoke tools and resources in context, has created what I've termed the 'invisible OS' of the agentic era. This 'invisible OS' allows AI systems to coordinate seamlessly across applications without users being aware of the underlying complexity, thereby enhancing the user experience. The protocol not only improved agent performance but also made agent ecosystems economically viable, as evidenced by Anthropic's 42% code generation market share in Menlo Ventures' 2025 update.
To illustrate the practical leap in code generation, take the "pelican riding a bicycle" SVG benchmark by Simon Willison popularized on Hacker News. This prompt—"Generate an SVG of a pelican riding a bicycle"—tests a model's ability to produce detailed, functional vector graphics code from scratch.
https://simonwillison.net/2025/Jun/6/six-months-in-llms/ [Link towards an AIE presentation]
Early LLMs struggled, often outputting simplistic or broken SVGs; however, advancements in orchestration, such as MCP, have enabled models to handle such tasks with increasing fidelity. For instance, recent Claude iterations generate intricate, rideable pelican designs complete with pedals and feathers, showcasing how standardized tool invocation turns creative prompts into production-ready code. This is discontinuity: not just better outputs, but infrastructure that unlocks new creative and developer workflows and enables compounding innovation rather than linear improvement.
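For readers who want to try the prompt themselves, here is a minimal sketch of how one might run it and save the result, assuming the OpenAI Python SDK; everything beyond Willison's prompt string is illustrative.

# Sketch: running the pelican SVG prompt and saving the output to a file.
from openai import OpenAI

client = OpenAI()
response = client.responses.create(
    model="gpt-5",
    input="Generate an SVG of a pelican riding a bicycle",
)
svg = response.output_text.strip()
# Models sometimes wrap code in markdown fences; drop them so the file renders as SVG.
if svg.startswith("```") and "\n" in svg:
    svg = svg.split("\n", 1)[1].rsplit("```", 1)[0].strip()
with open("pelican.svg", "w") as f:
    f.write(svg)
print("Wrote pelican.svg - open it in a browser to judge the pelican.")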
The metric isn't better scores on academic benchmarks—it's whether the technology unlocks entirely new categories of economic activity. With this framework established, GPT-5's discontinuity potential becomes measurable against concrete criteria rather than marketing hyperbole.
GPT-5's Discontinuity Potential: The Reinforcement Learning Thesis.
Based on recent reports and OpenAI's leaked strategic memo, GPT-5 appears to be positioned as an evolutionary leap with discontinuity potential, hinging on two critical innovations: advanced reinforcement learning and a unified multimodal architecture. These features aim to address core limitations in current LLMs, which are increasingly hitting performance plateaus under the transformer paradigm, where simply scaling data and compute yields diminishing returns; recent research suggests further breakthroughs would be computationally intractable and energy-intensive (up to 100 million times more energy than the human brain).
GPT-5's breakthrough centers on what OpenAI calls a "universal verifier"—an auxiliary AI system that evaluates every output during training, approving only responses that meet rigorous quality standards. As detailed in The Information's July 31, 2025, exposé on OpenAI's "rocky path," this verifier addresses data shortages and slowing gains by extending reinforcement learning beyond verifiable domains, such as code, into general reasoning tasks. This addresses a critical scaling constraint: while competitors hit data quality plateaus, OpenAI's approach creates self-improving systems that compound their capabilities through verified reasoning chains.
RL is a potential game changer here because it shifts from passive learning (memorizing patterns from vast datasets) to active, trial-and-error refinement with rewards for accuracy, allowing the model to iteratively improve without relying solely on more data or compute. This could break through the LLM plateau by enabling longer, more reliable chain-of-thought reasoning—where models learn to decompose problems into verifiable steps, reducing errors and hallucinations that plague current systems.
The technical implications are profound: current agents fail because they can't reliably decompose complex tasks into verifiable sub-steps. If GPT-5's RL system can provide real-time verification of intermediate reasoning states, it crosses the threshold from "impressive but brittle" to "reliable enough to trust with critical workflows." Applying this to the pelican SVG example, RL could enable models to iteratively refine their output, ensuring the bicycle is proportionally balanced and the pelican's posture is anatomically plausible, far surpassing static generation.
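To make the verifier idea tangible, here is a toy, purely conceptual Python sketch of verifier-gated sampling: generate several candidates and keep only those an auxiliary checker approves. The "policy" and "verifier" below are trivial arithmetic stand-ins so the example stays runnable; this is not OpenAI's training stack, and real RL would use the approvals as reward signals rather than simply filtering.

import random

def generate_candidates(a: int, b: int, n: int = 8) -> list[int]:
    """A deliberately noisy stand-in 'policy' that sometimes gets a + b wrong."""
    return [a + b + random.choice([0, 0, 0, 1, -1]) for _ in range(n)]

def verifier(a: int, b: int, answer: int) -> bool:
    """Approves only answers that are actually correct."""
    return answer == a + b

def verified_best(a: int, b: int) -> int | None:
    candidates = generate_candidates(a, b)
    approved = [c for c in candidates if verifier(a, b, c)]
    # In RL training, approved samples would earn reward and reinforce the policy;
    # here we simply return one, or None if every sample failed verification.
    return approved[0] if approved else None

print(verified_best(17, 25))  # usually 42; None if all eight samples were wrong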
Recent leaks from an anonymous source with "reasonable proof" further detail GPT-5's technical edge, including support for dynamic short- and long-term reasoning, Code Interpreter tools, and massive context windows of up to 1 million tokens for input and 100,000 for output, enhancing its ability to handle complex, multi-step code generation tasks.
However, industry skepticism is mounting. X discussions reveal concerns that direct agentic RL may not naturally extend chain-of-thought length or yield breakthrough moments, while Yann LeCun advocates abandoning RL for AGI entirely. More concerning, leaks suggest that o3's reasoning prowess "collapsed" when adapted to chat modes, indicating that the transition from research to production remains challenging. This suggests that while GPT-5's RL gambit aims to break through, it may face architectural limitations, prompting the industry to explore alternatives such as insight-driven models or hybrid paradigms.
GPT-5 will integrate reasoning capabilities with multimodal support for text, images, and voice into a single system, eliminating the need for context switching between specialized models. Multimodality matters because it enables AI to process and generate content across various formats seamlessly (e.g., analyzing an image, transcribing voice input, and outputting reasoned text in one continuous flow), which is essential for real-world agentic tasks like planning a vacation (combining visual search, voice queries, and textual planning).
This unified approach could represent discontinuity if it enables seamless agent orchestration across modalities. OpenAI researcher Noam Brown's suggestion that scale could "wash away" fancy scaffolds implies GPT-5's out-of-the-box performance might replace complex agentic systems entirely. Intriguingly, GPT-5 is reported to support Anthropic's own MCP for external data connections and parallel tool calls, potentially allowing it to interoperate with or even outperform Claude in orchestrated workflows.
The leaked OpenAI memo reveals the strategic logic at play: creating a "super-assistant" with "T-shaped skills" - broad capabilities for daily tasks combined with deep expertise in specific domains. As the memo states, "It's an intelligent entity with T-shaped skills... broad skills for daily tasks that are tedious, and deep expertise for tasks that most people find impossible (starting with coding)."
Yet competitive dynamics paint a sobering picture. OpenAI's market position erosion is stark: from 50% to 25% enterprise share in 18 months, while losing ground in the critical code generation segment (21% vs. Anthropic's 42%). This matters because coders drive agentic adoption—if developers prefer Anthropic's agent-first approach over OpenAI's traditional chat interface, GPT-5 faces an uphill battle regardless of technical superiority.
Recent developments underscore OpenAI's intent to go head-to-head with Claude in code generation. Anthropic revoked OpenAI's access to the Claude API after discovering that OpenAI engineers were using Claude's coding tools—possibly for benchmarking or training GPT-5—highlighting the fierce rivalry in this space, where Claude currently leads in coding benchmarks. Still, GPT-5 is expected to surpass Claude Sonnet 4 in performance. The massive context windows (up to 1 million input tokens) are equally crucial: they allow GPT-5 to maintain coherence over extended interactions, such as processing entire codebases, long documents, or long multi-turn conversations without losing context, which is vital for agentic reliability in production environments where tasks span thousands of steps.
Durable Growth Moat Analysis: The Reinforcement Learning Flywheel for GPT-5 to Rebuild OpenAI's Competitive Moat.
To rebuild OpenAI's competitive moat, GPT-5 must create sustainable advantages that compound over time. The reinforcement learning approach offers two potential moat-building mechanisms, though each faces significant execution risks.
Personalization, Lock-in, and Ecosystem Integration: RL-driven systems that learn from user interactions can create powerful data flywheels.
Unlike traditional models that remain static post-training, GPT-5's ability to improve through reinforcement could enable personalized agents that "know you" better over time, creating switching costs as users invest in teaching their AI assistant their preferences and workflows.
The leaked memo emphasizes this, noting that the super-assistant "knows you, understands what you care about, and helps with any task," with a "fast, efficient data flywheel" at its heart. Simultaneously, GPT-5 will incorporate voice, canvas, search, and deep research into a unified system, potentially creating integration advantages. The leaked memo outlines plans for ChatGPT to become the "primary entry point for the super-assistant," with third-party surfaces like Siri acting as conduits rather than competitors. This dual approach—personalization plus integration—represents OpenAI's bet that proprietary RL infrastructure can overcome Anthropic's head start in agent orchestration.
However, the evidence suggests GPT-5 could create only a partial moat. Anthropic continues to expand MCP adoption, while Google and Meta push open-source alternatives that could commoditize the underlying technology. OpenAI's plan for an open-weights model alongside GPT-5 acknowledges this competitive pressure. The moat will emerge only if GPT-5 demonstrably outperforms in workflows that matter most to enterprise customers—primarily code generation and agent orchestration. OpenAI's leaked memo projects optimism about recapturing market share, but the data suggests they're fighting for position rather than defending dominance.
Ecosystem and Market Implications
If GPT-5 delivers on its potential for discontinuity, the broader implications could reshape the AI industry's structure. A successful launch would accelerate the shift toward agentic computing platforms, providing startups with more reliable orchestration capabilities and potentially unlocking new categories of productivity tools and autonomous systems.
However, this cuts both ways. If GPT-5's unified approach succeeds, it could trigger consolidation in the agent tooling space, with specialized orchestration platforms becoming redundant and concentrating value in OpenAI's stack. Conversely, failure could accelerate the shift toward open-source alternatives, thereby altering the economics of the AI industry.
The defining characteristic of the agentic era—the shift from human-AI collaboration to AI-AI orchestration—will largely be determined by whether this transition occurs through OpenAI's ecosystem or competitors like Anthropic and Google. Enterprise buyers increasingly prioritize platform consolidation and reliable automation over cutting-edge features, making execution more critical than innovation. OpenAI's strategic shift toward open-weight models, alongside proprietary capabilities, represents a recognition that pure closed-source strategies may not be sustainable. The RL capabilities in GPT-5 represent a hedge—even if base model capabilities become commoditized, the reinforcement learning infrastructure and personalization data could remain differentiated.
From a business model perspective, the launch of GPT-5 could significantly impact OpenAI's monetization strategies. As of July 2025, OpenAI has reached $12 billion in annualized revenue, primarily driven by ChatGPT subscriptions and API usage, with over 700 million weekly active users on ChatGPT, according to The Information's recent reporting.
Subscriptions, such as ChatGPT Plus and Team tiers, account for a substantial portion, with enterprise adoption skyrocketing—ChatGPT Enterprise now boasts over 600,000 users. However, cash burn is projected at around $8 billion for 2025, underscoring the need for efficient scaling. Initially, GPT-5 is likely to drive monetization through existing subscribers by offering exclusive access to advanced RL features, boosting retention, upsell to higher tiers (e.g., from Plus to Pro), and increased usage within limits—potentially more than attracting net new subscribers in the short term, as the hype builds awareness but reliability proves value for upgrades.
Over time, if GPT-5 serves as the backbone for agentic workflows, API revenue could surge as enterprises integrate it for mission-critical automation, expanding B2B deals, and creating recurring, high-margin streams beyond consumer subscriptions.
Signal or Noise: The Critical Assessment
GPT-5 represents a definitive test of whether proprietary reinforcement learning can overcome the advantages of an open ecosystem. Three factors will determine whether this represents a genuine discontinuity or sophisticated market positioning:
Developer adoption patterns will provide the most reliable signal. If GPT-5 captures a significant market share in code generation within 90 days of its launch, it indicates a genuine technical advancement. Continued developer preference for Anthropic's MCP-enabled tools would suggest GPT-5's improvements are insufficient to overcome competitive disadvantages.
Reasoning reliability in production environments matters more than academic benchmarks. The test is whether enterprises trust GPT-5 for mission-critical automation—a threshold that requires not just better performance but demonstrably superior reliability in real-world agent tasks.
Economic viability expansion represents the ultimate metric of discontinuity. True breakthrough technology creates new economic opportunities rather than incremental improvements. GPT-5's success should be measured by whether it enables businesses to automate workflows that previously required human oversight, thereby expanding the total addressable market for AI applications.
Conclusion: The Reinforcement Learning Gambit
GPT-5 represents OpenAI's most ambitious bet yet - that reinforcement learning can create the reliable agentic foundation needed to rebuild competitive advantage in an increasingly crowded market. The technical approach shows promise: RL-driven reasoning could address the reliability issues that limit current agent adoption, while a unified multimodal architecture addresses enterprise demands for platform consolidation.
However, discontinuity isn't determined by technical specifications but by market adoption. OpenAI faces the challenge of leapfrogging competitors who have spent the past year building agent-first ecosystems while OpenAI has focused on chat interfaces. The leaked memo reveals a company aware of these challenges, betting that superior RL capabilities and unified architecture can overcome first-mover disadvantages in the agentic space.
For enterprise technology leaders, the launch of GPT-5 will provide crucial insights into whether closed-source innovation can compete with open ecosystems, whether RL represents a sustainable method of differentiation, and whether unified platforms can outcompete specialized tools in the age of agentic computing.
The model that captures the coder ecosystem will define the infrastructure layer of the agentic era—and with it, the next decade of enterprise AI architecture. The answer will emerge not from benchmark scores, but from the choices developers and enterprises make when building the autonomous systems that will define the next phase of AI adoption. In the agentic era, discontinuity belongs to whoever the builders choose to build upon.
What do Readers think about GPT-5? 🔬
Addendum - Here is a Collection of articles on GPT-5 to read:
(In no particular order).
How AI, Healthcare, and Labubu Became the American Economy
GPT-5's Router: how it works and why Frontier Labs are now targeting the Pareto Frontier
GPT-5's Vision Checkup: a frontier VLM, but not a new SOTA
GPT-5, OpenAI now dominates the intelligence per dollar frontier for the first time.
According to
Reality: OpenAI usage plummets as students check out
Via
Is GPT-5 Thinking a Frontier Push Ahead?
OpenAI claims that GPT‑5 (with thinking) performs better than OpenAI o3 with 50-80% fewer output tokens across capabilities, including visual reasoning, agentic coding, and graduate-level scientific problem solving.
Just a week ago, Google announced that Deep Think is now live in the Gemini app.
Is Generative AI a Ponzi Scheme of the Elite?
Are Datacenter build-outs and GPU capex a manifestation of a global concentration of power?
The U.S. economy depends in an outsized way on consumption by the higher socio-economic classes.
GPT-5 on ARC-AGI-1 Test
On August 8th, 2025 there were dozens of very negative responses on GPT-5 in both Reddit and on X.
ChatGPT Plus before vs after the GPT-5 release 😂
Users are not happy: "one model to rule them all" (unity), plus the usage-limit scandal (see the comments on this tweet).
This was pretty much consensus on X, Reddit, etc…
More from Reddit and X on the GPT-5 Launch
ChatGPT users are not happy with GPT-5 launch (read the comments on this)
A bit sad how the GPT-5 launch is going so far…
GPT-5 is the biggest —— even as a paid user. (There is an endless number of these types of comments; I just want to say it's the norm, not the exception, and I guess the media will be all over it.)
Is it just me or is ChatGPT with GPT-5 much slower?
To find articles on GPT-5 on Substack (because Substack search is not very good), type "GPT-5 site:substack.com" into a search engine.