Model Watch · Reviewed

OpenAI GPT-5.2

Announced Dec 11, 2025Released Dec 11, 2025Reviewed Jun 23, 2026

What they claimed

OpenAI positioned GPT-5.2 as its most advanced frontier model for professional work and long-running agents — better at spreadsheets, presentations, coding, image perception, long-context understanding, and complex multi-step projects, with a 400,000-token context window and an August 31, 2025 knowledge cutoff. The standout claim: GPT-5.2 Thinking is OpenAI's first model to perform at or above human expert level on GDPval, a benchmark of well-specified knowledge work across 44 occupations, beating or tying industry professionals on 70.9% of comparisons. It arrived less than a month after GPT-5.1 (November 12, 2025) and amid an internal 'code red' to counter Google's Gemini 3.

What shipped

GPT-5.2 launched December 11, 2025 in three variants — Instant, Thinking, and Pro — rolling out in ChatGPT starting with paid plans and available to all developers in the API on day one.

The verdict

This is a very recent release, so the verdict is provisional. The competitive backdrop is the most telling fact: GPT-5.2 shipped fast, on the heels of GPT-5.1, under a self-described 'code red' triggered by Google's Gemini 3 — a cadence that signals defensive urgency as much as a planned roadmap. The GDPval 'expert-level' claim is OpenAI's own benchmark and should be read as a marketing framing until independent evaluations and real-world enterprise use confirm it; 'ties or beats professionals 70.9% of the time' on curated tasks is not the same as reliably doing your staff's jobs. The genuinely useful signals for executives are the boring ones — a 400K context window, three latency/quality tiers, and same-day API access — which make it deployable now. Treat the headline as a hypothesis and pilot before you believe it.

Why it matters

GPT-5.2 shows the frontier has become a rapid, head-to-head race with Google where vendors ship on competitive reflex, so executives should anchor on independently verified capability and fit, not launch-day 'expert-level' claims.

Sources

All tracked models