GPT-5.5 Arrives In ChatGPT And Codex

4/24/2026
OpenAI introduced GPT-5.5 on April 23, 2026, describing it as a new class of intelligence for real work. The company says the model is designed to understand tasks earlier, use tools more effectively, check its own work, and keep going until a complex job is complete. In practical terms, OpenAI positions GPT-5.5 around work that moves across coding, online research, data analysis, documents, spreadsheets, software operation, and tool-based workflows. Instead of requiring users to manage every step, the model is presented as better suited to messy, multi-part tasks that need planning and execution over time.
A major focus of the release is agentic coding. OpenAI says GPT-5.5 is its strongest agentic coding model so far. On Terminal-Bench 2.0, which measures complex command-line workflows involving planning, iteration, and tool coordination, the model reached 82.7% accuracy. On SWE-Bench Pro, which evaluates real-world GitHub issue resolution, GPT-5.5 reached 58.6%. It also outperformed GPT-5.4 on Expert-SWE, OpenAI’s internal evaluation for long-horizon coding tasks with a median estimated human completion time of 20 hours. In Codex, the model is being positioned for engineering work that includes implementation, refactoring, debugging, testing, and validation.
OpenAI links these gains to behaviors that matter in real engineering work. According to the announcement, GPT-5.5 is better at holding context across large systems, reasoning through ambiguous failures, using tools to check assumptions, and carrying changes through the surrounding codebase. The company also says GPT-5.5 improves on GPT-5.4 across the three coding evaluations while using fewer tokens. That efficiency is part of the broader message of the launch: OpenAI says GPT-5.5 matches GPT-5.4 per-token latency in real-world serving while performing at a higher intelligence level.
The model is also being introduced as a tool for everyday knowledge work on computers. OpenAI says GPT-5.5 moves more naturally through the full loop of finding information, understanding what matters, using tools, checking output, and turning raw material into usable work. In the published evaluation tables, GPT-5.5 scored 84.9% on GDPval, 78.7% on OSWorld-Verified, and 98.0% on Tau2-bench Telecom without prompt tuning. It also reached 60.0% on FinanceAgent, 88.5% on internal investment-banking modeling tasks, and 54.1% on OfficeQA Pro. In Codex, OpenAI says the model is stronger than GPT-5.4 at generating documents, spreadsheets, and slide presentations.
GPT-5.5 Pro is positioned for harder and higher-accuracy work in ChatGPT. OpenAI says early testers found its responses more comprehensive, better structured, more accurate, more relevant, and more useful compared with GPT-5.4 Pro, with especially strong feedback in business, legal, education, and data science.
GPT-5.5 Thinking is rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT. GPT-5.5 Pro is rolling out to Pro, Business, and Enterprise users. In Codex, GPT-5.5 is available for Plus, Pro, Business, Enterprise, Edu, and Go plans with a 400K context window. Fast mode is also available, generating tokens 1.5 times faster at 2.5 times the cost.
Scientific and technical research is another major section of the release. OpenAI says GPT-5.5 is better at persisting through research workflows that require exploring an idea, gathering evidence, testing assumptions, interpreting results, and deciding what to try next. The company reports a clear improvement over GPT-5.4 on GeneBench, which focuses on multi-stage scientific data analysis in genetics and quantitative biology. On BixBench, a benchmark centered on real-world bioinformatics and data analysis, OpenAI says GPT-5.5 achieved leading performance among models with published scores.
The announcement also describes an internal GPT-5.5 version with a custom harness helping discover a new proof related to Ramsey numbers, later verified in Lean.
OpenAI is also emphasizing safety and staged deployment. The company says GPT-5.5 was evaluated through its safety and preparedness frameworks, tested with internal and external red-teamers, and given targeted evaluations for advanced cybersecurity and biology capabilities. OpenAI says it is treating GPT-5.5’s biological, chemical, and cybersecurity capabilities as High under its Preparedness Framework, while noting that the model did not reach Critical cybersecurity capability.
API availability is not live yet, but OpenAI says GPT-5.5 and GPT-5.5 Pro will come to the API soon. Planned API pricing lists gpt-5.5 at $5 per 1 million input tokens and $30 per 1 million output tokens, while gpt-5.5-pro is planned at $30 per 1 million input tokens and $180 per 1 million output tokens.
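To make the planned pricing concrete, the arithmetic can be sketched in a few lines of Python. This is a rough illustration only: the prices are the planned figures quoted above and may change before the API launches, and the function name, the price table layout, and the example token counts are hypothetical, not part of any OpenAI SDK.

```python
# Planned prices quoted in the announcement, in US dollars per 1 million tokens.
# These are pre-launch figures and may change (assumption: no discounts or
# other pricing tiers are applied).
PLANNED_PRICES = {
    "gpt-5.5": {"input": 5.00, "output": 30.00},
    "gpt-5.5-pro": {"input": 30.00, "output": 180.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one request.

    Hypothetical helper for illustration; it is not an OpenAI API call.
    """
    prices = PLANNED_PRICES[model]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: 200,000 input tokens and 20,000 output tokens.
    for name in PLANNED_PRICES:
        print(f"{name}: ${estimate_cost(name, 200_000, 20_000):.2f}")
```

For that illustrative workload, the sketch gives $1.60 for gpt-5.5 and $9.60 for gpt-5.5-pro, which reflects the sixfold price difference between the two models on both input and output tokens.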