OpenAI has announced new GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano models for its API. Now available to developers, these models outperform GPT‑4o and GPT‑4o mini across the board.
In particular, the models are said to provide major gains in coding and instruction following. All three also get a significant context window boost, now handling up to 1 million tokens, and OpenAI claims they're better at actually using that larger context effectively. Their knowledge base has been updated through June 2024.
Looking at the specifics for the main GPT-4.1 model, OpenAI shared some benchmark numbers. They reported a 54.6% score on SWE-bench Verified for coding, calling it a "21.4%abs improvement over GPT-4o." For instruction following, measured by Scale's MultiChallenge benchmark, it hit 38.3%, a gain of 10.5% absolute over GPT-4o. The model apparently also set a new high score for long-context understanding on the Video-MME benchmark.
While the benchmark numbers give some idea, OpenAI also stressed that these models were built with practical use cases in mind, shaped by developer feedback. The aim was to deliver this better performance while also cutting down on cost and latency.
The GPT-4.1 mini variant is positioned as a big step for smaller models, supposedly beating even the full-size GPT-4o in some tests. OpenAI claims it delivers similar or better intelligence scores than GPT-4o while being almost twice as fast and costing 83% less to run.
The third new model, GPT-4.1 nano, targets situations where speed and low cost matter most. OpenAI calls it their fastest and cheapest option, yet it still handles the same 1 million token context window as the bigger models. Despite its smaller footprint, OpenAI reports good scores on evaluations like MMLU and GPQA, claiming it even beats GPT-4o mini in some respects. This makes it a solid candidate for tasks needing quick turnaround, like text classification or driving autocompletion features.
These upgrades in reliability and handling long context are also meant to make the GPT-4.1 models better suited for building AI agents.
One important detail: these new GPT-4.1 models are API-only for now. OpenAI did say that many of the underlying improvements have already been trickling into the version of GPT-4o used in ChatGPT, and more updates will follow.
Alongside this news, OpenAI is starting the process of retiring the GPT-4.5 Preview model from its API. This is because GPT-4.1 generally matches or beats GPT-4.5 Preview capabilities at a better price and speed. Developers have until July 14, 2025 to switch over.
More details and benchmarks on these 4.1 models can be found in the full announcement linked below...