OpenAI's o3 model is now more affordable and efficient than ever, making it a realistic option for daily coding tasks. In June, OpenAI cut the price of its flagship reasoning model by roughly 80%: input tokens dropped from $10 to $2 per million, with output now priced at $8 per million. API resellers have adjusted their pricing in response; Cursor, for example, now counts one o3 request as equivalent to a GPT-4o call.
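To see what the cut means in practice, here is a back-of-the-envelope cost comparison in Python. The new $2/$8 per-million rates come from the article; the old $40-per-million output rate is an assumption inferred from the roughly 80% reduction (the article only quotes the old $10 input rate), and the token counts are illustrative.

```python
# Back-of-the-envelope cost comparison for a typical coding request.
# New rates ($2/M in, $8/M out) are from the article; the old output rate
# of $40/M is an assumption inferred from the ~80% cut.

OLD_RATES = {"input": 10.00, "output": 40.00}   # USD per million tokens (output assumed)
NEW_RATES = {"input": 2.00, "output": 8.00}     # USD per million tokens

def request_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Cost in USD of a single request at the given per-million-token rates."""
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

# Example: a 20k-token prompt (a large file plus context) with a 3k-token reply.
old = request_cost(20_000, 3_000, OLD_RATES)
new = request_cost(20_000, 3_000, NEW_RATES)
print(f"before: ${old:.3f}  after: ${new:.3f}  savings: {1 - new / old:.0%}")
# before: $0.320  after: $0.064  savings: 80%
```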
Latency has improved alongside the new pricing, with third-party dashboards reporting a time-to-first-token (TTFT) of 15–20 seconds on long prompts. o3 is still slower than lightweight models, but its recent upgrades make it feel snappier in real-world use.
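If you want to check TTFT against your own prompts rather than trust the dashboards, a minimal probe with the OpenAI Python SDK's streaming chat interface looks something like the sketch below. It assumes `OPENAI_API_KEY` is set, that your account can stream from the "o3" model, and that the example prompt is a stand-in for your real workload.

```python
# Minimal TTFT probe using the OpenAI Python SDK's streaming chat interface.
# Assumes OPENAI_API_KEY is set and your account has streaming access to "o3".
import time
from openai import OpenAI

client = OpenAI()

def time_to_first_token(prompt: str, model: str = "o3") -> float:
    """Return seconds elapsed until the first content chunk arrives."""
    start = time.monotonic()
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Skip empty/housekeeping chunks; stop timing at the first real token.
        if chunk.choices and chunk.choices[0].delta.content:
            return time.monotonic() - start
    return float("nan")  # stream ended without any content

if __name__ == "__main__":
    print(f"TTFT: {time_to_first_token('Refactor this function to remove duplication.'):.1f}s")
```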
Claude 4, meanwhile, has drawn attention as a fast but sloppy model: its quick responses and large context window are appealing, but users are finding that the speed often comes at the cost of follow-through, with the model inventing stubbed functions instead of real implementations.
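For readers who have not hit this failure mode, here is a contrived illustration of the difference; the function names and the merge task are hypothetical, not taken from any reported Claude output.

```python
# Contrived illustration of a "stubbed" answer versus a real implementation.

# What a stubbed response tends to look like: plausible signature, no behavior.
def merge_configs(base: dict, override: dict) -> dict:
    # TODO: implement merge logic
    pass

# A real implementation of the same request: recursive merge where override wins.
def merge_configs_real(base: dict, override: dict) -> dict:
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs_real(merged[key], value)
        else:
            merged[key] = value
    return merged
```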
o3, on the other hand, works more deliberately and produces code that actually compiles, and the new pricing finally puts that reliability within everyday reach. The price cut is subsidized in part by hardware leaps and capital strategies, and rival models such as BitNet b1.58 and Qwen3-235B-A22B are gaining traction as well.
To make the most of o3, users should promote it to their main coder and planner, retain a lightweight fallback model, tame tool mania, prompt economically, watch for latency spikes, and consider alternatives to heavyweights like Cursor and Windsurf. At its new price, o3 is an attractive option for developers who want reliable coding assistance without breaking the bank.
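The "main coder and planner plus lightweight fallback" advice can be wired up with a simple router. The sketch below is one way to do it; the fallback model name, the task categories, and the token threshold are illustrative assumptions, not recommendations from the article.

```python
# Sketch of the "o3 as main coder/planner, lightweight fallback" setup.
# Model names, task categories, and the token threshold are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

HEAVY_MODEL = "o3"           # planning, multi-file refactors, tricky bugs
LIGHT_MODEL = "gpt-4o-mini"  # renames, docstrings, quick one-liners (assumed fallback)

def pick_model(task: str, prompt_tokens: int) -> str:
    """Route planning and refactoring work to o3; keep trivial edits on the cheap model."""
    if task in {"plan", "refactor", "debug"} or prompt_tokens > 4_000:
        return HEAVY_MODEL
    return LIGHT_MODEL

def ask(task: str, prompt: str) -> str:
    model = pick_model(task, prompt_tokens=len(prompt) // 4)  # rough token estimate
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example: ask("plan", "...") goes to o3, while ask("rename", "...") stays on the fallback.
```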
Source: https://www.infoworld.com/article/4008535/openais-o3-price-plunge-changes-everything-for-vibe-coders.html