News

The FrontierMath benchmark from Epoch AI tests generative models on difficult math problems. Find out how OpenAI’s o3 and ...
A discrepancy between first- and third-party benchmark results for OpenAI's o3 AI model is raising questions about the ...
Learn how OpenAI's o3 and o4 models are setting new standards in generative AI, empowering businesses, developers, and ...
Historically, each new generation of OpenAI's models has delivered incremental improvements in factual accuracy, with ...
OpenAI’s o3 model shows inflated benchmark results; real-world tests reflect performance far below initial FrontierMath ...
By OpenAI's own testing, its newest reasoning models, o3 and o4-mini, hallucinate significantly more than o1.
The jump is so steep that it may be causing some to think that AI has become Skynet. According to a new EduBirdie survey, 25% ...
OpenAI's latest AI models tend to make things up — or "hallucinate" — substantially more than earlier versions.
OpenAI’s newest AI model, o3, is at the center of a growing controversy after third-party tests revealed performance significantly lower than the ...
OpenAI's new AI models are hallucinating more than their predecessor, according to an internal testing report released by the ...
A word to the wise: be careful about the images you post on social media. OpenAI's latest AI models, released last week, have ...
OpenAI released its latest o3 and o4-mini models last week, which can "reason" through uploaded images. This means they can ...