OpenAI has released its most advanced AI model yet, called o1, with limited access available to paying users. The company touts the model’s “complex reasoning” capabilities, but a recent demo meant to showcase those strengths instead revealed output that was almost useless.
The demo asked the model for instructions on building a wooden birdhouse. The AI provided some general guidance, including dimensions and a list of materials, but the instructions were fragmented and unclear, overlooking basic details such as the need for a hammer or a hinge.
James Filus, director of the Institute of Carpenters, expressed skepticism about the model’s effectiveness in real-world applications. “You would know just as much about building the birdhouse from the image as you would the text, which kind of defeats the whole purpose of the AI tool,” he said.
This is not an isolated incident. Last year, a Google advert mistakenly claimed that the James Webb Space Telescope had made a discovery it hadn’t, and more recently, an updated Google search tool advised users to eat rocks. These examples highlight the persistent gap between how AI tools are marketed and how they perform in real-world applications.
o1’s approach differs from ChatGPT’s immediate-response mechanism. Rather than answering right away, o1 uses a “chain of thought” technique, taking extra time to work through intermediate reasoning steps before producing an answer. While this approach has yielded accurate results on complex questions, the birdhouse demo shows it does not prevent basic logical errors.
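For readers who want to see the distinction concretely, here is a minimal sketch contrasting the two interaction styles using the OpenAI Python SDK. The model names and the birdhouse prompt are illustrative assumptions, not taken from the demo; the relevant difference is that the o1-series call spends time on hidden reasoning steps before replying, while a conventional chat model answers immediately.

```python
# Minimal sketch: an immediate-response model vs. a "chain of thought"
# o1-series model. Requires the openai package and an OPENAI_API_KEY
# environment variable; the model names here are illustrative.
from openai import OpenAI

client = OpenAI()

prompt = "What materials and tools do I need to build a wooden birdhouse with a hinged roof?"

# A conventional chat model returns its answer in a single pass.
fast = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

# An o1-series model first generates hidden "reasoning tokens" to work
# through the problem step by step, so the call takes noticeably longer
# to return; only the final answer is surfaced to the caller.
deliberate = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": prompt}],
)

print(fast.choices[0].message.content)
print(deliberate.choices[0].message.content)
```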
The model’s reported performance on PhD-level science questions is impressive, with an accuracy rate of 78%. But its failure in the demo to deliver on its promised complex-reasoning capabilities raises concerns about its reliability and usability.
Source: https://time.com/7200289/openai-demo-shows-reasoning-ai-making-basic-errors