A recent study has revealed that artificial intelligence (AI) systems struggle with two basic skills that most humans handle easily: reading an analogue clock and working out the day of the week on which a given date falls.
Researchers tested various multimodal large language models (LLMs), including Meta’s Llama 3.2-Vision, Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 2.0, and OpenAI’s GPT-4o, on custom datasets of clock and calendar images. In more than half of the cases, the models failed to read the correct time from a clock image or to name the day of the week for a given date.
According to the study’s lead author, Rohit Saxena, this is because AI systems rely on spatial reasoning and on detecting patterns in their training data rather than on running arithmetic algorithms. While AI can excel at tasks for which it has seen abundant examples, it struggles with generalization and abstract reasoning, particularly when faced with rare or unfamiliar cases such as leap years.
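For context (this is not part of the study itself), conventional software answers the calendar question with a short, deterministic calculation rather than by pattern matching. The sketch below is purely illustrative: it uses Zeller’s congruence, a well-known formula for the weekday of any Gregorian date, and checks the result against Python’s built-in calendar arithmetic; the function name and sample dates are our own.

```python
from datetime import date

def zeller_weekday(year: int, month: int, day: int) -> str:
    """Compute the weekday of a Gregorian date using Zeller's congruence."""
    # Zeller's congruence treats January and February as months 13 and 14
    # of the previous year, which keeps leap-day handling purely arithmetic.
    if month < 3:
        month += 12
        year -= 1
    k = year % 100   # year within the century
    j = year // 100  # zero-based century
    h = (day + (13 * (month + 1)) // 5 + k + k // 4 + j // 4 + 5 * j) % 7
    # Zeller's convention: 0 = Saturday, 1 = Sunday, 2 = Monday, ...
    return ["Saturday", "Sunday", "Monday", "Tuesday",
            "Wednesday", "Thursday", "Friday"][h]

# A leap-year date such as 29 February 2024 is handled by the same formula.
for y, m, d in [(2024, 2, 29), (2025, 3, 14)]:
    assert zeller_weekday(y, m, d) == date(y, m, d).strftime("%A")
    print(f"{y}-{m:02d}-{d:02d} falls on a {zeller_weekday(y, m, d)}")
```

An LLM, by contrast, produces its answer by predicting likely text rather than executing a calculation like this, which is why rarely seen cases such as leap days trip it up.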
The study highlights the need for more targeted examples in training data and a rethinking of how AI handles tasks that combine logical and spatial reasoning.
Saxena cautioned that the technology’s limits matter as much as its strengths. “AI is powerful, but when tasks mix perception with precise reasoning, we still need rigorous testing, fallback logic, and in many cases, a human in the loop,” he said.
Source: https://www.livescience.com/technology/artificial-intelligence/ai-models-cant-tell-time-or-read-a-calendar-study-reveals