A recent study reveals that AI's portrayal of Neanderthals is outdated and inaccurate, highlighting a broader problem in how artificial intelligence handles historical knowledge. The research, led by Matthew Magnani and Jon Clindaniel, examines whether AI systems reflect modern science or outdated ideas when asked to depict ancient daily life.
The study, published in the journal Advances in Archaeological Practice, uses Neanderthals as a test case because their portrayal in scientific literature has shifted so dramatically over time. Early depictions often showed them as hunched and primitive, while more recent research has revealed their cultural sophistication and physical diversity. This evolution in understanding makes Neanderthals an ideal subject for testing AI's ability to keep pace with changing scientific knowledge.
To conduct the study, Magnani and Clindaniel used two popular AI systems: DALL-E 3 for generating images and ChatGPT with the GPT-3.5 model for written text. They created pairs of prompts: some generic requests about Neanderthal life, and others informed by expert archaeological knowledge.
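The paired-prompt design can be sketched in a few lines. This is an illustrative reconstruction, not the authors' actual code: the prompt wording, condition names, and activity list below are all hypothetical, and the comment about API calls names the standard OpenAI client methods rather than anything from the study.

```python
# Hypothetical sketch of a paired-prompt design like the one described in the
# study. All prompt text and activity choices here are illustrative.

NAIVE = "Depict a Neanderthal {activity}."
INFORMED = (
    "Depict a Neanderthal {activity}, consistent with current archaeology: "
    "upright posture, light body hair, individuals of varied ages and sexes, "
    "and only technologies attested in the Middle Palaeolithic record."
)

ACTIVITIES = ["preparing food", "making stone tools", "gathering around a hearth"]

def build_prompts(activities):
    """Return (condition, prompt) pairs for both prompt styles."""
    pairs = []
    for act in activities:
        pairs.append(("naive", NAIVE.format(activity=act)))
        pairs.append(("informed", INFORMED.format(activity=act)))
    return pairs

prompts = build_prompts(ACTIVITIES)
# Each prompt could then be sent to DALL-E 3 or GPT-3.5, e.g. via the OpenAI
# client's images.generate and chat.completions.create methods.
```

Holding everything constant except the scientific framing lets the generic condition expose what the model produces by default, while the informed condition tests whether it can follow current evidence when explicitly asked.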
The results were striking. AI-generated images often depicted Neanderthals with heavily hunched postures, thick body hair, and ape-like features, reflecting scientific ideas discarded over a century ago. These images also lacked women and children, featuring almost exclusively muscular adult males. The written descriptions fell short as well, with about half failing to align with modern scholarly understanding.
Both the images and text also mixed timelines, pairing primitive bodies with technologies Neanderthals did not possess, such as basketry, ladders, glass, metal tools, and thatched roofs.
By comparing AI output with decades of archaeological writing, the researchers found that ChatGPT's text aligned most closely with early 1960s scholarship, while DALL-E 3's images matched late 1980s and early 1990s work.
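One crude way to date generated text against the scholarly record is to match it against vocabularies characteristic of different eras of research. The sketch below is purely illustrative and is not the authors' method; the trait lists are invented stand-ins for the kind of decade-indexed coding such a comparison would require.

```python
# Illustrative sketch only: match generated text to the era of scholarship it
# most resembles, using invented decade-indexed trait vocabularies.
import re

DECADE_TRAITS = {
    1960: {"hunched", "brutish", "ape-like", "primitive"},
    1990: {"robust", "muscular", "stocky", "hide-clad"},
    2020: {"upright", "symbolic", "diverse", "sophisticated"},
}

def closest_decade(text):
    """Return the decade whose trait vocabulary overlaps the text most."""
    words = set(re.findall(r"[a-z-]+", text.lower()))
    scores = {d: len(words & traits) for d, traits in DECADE_TRAITS.items()}
    return max(scores, key=scores.get)

closest_decade("a hunched, primitive, ape-like figure")  # -> 1960
```

A real analysis would rest on systematic coding of published descriptions rather than keyword lists, but the principle is the same: score the generated output against each era's characteristic traits and report the best match.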
The study highlights a significant challenge: current scientific research is often locked behind paywalls and copyright restrictions, while older, superseded work circulates freely online. That imbalance makes outdated ideas more accessible to AI systems, which consequently lean on them rather than on current research.
The implications of this research extend beyond archaeology and anthropology. Generative AI is transforming how images, writing, and sound are created and trusted, empowering individuals without formal training to explore history and science. However, it also risks spreading old stereotypes and errors on a massive scale.
In archaeology and anthropology, public understanding is often shaped by images and stories. If these representations are inaccurate, misconceptions can become deeply ingrained. Neanderthals are just one example, but the same risks apply to various cultures and periods.
The study provides a template for researchers to examine the gap between scholarship and AI-generated content, emphasizing the importance of open access research to ensure AI reflects current knowledge. Additionally, it underscores the need for caution when using AI tools, especially in education and science communication, to prevent the distortion of learning.
The research also offers a transferable method for testing AI accuracy in other fields. As generative AI becomes more prevalent, audits like this one will be crucial for maintaining the integrity of information.