“Visual” artificial intelligence (AI) is everywhere, from sorting our photos and identifying flowers to steering our cars. Yet these powerful systems do not always “see” the world as we do, and they sometimes behave in unexpected ways. An AI that can distinguish hundreds of car models, for instance, may still fail to register what a car and an airplane have in common: both are large metal vehicles. To bridge this gap, new research introduces a method for reorganizing an AI’s visual representations so that they better reflect human conceptual knowledge, making models more helpful, robust, and reliable. The work, published in Nature, marks a step toward building intuitive and trustworthy AI systems.
Why do AI systems struggle with tasks like identifying the “odd one out”? When you see a cat, your brain forms a mental representation that covers everything from its color to its overall “cat-ness.” AI vision models do something analogous: they map images into a high-dimensional space in which similar items sit close together. To compare human and AI perception, the researchers used the classic “odd-one-out” task from cognitive science, in which participants pick the image that fits least well among three. Sometimes humans and AI agree, for example both selecting a birthday cake as the outlier in a triplet that also contains a tapir and a sheep. In other cases they diverge: where humans consistently choose a starfish as the odd one out of a triplet, AI models often fixate on superficial features such as background color and pick a cat instead.
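To make the odd-one-out setup concrete, here is a minimal sketch of how such a judgment can be read off from a model’s embeddings. The choice of cosine similarity, the embedding dimension, and the random vectors standing in for real image embeddings are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def odd_one_out(embeddings: np.ndarray) -> int:
    """Given a (3, d) array of image embeddings, return the index of the
    predicted odd one out: the item least similar to the other two."""
    # Normalise so that dot products become cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T                # (3, 3) pairwise similarities
    np.fill_diagonal(sim, 0.0)             # ignore self-similarity
    total_sim = sim.sum(axis=1)            # each item's similarity to the other two
    return int(np.argmin(total_sim))       # least similar item is the odd one out

# Hypothetical usage, with random vectors in place of real model embeddings.
triplet = np.random.randn(3, 512)
print(odd_one_out(triplet))
```

When the model’s similarity structure differs from a human’s, the index returned here differs from the human choice, which is exactly the kind of disagreement the starfish-versus-cat example illustrates.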
This misalignment is systematic across a wide range of vision models. A two-dimensional projection of a model’s internal map makes the problem visible: before alignment, representations of categories such as animals, food, and furniture are jumbled together; after applying the new method, the same map shows clearly separated clusters for each category.
To achieve this alignment, the researchers developed a multi-step approach. Cognitive scientists had already compiled the THINGS dataset, which contains millions of human “odd-one-out” judgments, but it covers only a few thousand images, too few to fine-tune powerful models directly without overfitting. The solution has three steps. First, a pretrained vision model (SigLIP-SO400M) is equipped with a small adapter trained on THINGS, yielding a “teacher” model that mimics human judgments while retaining its prior skills. This teacher then generates AligNet, a massive dataset of human-like odd-one-out judgments over a million images. Finally, other AI models, the “students,” are fine-tuned on AligNet, which gives them enough data to restructure their internal maps without overfitting. The result is a transformation from chaos to order, with high-level concepts such as animals and food neatly separated.
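The paper’s full training recipe is more elaborate, but the core teacher-to-student step can be sketched roughly as the distillation loop below. Every name here (teacher, student, the softmax temperature tau, the KL-divergence objective) is a placeholder assumption rather than the authors’ actual implementation.

```python
import torch
import torch.nn.functional as F

def triplet_logits(model, images):
    """Score each item in a (batch, 3, C, H, W) triplet as the odd one out:
    the higher an item's logit, the less similar it is to the other two."""
    b, n = images.shape[:2]
    emb = model(images.flatten(0, 1)).view(b, n, -1)       # (b, 3, d) embeddings
    emb = F.normalize(emb, dim=-1)
    sim = emb @ emb.transpose(1, 2)                         # (b, 3, 3) similarities
    sim = sim - torch.eye(n, device=sim.device) * sim       # zero the diagonal
    return -sim.sum(dim=-1)                                 # low similarity -> high logit

def distillation_step(teacher, student, images, optimizer, tau=1.0):
    """One step of fitting the student's odd-one-out distribution to the
    frozen teacher's soft judgments on a batch of image triplets."""
    with torch.no_grad():
        target = F.softmax(triplet_logits(teacher, images) / tau, dim=-1)
    log_pred = F.log_softmax(triplet_logits(student, images) / tau, dim=-1)
    loss = F.kl_div(log_pred, target, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The essential design point survives the simplification: the teacher is frozen and only supplies soft odd-one-out judgments over a huge pool of images, so the students inherit human-like similarity structure without ever being trained on the small THINGS dataset directly.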
This reorganization mirrors the hierarchical structure of human knowledge. During alignment, representations shift according to their “conceptual distance”: similar items, such as two dogs, move closer together, while dissimilar ones, such as an owl and a truck, move farther apart. A line graph in the paper tracks these changes, showing how relative distances adjust to match human perception.
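One simple way to quantify the kind of shift described above is to compare pairwise distances in representation space before and after alignment. The sketch below is an illustrative calculation under that assumption, not the paper’s analysis code.

```python
import numpy as np

def distance_shift(before: np.ndarray, after: np.ndarray) -> np.ndarray:
    """Change in pairwise Euclidean distance for each pair of items,
    given (n, d) embeddings before and after alignment.
    Negative entries mean the pair moved closer together."""
    def pdist(x):
        diff = x[:, None, :] - x[None, :, :]
        return np.linalg.norm(diff, axis=-1)
    return pdist(after) - pdist(before)
```

Plotting such shifts against a human-derived measure of conceptual distance would reproduce the trend described above: conceptually similar pairs show negative shifts (moving closer) while dissimilar pairs show positive ones (moving apart).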
Testing the aligned models on cognitive science tasks, including multi-arrangement and a new odd-one-out dataset called Levels, reveals dramatic improvements. The aligned models agree with human judgments far more often, and they even show “human-like” uncertainty: the less certain a model is about a triplet, the longer humans tend to take to decide on the same triplet. Alignment also improves performance on standard machine-learning tasks, including few-shot learning (mastering a new category from as little as one example) and handling distribution shifts (coping with images that differ from the training data). Bar graphs in the paper show that the aligned models outperform the original ones on both the cognitive and the machine-learning challenges.
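The correlation between model uncertainty and human response time mentioned above can be computed along the following lines; choice_probs and reaction_times are hypothetical arrays, and Spearman rank correlation is an assumed choice of statistic rather than one confirmed by the paper.

```python
import numpy as np
from scipy.stats import entropy, spearmanr

def uncertainty_rt_correlation(choice_probs: np.ndarray,
                               reaction_times: np.ndarray) -> float:
    """Rank correlation between a model's decision entropy on each triplet
    (choice_probs: shape (n_triplets, 3)) and the time humans took to
    answer the same triplets (reaction_times: shape (n_triplets,))."""
    model_uncertainty = entropy(choice_probs, axis=1)   # one entropy per triplet
    rho, _ = spearmanr(model_uncertainty, reaction_times)
    return rho
```

A positive correlation here is what the article describes as “human-like” uncertainty: triplets the model finds ambiguous are also the ones people deliberate over longest.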
In conclusion, this research offers a pathway to more human-aligned and reliable AI models. By reorganizing how models structure visual information, the method not only improves agreement with human judgments but also boosts robustness on standard tasks. Further work is still needed, but this represents a significant stride toward AI systems that see the world more the way we do.