Related Articles
Visual cognition in multimodal large language models
A chief goal of artificial intelligence is to build machines that think like people. Yet it has been argued that deep neural network architectures fail to accomplish this, with researchers pointing to these models’ limitations in the domains of causal reasoning, intuitive physics and intuitive psychology. However, recent advancements, namely the rise of large language models and particularly those designed for visual processing, have rekindled interest in their potential to emulate human-like cognitive abilities. This paper evaluates the current state of vision-based large language models in the domains of intuitive physics, causal reasoning and intuitive psychology. Through a series of controlled experiments, we investigate the extent to which these modern models grasp complex physical interactions, causal relationships and intuitive understanding of others’ preferences. Our findings reveal that, while some of these models demonstrate a notable proficiency in processing and interpreting visual data, they still fall short of human capabilities in these areas. Our results emphasize the need for integrating more robust mechanisms for understanding causality, physical dynamics and social cognition into modern-day, vision-based language models, and point out the importance of cognitively inspired benchmarks.
Semantic embeddings reveal and address taxonomic incommensurability in psychological measurement
Taxonomic incommensurability denotes the difficulty in comparing scientific theories due to different uses of concepts and operationalizations. To tackle this problem in psychology, here we use language models to obtain semantic embeddings representing psychometric items, scales and construct labels in a vector space. This approach allows us to analyse different datasets (for example, the International Personality Item Pool) spanning thousands of items and hundreds of scales and constructs and show that embeddings can be used to predict empirical relations between measures, automatically detect taxonomic fallacies and suggest more parsimonious taxonomies. These findings suggest that semantic embeddings constitute a powerful tool for tackling taxonomic incommensurability in the psychological sciences.
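The core idea above, comparing psychometric items by the geometry of their embeddings, can be illustrated with a minimal sketch. The `embed` function below is a hypothetical stand-in (a deterministic hashed bag-of-words average), not the language models the abstract refers to; only the cosine-similarity comparison between item vectors reflects the approach described.

```python
import hashlib
import numpy as np

def embed(text, dim=64):
    """Toy stand-in for a sentence-embedding model: hash each token to a
    fixed pseudo-random vector and average them. Deterministic and
    self-contained; items sharing words end up with similar vectors."""
    tokens = text.lower().split()
    vec = np.zeros(dim)
    for tok in tokens:
        seed = int(hashlib.md5(tok.encode()).hexdigest()[:8], 16)
        vec += np.random.default_rng(seed).standard_normal(dim)
    return vec / max(len(tokens), 1)

def cosine(a, b):
    """Cosine similarity, the usual measure of closeness between
    two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two extraversion-flavoured items versus an anxiety-flavoured item
# (hypothetical items, for illustration only):
item_a = embed("I am the life of the party")
item_b = embed("I am often the life of the party")
item_c = embed("I worry about things that might go wrong")

sim_related = cosine(item_a, item_b)
sim_unrelated = cosine(item_a, item_c)
```

With a real embedding model in place of `embed`, the same pairwise similarities can be computed across thousands of items and scales, which is what makes taxonomy-wide comparisons tractable.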
A manifesto for a globally diverse, equitable, and inclusive open science
The field of psychology has rapidly transformed its open science practices in recent years. Yet there has been limited progress in integrating principles of diversity, equity and inclusion. In this Perspective, we raise the spectre of Questionable Generalisability Practices and the issue of MASKing (Making Assumptions based on Skewed Knowledge), calling for more responsible practices in generalising study findings and co-authorship to promote global equity in knowledge production. To drive change, researchers must target all four key components of the research process: design, reporting, generalisation and evaluation. Additionally, macro-level geopolitical factors must be considered to move towards a robust behavioural science that is truly inclusive, representing the voices and experiences of the majority world (i.e., low- and middle-income countries).
Preserving and combining knowledge in robotic lifelong reinforcement learning
Humans can continually accumulate knowledge and develop increasingly complex behaviours and skills throughout their lives, a capability known as ‘lifelong learning’. Although this lifelong learning capability is considered an essential mechanism of general intelligence, recent artificial intelligence systems predominantly excel in narrow, specialized domains and generally lack it. Here we introduce a robotic lifelong reinforcement learning framework that addresses this gap by developing a knowledge space inspired by the Bayesian non-parametric domain. In addition, we enhance the agent’s semantic understanding of tasks by integrating language embeddings into the framework. Our proposed embodied agent can consistently accumulate knowledge from a continuous stream of one-time feeding tasks. Furthermore, our agent can tackle challenging real-world long-horizon tasks by combining and reapplying its acquired knowledge from the original task stream. The proposed framework advances our understanding of the robotic lifelong learning process and may inspire the development of more broadly applicable intelligence.
Two types of motifs enhance human recall and generalization of long sequences
Whether it is listening to a piece of music, learning a new language, or solving a mathematical equation, people often acquire abstract notions in the sense of motifs and variables—manifested in musical themes, grammatical categories, or mathematical symbols. How do we create abstract representations of sequences? Are these abstract representations useful for memory recall? In addition to learning transition probabilities, chunking, and tracking ordinal positions, we propose that humans also use abstractions to arrive at efficient representations of sequences. We propose and study two abstraction categories: projectional motifs and variable motifs. Projectional motifs find a common theme underlying distinct sequence instances. Variable motifs contain symbols representing sequence entities that can change. In two sequence recall experiments, we train participants to remember sequences with projectional and variable motifs, respectively, and examine whether motif training benefits the recall of novel sequences sharing the same motif. Our results suggest that training on projectional and variable motifs improves transfer recall accuracy, relative to control groups. We show that a model that chunks sequences in an abstract motif space may learn and transfer more efficiently, compared to models that learn chunks or associations on a superficial level. Our study suggests that humans construct efficient sequential memory representations according to the two types of abstraction we propose, and creating these abstractions benefits learning and out-of-distribution generalization. Our study paves the way for a deeper understanding of human abstraction learning and generalization.
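The notion of a variable motif can be made concrete with a small sketch. This is a hypothetical illustration, not the authors’ model: a motif is a template in which `?`-prefixed tokens are variables that may bind to any symbol, but every occurrence of the same variable must bind consistently within one sequence.

```python
def matches(motif, seq):
    """Return True if seq instantiates motif. Tokens starting with '?'
    are variables: each may bind to any symbol, but repeated occurrences
    of the same variable must bind to the same symbol."""
    if len(motif) != len(seq):
        return False
    binding = {}
    for m, s in zip(motif, seq):
        if m.startswith("?"):
            # setdefault stores the first binding and returns any
            # existing one, so mismatched re-bindings are rejected.
            if binding.setdefault(m, s) != s:
                return False
        elif m != s:
            return False
    return True

# The motif "A ?x B ?x" abstracts over which symbol recurs:
motif = ["A", "?x", "B", "?x"]
```

Under this sketch, `["A", "c", "B", "c"]` matches the motif because `?x` binds consistently to `c`, whereas `["A", "c", "B", "d"]` does not; recognising such shared structure across distinct surface sequences is the kind of abstraction that lets motif training transfer to novel sequences.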