Related Articles

A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations

This study introduces a unified computational framework connecting acoustic, speech and word-level linguistic structures to study the neural basis of everyday conversations in the human brain. We used electrocorticography to record neural signals across 100 h of speech production and comprehension as participants engaged in open-ended real-life conversations. We extracted low-level acoustic, mid-level speech and contextual word embeddings from a multimodal speech-to-text model (Whisper). We developed encoding models that linearly map these embeddings onto brain activity during speech production and comprehension. Remarkably, this model accurately predicts neural activity at each level of the language processing hierarchy across hours of new conversations not used in training the model. The internal processing hierarchy in the model is aligned with the cortical hierarchy for speech and language processing, where sensory and motor regions better align with the model’s speech embeddings, and higher-level language areas better align with the model’s language embeddings. The Whisper model captures the temporal sequence of language-to-speech encoding before word articulation (speech production) and speech-to-language encoding post articulation (speech comprehension). The embeddings learned by this model outperform symbolic models in capturing neural activity supporting natural speech and language. These findings support a paradigm shift towards unified computational models that capture the entire processing hierarchy for speech comprehension and production in real-world conversations.

Language measures correlate with other measures used to study emotion

Researchers are increasingly using language measures to study emotion, yet less is known about whether language relates to other measures often used to study emotion. Building on previous work which focuses on associations between language and self-report, we test associations between language and a broader range of measures (self-report, observer report, facial cues, vocal cues). Furthermore, we examine associations across different dictionaries (LIWC-22, NRC, Lexical Suite, ANEW, VADER) used to estimate valence (i.e., positive versus negative emotion) or discrete emotions (i.e., anger, fear, sadness) in language. Associations were tested in three large, multimodal datasets (Ns = 193–1856; average word count = 316.7–2782.8). Language consistently related to observer report and consistently related to self-report in two of the three datasets. Statistically significant associations between language and facial cues emerged for language measures of valence but not for language measures of discrete emotions. Language did not consistently show significant associations with vocal cues. Results did not tend to significantly vary across dictionaries. The current research suggests that language measures (in particular, language measures of valence) are correlated with a range of other measures used to study emotion. Therefore, researchers may wish to use language to study emotion when other measures are unavailable or impractical for their research question.

Predictive learning as the basis of the testing effect

A prominent learning phenomenon is the testing effect, meaning that testing enhances retention more than studying. Emergent frameworks propose fundamental (Hebbian and predictive) learning principles as its basis. Predictive learning posits that learning occurs based on the contrast (error) between a prediction and the feedback on that prediction (prediction error). Here, we propose that in testing (but not studying) scenarios, participants predict potential answers, and its contrast with the subsequent feedback yields a prediction error, which facilitates testing-based learning. To investigate this, we developed an associative memory network incorporating Hebbian and/or predictive learning, together with an experimental design where human participants studied or tested English-Swahili word pairs followed by recognition. Three behavioral experiments (N = 80, 81, 62) showed robust testing effects when feedback was provided. Model fitting (of 10 different models) suggested that only models incorporating predictive learning can account for the breadth of data associated with the testing effect. Our data and model suggest that predictive learning underlies the testing effect.

Preserving and combining knowledge in robotic lifelong reinforcement learning

Humans can continually accumulate knowledge and develop increasingly complex behaviours and skills throughout their lives, which is a capability known as ‘lifelong learning’. Although this lifelong learning capability is considered an essential mechanism that makes up general intelligence, recent advancements in artificial intelligence predominantly excel in narrow, specialized domains and generally lack this lifelong learning capability. Here we introduce a robotic lifelong reinforcement learning framework that addresses this gap by developing a knowledge space inspired by the Bayesian non-parametric domain. In addition, we enhance the agent’s semantic understanding of tasks by integrating language embeddings into the framework. Our proposed embodied agent can consistently accumulate knowledge from a continuous stream of one-time feeding tasks. Furthermore, our agent can tackle challenging real-world long-horizon tasks by combining and reapplying its acquired knowledge from the original tasks stream. The proposed framework advances our understanding of the robotic lifelong learning process and may inspire the development of more broadly applicable intelligence.

Understanding learning through uncertainty and bias

Learning allows humans and other animals to make predictions about the environment that facilitate adaptive behavior. Casting learning as predictive inference can shed light on normative cognitive mechanisms that improve predictions under uncertainty. Drawing on normative learning models, we illustrate how learning should be adjusted to different sources of uncertainty, including perceptual uncertainty, risk, and uncertainty due to environmental changes. Such models explain many hallmarks of human learning in terms of specific statistical considerations that come into play when updating predictions under uncertainty. However, humans also display systematic learning biases that deviate from normative models, as studied in computational psychiatry. Some biases can be explained as normative inference conditioned on inaccurate prior assumptions about the environment, while others reflect approximations to Bayesian inference aimed at reducing cognitive demands. These biases offer insights into cognitive mechanisms underlying learning and how they might go awry in psychiatric illness.

Responses

Your email address will not be published. Required fields are marked *