Related Articles

A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations

This study introduces a unified computational framework connecting acoustic, speech, and word-level linguistic structures to study the neural basis of everyday conversations in the human brain. We used electrocorticography to record neural signals across 100 h of speech production and comprehension as participants engaged in open-ended real-life conversations. We extracted low-level acoustic, mid-level speech, and contextual word embeddings from a multimodal speech-to-text model (Whisper). We developed encoding models that linearly map these embeddings onto brain activity during speech production and comprehension. Remarkably, this model accurately predicts neural activity at each level of the language-processing hierarchy across hours of new conversations not used in training the model. The internal processing hierarchy in the model is aligned with the cortical hierarchy for speech and language processing: sensory and motor regions align better with the model’s speech embeddings, and higher-level language areas align better with the model’s language embeddings. The Whisper model captures the temporal sequence of language-to-speech encoding before word articulation (speech production) and speech-to-language encoding post-articulation (speech comprehension). The embeddings learned by this model outperform symbolic models in capturing neural activity supporting natural speech and language. These findings support a paradigm shift towards unified computational models that capture the entire processing hierarchy for speech comprehension and production in real-world conversations.
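
To make the encoding approach concrete, here is a minimal sketch of a linear encoding model in the spirit of the one described: ridge regression from per-word embeddings to a single electrode's activity, evaluated on held-out data. This is not the authors' pipeline; the embeddings and neural responses below are simulated placeholders (the study used Whisper embeddings and ECoG recordings), and sizes such as `n_words` and `emb_dim` are hypothetical.

```python
# Minimal sketch of a linear encoding model: ridge regression from
# per-word embeddings to the response of a single electrode.
# All data below are simulated stand-ins, not the study's recordings.
import numpy as np
from sklearn.linear_model import RidgeCV
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_words, emb_dim = 5000, 384          # hypothetical sizes

# Simulated word embeddings (in the study: Whisper speech/language embeddings)
X = rng.standard_normal((n_words, emb_dim))
true_w = rng.standard_normal(emb_dim)
y = X @ true_w + rng.standard_normal(n_words) * 5.0   # noisy electrode signal

# Fit on earlier words, test on held-out ones (standing in for new conversations)
split = int(0.8 * n_words)
model = RidgeCV(alphas=np.logspace(-2, 4, 13))
model.fit(X[:split], y[:split])

# Encoding performance: correlation between predicted and observed activity
r, _ = pearsonr(model.predict(X[split:]), y[split:])
print(f"held-out encoding correlation: r = {r:.2f}")
```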

Language measures correlate with other measures used to study emotion

Researchers are increasingly using language measures to study emotion, yet less is known about whether language relates to other measures often used to study emotion. Building on previous work that focused on associations between language and self-report, we test associations between language and a broader range of measures (self-report, observer report, facial cues, vocal cues). Furthermore, we examine associations across different dictionaries (LIWC-22, NRC, Lexical Suite, ANEW, VADER) used to estimate valence (i.e., positive versus negative emotion) or discrete emotions (i.e., anger, fear, sadness) in language. Associations were tested in three large, multimodal datasets (Ns = 193–1856; average word count = 316.7–2782.8). Language related consistently to observer report, and to self-report in two of the three datasets. Statistically significant associations between language and facial cues emerged for language measures of valence but not for language measures of discrete emotions. Language did not consistently show significant associations with vocal cues. Results generally did not vary significantly across dictionaries. The current research suggests that language measures (in particular, language measures of valence) are correlated with a range of other measures used to study emotion. Therefore, researchers may wish to use language to study emotion when other measures are unavailable or impractical for their research question.
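
For a concrete sense of the dictionary-based measures compared here, the sketch below scores a couple of sentences with VADER, one of the tested dictionaries. The example sentences are invented; only the `vaderSentiment` API calls are real.

```python
# Dictionary-based valence scoring with VADER (one of the dictionaries
# compared in the study). Example sentences are invented.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

texts = [
    "I had a wonderful time with my friends today.",
    "This has been a frustrating, exhausting week.",
]

for text in texts:
    scores = analyzer.polarity_scores(text)
    # 'compound' is VADER's normalized valence in [-1, 1];
    # 'pos'/'neg'/'neu' are proportions of the text in each category.
    print(f"{scores['compound']:+.2f}  {text}")
```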

Bayesian p-curve mixture models as a tool to dissociate effect size and effect prevalence

Much research in the behavioral sciences aims to characterize the “typical” person. A statistically significant group-averaged effect size is often interpreted as evidence that the typical person shows an effect, but that is only true under certain distributional assumptions for which explicit evidence is rarely presented. Mean effect size varies with both within-participant effect size and population prevalence (the proportion of the population showing the effect). Few studies consider how prevalence affects estimates of mean effect size, and existing estimators of prevalence are, conversely, confounded by uncertainty about effect size. We introduce a widely applicable Bayesian method, the p-curve mixture model, that jointly estimates prevalence and effect size by probabilistically clustering participant-level data based on their likelihood under a null distribution. Our approach, for which we provide a software tool, outperforms existing prevalence estimation methods when effect size is uncertain and is sensitive to differences in prevalence or effect size across groups or conditions.
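
To make the model concrete, here is a minimal sketch of the likelihood at the heart of a p-curve mixture: each participant's one-sided p-value is either uniform (null component) or follows the p-curve implied by a shared effect size δ, mixed with prevalence γ. This is a simplified grid-based estimate under flat priors for illustration only, not the authors' software tool; all variable names and the simulated data are assumptions.

```python
# Minimal p-curve mixture sketch: each participant's one-sided p-value
# comes from the null (uniform on [0, 1]) with probability 1 - gamma,
# or from the p-curve of a z-test with effect size delta with
# probability gamma. Grid-based MAP under flat priors, for illustration.
import numpy as np
from scipy.stats import norm

def pcurve_density(p, delta):
    """Density of a one-sided p-value when the true effect is delta."""
    z = norm.ppf(1.0 - p)                     # observed z implied by p
    return norm.pdf(z - delta) / norm.pdf(z)  # change of variables

def log_lik(p_values, gamma, delta):
    mix = (1.0 - gamma) + gamma * pcurve_density(p_values, delta)
    return np.sum(np.log(mix))

# Simulated data: 60% of participants show a delta = 1.5 effect.
rng = np.random.default_rng(1)
n, true_gamma, true_delta = 200, 0.6, 1.5
has_effect = rng.random(n) < true_gamma
z_obs = rng.standard_normal(n) + np.where(has_effect, true_delta, 0.0)
p_values = 1.0 - norm.cdf(z_obs)

# Joint estimate of prevalence and effect size over a coarse grid.
gammas = np.linspace(0.01, 0.99, 99)
deltas = np.linspace(0.1, 3.0, 60)
ll = np.array([[log_lik(p_values, g, d) for d in deltas] for g in gammas])
gi, di = np.unravel_index(ll.argmax(), ll.shape)
print(f"estimated prevalence ~ {gammas[gi]:.2f}, effect size ~ {deltas[di]:.2f}")
```

Note how the two parameters are disentangled: lowering γ thins the whole p-value distribution toward uniform, whereas lowering δ flattens the p-curve component itself, so the joint likelihood can separate "few people with big effects" from "many people with small effects".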

Two types of motifs enhance human recall and generalization of long sequences

Whether it is listening to a piece of music, learning a new language, or solving a mathematical equation, people often acquire abstract notions in the form of motifs and variables, manifested in musical themes, grammatical categories, or mathematical symbols. How do we create abstract representations of sequences? Are these abstract representations useful for memory recall? In addition to learning transition probabilities, chunking, and tracking ordinal positions, we propose that humans also use abstractions to arrive at efficient representations of sequences. We propose and study two abstraction categories: projectional motifs and variable motifs. Projectional motifs find a common theme underlying distinct sequence instances. Variable motifs contain symbols representing sequence entities that can change. In two sequence-recall experiments, we train participants to remember sequences with projectional and variable motifs, respectively, and examine whether motif training benefits the recall of novel sequences sharing the same motif. Our results suggest that training on projectional and variable motifs improves transfer recall accuracy relative to control groups. We show that a model that chunks sequences in an abstract motif space may learn and transfer more efficiently than models that learn chunks or associations at a superficial level. Our study suggests that humans construct efficient sequential memory representations according to the two types of abstraction we propose, and that creating these abstractions benefits learning and out-of-distribution generalization. Our study paves the way for a deeper understanding of human abstraction learning and generalization.
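
The notion of a variable motif can be made concrete with a short sketch: two sequences share the same variable motif when one maps onto the other under a consistent relabeling of symbols (e.g., A-B-A-B and C-D-C-D both instantiate x-y-x-y). The function below is an illustrative canonical-form check, not the chunking model from the paper.

```python
# Illustrative check for shared "variable motifs": two sequences share a
# motif if a one-to-one relabeling of symbols maps one onto the other.
# This is a toy canonical-form sketch, not the paper's chunking model.
def motif_signature(seq):
    """Replace each symbol by the order of its first appearance."""
    first_seen = {}
    return tuple(first_seen.setdefault(s, len(first_seen)) for s in seq)

def same_variable_motif(seq_a, seq_b):
    return motif_signature(seq_a) == motif_signature(seq_b)

print(same_variable_motif("ABAB", "CDCD"))  # True:  both are x-y-x-y
print(same_variable_motif("ABAB", "CDDC"))  # False: x-y-y-x differs
```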

Interracial contact shapes racial bias in the learning of person-knowledge

During impression formation, perceptual cues facilitate social categorization, while person-knowledge can promote individuation and enhance person memory. Although there is an extensive literature on the cross-race recognition deficit, observed when racial ingroup faces are recognized more accurately than outgroup faces, it is unclear whether a similar deficit exists when recalling individuating information about outgroup members. To better understand how perceived race can bias person memory, the present study examined how self-identified White perceivers’ interracial contact affects the learning of perceptual cues and person-knowledge about perceived Black and White others over five sessions of training. While person-knowledge facilitated face recognition accuracy for low-contact perceivers, face recognition accuracy did not differ for high-contact perceivers based on the availability of person-knowledge. The results indicate a bias towards better recall of ingroup person-knowledge, which decreased for high-contact perceivers across the five sessions of training but simultaneously increased for low-contact perceivers. Overall, the elimination of racial bias in recall of person-knowledge among high-contact perceivers, amid a persistent cross-race deficit in face recognition, suggests that contact may have a greater impact on the recall of person-knowledge than on face recognition.
