Three mechanisms of language comprehension are revealed through cluster analysis of individuals with language deficits

Introduction
For centuries, numerous scholars have considered the human ability for language as pivotal to the distinctive intelligence of our species. Unraveling the nature and evolution of human language remains a fundamental challenge for the fields of linguistics, cognitive neuroscience, and psychology. The discovery of three distinct mechanisms of language comprehension in our previous study contributes to the ongoing discussion1. That discovery emerged from clustering analysis of various language-comprehension abilities in 31,845 autistic individuals. When considering individuals with language deficits one at a time, their linguistic profiles may look different, but when many individuals are studied simultaneously, it becomes increasingly clear that certain abilities appear and disappear in concert (i.e., they co-occur). Clustering techniques automatically arrange abilities according to their similarities (i.e., co-occurrence). If any two linguistic abilities were mediated by the same underlying mechanism, then, when this mechanism is broken, both abilities should be absent and therefore, get clustered into the same group. For instance, consider three types of aphasia. 1) Damage to Wernicke’s area impairs understanding of word meanings and leads to fluid but nonsensical speech2. 2) Damage to Broca’s area causes difficulties in word formation (non-fluent speech) and results in the repetition of simple phrases over and over, while comprehension of individual words remains intact3. 3) Damage to the arcuate fasciculus white-matter tract, which connects the lateral prefrontal cortex (LPFC, located in the front of the cerebral cortex) with the posterior cortex (the back of the cortex, including the temporal, occipital, and parietal cortices), can disrupt understanding of spatial prepositions, flexible syntax, verb tense, and complex explanations, but does not affect fluent speech or understanding of individual words4. Clustering these abilities (understanding of word meanings, speech fluidity, tendency to repeat simple phrases, comprehension of spatial prepositions, flexible syntax, verb tense, and complex explanations) in patients with aphasia can group co-occurring abilities together according to their underlying brain lesion5,6. Crucially, the clustering analysis can perform grouping calculations in a data-driven manner without any design or hypothesis, identifying and enumerating underlying mechanisms automatically.
Autistic individuals often display language comprehension deficits7,8,9,10,11,12,13. Our previous data-driven clustering analysis of fourteen language comprehension abilities in autistic individuals formed three robustly distinct clusters stable across different evaluation methods, across different age groups, and across different time points1. The cluster of most-basic abilities, termed “command-language-comprehension,” included knowing the name, responding to “No” or “Stop”, and following some commands. The cluster of intermediate abilities, termed “modifier-language-comprehension,” included understanding color and size modifiers, several modifiers in a sentence, size superlatives, and numbers. The cluster of most-advanced abilities, termed the “syntactic-language-comprehension,” included understanding of spatial prepositions, verb tenses, flexible syntax, possessive pronouns, explanations about people and situations, simple stories, and elaborate fairytales.
Each of the three clusters of co-occurring abilities corresponds to a distinct language mechanism. When the most-advanced syntactic mechanism is disabled, then the complete set of syntactic-language-comprehension-abilities that depend on it is impaired. When both the syntactic and the modifier mechanisms are disabled, then the complete set of syntactic- and modifier-language-comprehension-abilities are impaired. Reassuringly, the additional clustering analysis of participants (rather than abilities) identified three language-comprehension-phenotypes: the most-advanced syntactic-language-comprehension-phenotype included individuals who acquired all three language-comprehension mechanisms – syntactic, modifier, and command; the intermediate modifier-language-comprehension-phenotype included individuals limited to modifier and command mechanisms; and the most-basic command-language-comprehension-phenotype included individuals limited to the command mechanism.
The impetus for the aforementioned study was the theoretical work relating language comprehension to imagination. E.g., to understand an instruction containing spatial prepositions “put the cup under/behind/in front of the table,” most people mentally juxtapose the cup and the table. Similarly, to comprehend the change in meaning when the order of words is changed (termed ‘flexible syntax’), e.g., “the turtle rides/carries the cow” versus “the cow rides/carries the turtle,” most people mentally juxtapose the turtle and the cow. As fairytales commonly describe objects and events that do not exist, a reader has to imagine them by juxtaposing various mental objects from memory. In order to comprehend an unfamiliar syntactic sentence, most people (99.2%) mentally combine the subject and the object in front of their mind’s eye14. Thus, a connection between language and imagination was hypothesized by many researchers15,16,17.
The mechanism underlying deliberate juxtaposition of two or more mental objects and observing the result of their interaction is called Prefrontal Synthesis or PFS18,19,20 (a.k.a. mental synthesis21,22,23). Converging evidence suggests that on the neurological level, PFS is mediated by synchronization24,25,26,27,28 of neuronal-ensembles that encode those mental objects29. In other words, in order to understand a syntactic sentence, such as “the dog lives above the cat and under the monkey,” an interlocutor has to synchronize the neuronal-ensembles encoding all three nouns—the dog, the cat, and the monkey. Neuronal ensembles encoding visual objects are located in the posterior cerebral cortex30. The action of their synchronization is mediated by the LPFC18 via frontoposterior connections, such as arcuate fasciculus31.
Conversely, the mechanism of modifying color or size of a single mental object is called Prefrontal Modification or PFM (a.k.a., Prefrontal Analysis19). The neurological difference between PFS and PFM is in the degree of the LPFC ability to control the posterior cortex. In terms of the Dual Visual Stream Theory32, the PFM control is limited to the temporal cortex in an area called ventral pathway, which encodes the appearance of objects (their shape, form, and color)19. PFS adds control of the parietal cortex in an area called dorsal pathway, which encodes where objects are in space (their location, motion, depth, and spatial relationship)33,34. Consequently, PFM is limited to the control of a single mental object’s appearance and PFS can control both mental objects’ appearance and location, enabling juxtaposition of multiple mental objects.
Finally, comprehension of commands, such as “take the ball outside” or “bring the cup to daddy,” does not involve any juxtaposition (PFS) or modification (PFM) of mental objects. On the neurological level, comprehension of simple commands relies on recalling existing object-encoding-neuronal-ensembles from memory. E.g., the command “bring the cup to daddy” requires recalling the cup and daddy; there is no modification of a single object-encoding-neuronal-ensemble or synchronization between several object-encoding-neuronal-ensembles.
All three mechanisms – memory recall, PFM, and PFS—are commonly called “imagination,” “symbolic thinking,” “abstract thinking,” “executive function,” or “cognition”35. However, there are significant benefits to using the three neurologically defined terms. First, these definitions are much narrower. For example, PFM and PFS are always voluntary and deliberate, unlike broadly defined “imagination,” which can be involuntary, such as during REM-sleep dreaming19. The mechanisms of PFM and PFS rely on the LPFC, and patients with the LPFC damage can lose access to these abilities19,36. In contrast, dreaming imagination does not depend on the LPFC; the LPFC is inactive during sleep37,38, and patients with the LPFC damaged do not notice change in their dreams39. Moreover, broadly defined “imagination,” “symbolic thinking,” “abstract thinking,” “executive function,” and “cognition” can be mediated by spontaneous insight, which primarily depends on the ventromedial prefrontal cortex (vmPFC) rather than the LPFC40,41,42. Neither involuntary imagination nor spontaneous insight can mediate language comprehension—their mechanisms are too slow and the results are too unpredictable to sustain a typical conversation. When viewed from a neurological perspective, memory recall, PFM, and PFS stand apart from each other and other mechanisms of imagination, and were anticipated to yield three distinct mechanisms of language comprehension19,36.
The goals of this study were (1) to validate three language mechanisms using the previously employed 14 language comprehension abilities, and in addition including the previously omitted 15th ability (“Responds to praise”); (2) to focus on participants with normal speech, in order to eliminate potential confusion related to the impact of speech on language comprehension; (3) to explore the relationship between the three empirically-identified language mechanisms and the three tentative mechanisms derived from neurological considerations—memory recall, PFM, and PFS; and (4) to expand the participant pool beyond autistic individuals by including other conditions linked to language impairments: mild language delay, apraxia (a motor speech disorder where individuals struggle to plan and coordinate the movements needed for speech despite normal muscle function), Specific Language Impairment, Sensory Processing Disorder, Social Communication Disorder, Down Syndrome, and Attention-Deficit/Hyperactivity Disorder (ADHD). It was hypothesized that since memory recall, PFM, and PFS are fundamental mechanisms of language comprehension, the inclusion of other pathological conditions with language impairments would result in the same three clusters previously identified in autistic individuals1.
Results
Clustering analysis of 15 language comprehension abilities
Parents and caregivers evaluated 15 language comprehension abilities (Table 1) in 55,558 individuals with various language deficits (age range of 4 to 22 years). From these participants, we selected those who showed an ability to “use sentences with four or more words” (defined as normal speech; N = 17,848, Table 2). To explore co-occurrence of the 15 language comprehension abilities, we used two common clustering methods: Unsupervised Hierarchical Cluster Analysis (UHCA) and Principal Component Analysis (PCA). Figure 1a depicts the dendrogram generated by UHCA. The height of the branches indicates the distance between clusters, which is an indicator of greater dissimilarity. Three clusters have inter-cluster distances that are significantly larger than the distances between subclusters. The right-most cluster includes knowing the name, responding to ‘No’ or ‘Stop’, responding to praise, and following some commands (items 1 to 4 in Table 1). This cluster is identical to the command-language-comprehension-cluster identified by the previous study of 31,845 autistic individuals with the addition of the ‘responds to praise’ item that was not previously analyzed1. The cluster in the middle includes understanding color and size modifiers, several modifiers in a sentence, size superlatives, and numbers (items 5 to 8 in Table 1). This cluster is identical to the previously identified modifier-language-comprehension-cluster. The left-most cluster includes understanding of spatial prepositions, verb tenses, flexible syntax, possessive pronouns, explanations about people and situations, simple stories, and elaborate fairy tales (items 9 to 15 in Table 1). This cluster is identical to the previously identified syntactic-language-comprehension-cluster.

a The dendrogram representing the hierarchical clustering of language comprehension abilities. b Principal component analysis of the 15 language comprehension abilities shows a clear separation between command, modifier, and syntactic items. Principal component 1 accounts for 35.0% of the variance in the data. Principal component 2 accounts for 11.0% of the variance in the data.
The PCA (Fig. 1b) also shows clear separation between these three clusters. The command items (knowing the name, responding to ‘No’ or ‘Stop’, responding to praise, and following some commands) are clustered in the top left corner. The modifier items (understanding color and size modifiers, several modifiers in a sentence, size superlatives, and numbers) are clustered in the lower middle. The syntactic items (understanding of spatial prepositions, verb tenses, flexible syntax, possessive pronouns, explanations about people and situations, simple stories, and elaborate fairy tales) are clustered in the top right corner.
The three-cluster solution was stable across multiple seeds as well as consistent across different age groups (4 to 6 years of age, Supplementary Fig. 1; 6 to 12 years of age, Supplementary Fig. 2; 12 to 22 years of age, Supplementary Fig. 3), across different time points (first evaluation, Supplementary Fig. 4; last evaluation, Fig. 1), across different verbal levels (Fig. 1 and Supplementary Fig. 5), across genders (Supplementary Figs. 6 and 7), and across parental education (Supplementary Figs. 8 and 9).
Language comprehension phenotypes in participants
Independently from clustering 15 language comprehension abilities, we utilized UHCA to cluster all 17,848 participants (Fig. 2, the dendrogram on top). This three-cluster solution was stable across different seeds, age groups (4 to 6, 6 to 12, and 6 to 22 years of age, Supplementary Figs. 10–12, respectively, the dendrogram shown on the top), time points (first and last evaluations; Supplementary Fig. 13 and Fig. 2, respectively), across different verbal levels (Fig. 1 and Supplementary Fig. 14), across genders (Supplementary Figs. 15 and 16), and across parental education (Supplementary Figs. 17 and 18).

The 15 language comprehension abilities are shown as rows. The dendrogram representing language comprehension abilities is shown on the left. Participants are shown as 17,848 columns. The dendrogram representing participants is shown on the top. The green bar labels the command mechanism, the red bar labels the modifier mechanism, and the blue bar labels the syntactic mechanism. The center grid indicates the presence or absence of each ability in each participant: blue signifies the presence of a linguistic ability (“very true” answer), red indicates its absence (“not true” answer), and white represents a partial presence (“somewhat true” answer).
A two-dimensional heatmap was used to relate participant clusters to language comprehension abilities clusters (Fig. 2). Columns represent the 17,848 participants and rows represent the 15 linguistic abilities. Blue indicates the presence of a linguistic ability (parent’s response=very true); white indicates an intermittent presence of a linguistic ability (parent’s response=somewhat true); and red indicates the lack of a linguistic ability (parent’s response=not true). The three clusters of participants match the three language comprehension mechanisms. The cluster of participants termed “Syntactic Language Phenotype” shows the predominant blue color that indicated good skills across all three (syntactic, modifier, and command) language comprehension mechanisms (44.7% of participants, Table 3). The cluster of participants marked as “Command Language Phenotype” shows the predominant blue color indicating good skills only among the command-mechanism items (30.4%). The third cluster of participants marked “Modifier Language Cluster” shows the predominant blue color indicating good skills only across modifier- and command-mechanisms items (25%). Table 4 shows cluster assignment of participants by their diagnostic category.
Discussion
Understanding the mechanisms of human language remains a fundamental challenge of linguistics and cognitive neuroscience. Significant efforts have been invested into uncovering language mechanisms by studying contemporary individuals exhibiting language impairments; e.g., these efforts lead to the discovery of the language-critical FOXP2 gene43,44,45. It is evident that not every contemporary individual acquires every language mechanism. Examining language skills in a population of individuals with language deficits enables researchers to detect different aspects of language manifested in tandem, highlighting patterns where specific abilities emerge and decline in concert and providing insights into the common neurological mechanisms that underpin them. Clustering methods automatically organize language abilities based on their co-occurrence, generating hierarchical structures like dendrograms that illustrate relationships among them. If two linguistic abilities share a common underlying mechanism, their absence due to a disruption of this mechanism results in their clustering together. Importantly, the clustering analysis conducted in this study was unbiased, as both UHCA and PCA were data-driven, devoid of any predetermined design or hypothesis.
This study analyzed 15 language comprehension abilities in 17,848 participants with language impairments but normal speech. First, it confirmed the existence of three robustly distinct language comprehension mechanisms discovered in autistic individuals1. Compared to the previous investigation, this study expanded the participant pool to include all individuals whose parents have submitted assessments, regardless of children’s diagnosis. Furthermore, in order to eliminate potential confusion related to the impact of speech, this study only enrolled participants with normal speech. The three identified mechanisms were as follows: (1) the command mechanism mediated comprehension of one’s name, responding to ‘No’ or ‘Stop,’ responding to praise, and following simple commands (Fig. 1); (2) the modifier mechanism mediated comprehension of simple color/size modifiers, understanding of several modifiers in a sentence, understanding of size superlatives, and number comprehension; and (3) the syntactic mechanism mediated comprehension of spatial prepositions, verb tenses, flexible syntax, possessive pronouns, explanations, simple stories, and elaborate fairy tales.
This study also confirmed previous discovery of three language comprehension phenotypes (Fig. 2). Participants in the syntactic-language-phenotype-cluster acquired all three language mechanisms: syntactic, modifier, and command. All neurotypical participants (controls) were clustered into this group. Participants in the modifier-language-phenotype-cluster acquired only modifier and command mechanisms and did not acquire the syntactic mechanism. Participants in the command-language-phenotype-cluster acquired only the command mechanism and did not acquire the syntactic and the modifier mechanisms.
As expected, all neurotypical children were clustered into the syntactic cluster (Table 4). Among the not-diagnosed participants, who include a high proportion of typically developing individuals, the majority exhibited the syntactic language phenotype (65.6%), with the lowest proportion found in the command language phenotype (17.2%). Participants with mild language delay followed a similar pattern, with 57.4% in the syntactic phenotype and 21.1% in the command phenotype. A higher prevalence of the syntactic language phenotype was also observed in participants diagnosed with specific language impairment (52.2%), ADHD (51.4%), apraxia (48.7%), and sensory processing disorder (48.3%).
Participants diagnosed with more severe conditions exhibit a lower proportion of the syntactic phenotype. Those with severe autism had the smallest proportion clustered in the syntactic phenotype (16.2%) and the highest proportion in the command phenotype (60.0%). They are followed by participants with moderate autism, of whom only 24.5% fell into the syntactic phenotype, while 41.7% were clustered in the command phenotype. They were followed by participants diagnosed with Down syndrome, of whom only 37.4% fell into the syntactic phenotype, while 46.3% were clustered in the command phenotype.
The prevalence of the command phenotype increased with Autism Spectrum Disorder (ASD) severity: the command phenotype was observed in 32.8% of individuals diagnosed with mild ASD, in 41.7% of individuals diagnosed with moderate ASD, and in 60.0% of individuals diagnosed with severe ASD (Table 4). Conversely, the prevalence of the syntactic phenotype decreased with ASD severity: the syntactic phenotype was observed in 38.8% of individuals diagnosed with mild ASD, in 24.5% of individuals diagnosed with moderate ASD, and in 16.2% of individuals diagnosed with severe ASD.
Future studies should explore the genetic correlates of the three language phenotypes. However, it is evident that multiple different genetic abnormalities can lead to an inferior phenotype. The functionality of each linguistic mechanism depends on the optimized state of the entire central nervous system, including fine-tuned axonal connections, synaptic configuration, myelination, etc. This optimized state can be conceptualized as an attractor state, defined as a state toward which a child tends to progress from a range of starting conditions of the nervous system46,47. Any single neurological abnormality (synaptic impairment, reduced myelination, cell-surface proteins dysfunction, etc.) can render the most-advanced attractor neurological state inaccessible for a developing child. If the nervous system cannot optimize for the most-advanced syntactic-language-comprehension-phenotype, it may optimize for the modifier-language-comprehension-phenotype; and if it cannot optimize for the modifier-language-comprehension-phenotype, it may still optimize for the command-language-comprehension-phenotype. Our results suggest that a variety of genetic conditions can delay or prevent the achievement of the most-advanced syntactic-language-comprehension-phenotype attractor state.
One of the most interesting aspects of the study involves interpretation of its results in terms of cognitive neuroscience. Even simple syntactic sentences, such as “the cat carries the sheep,” require a complex neurological machinery for their interpretation. First, the Wernicke’s area matches the word ‘cat’ resulting in activation of the neuronal ensemble encoding the cat (i.e., recall of the cat’s image from memory). Second, the Wernicke’s area matches the word ‘sheep’ resulting in activation of the neuronal ensemble encoding the sheep (i.e., recall of the sheep’s image from memory). This memory recall is followed by the LPFC juxtaposing the two recalled objects mediated by the process of prefrontal synthesis (PFS)18,48 governed by the verb ‘carries’ (Fig. 3). If the verb ‘rides’ was used instead, the LPFC would re-arrange the scene, constructing the mental image with the cat on top of the sheep. Thus, the interpretation of this sentence includes at least two neurological mechanisms: (1) recall of the two neuronal ensembles by the Wernicke’s area and (2) their juxtaposition by the LPFC (the PFS mechanism discussed in the introduction).

Individuals with the syntactic phenotype interpret the sentence “the cat carries the sheep” by recalling the cat, the sheep, and juxtaposing the two objects using the prefrontal synthesis (PFS) mechanism. In contrast, individuals with the command and modifier phenotypes do not acquire the PFS mechanism, preventing them from understanding the relational aspect of sentences—specifically, ‘who carries whom’.
Conversely, understanding of a sentence “give me the short blue pencil” does not involve PFS, since it describes a single object (pencil). The Wernicke’s area first matches the word ‘pencil’ resulting in activation of the neuronal ensemble encoding the pencil (i.e., recall of the pencil’s image from memory); then the LPFC modifies the pencil-encoding neuronal ensemble to imbue it with the smaller size and the blue color using the process of prefrontal modification (Fig. 4; the PFM mechanism discussed in the introduction).

Individuals with the syntactic and modifier phenotypes can follow the instruction “give me the short blue pencil” by recalling a pencil, and modifying its attributes, such as color and length, using the prefrontal modification (PFM) mechanism. In contrast, individuals with the command phenotype can understand individual words, but do not acquire the PFM mechanism. As a result, they struggle to identify the correct pencil among short/long straws, pencils, and Legos of different colors. This difficulty in integrating multiple elements into a coherent whole is known as stimulus overselectivity, tunnel vision, or the lack of multi-cue responsivity77,78,79,80.
Finally, understanding of the command sentence type is mediated by neither PFS nor PFM. To interpret the sentence “take the cup to daddy,” an individual with the command-language-comprehension-phenotype only needs to recall the words ‘cup’ and ‘daddy.’ The action is implicit—no other reasonable action is possible between the cup and daddy. Neurologically, this instruction is understood by the Wernicke’s area activation of the neuronal ensemble encoding the cup (i.e., recall of the cup’s image from memory); then by the Wernicke’s activation of the neuronal ensemble encoding daddy (i.e., recall of the daddy’s image from memory). The two separate neuronal ensembles remain in working memory. This interpretation involves no object combination or object modification (i.e., neither PFS nor PFM).
Accordingly, from the neurological perspective, all sentence types that require a combination of two or more mental objects are expected to be mediated by the PFS mechanism and therefore cluster together; all sentence types that require modification of a single object’s color or size are expected to be mediated by the PFM mechanism and cluster into the second group; and all sentence types that do not require modification or juxtaposition of objects are expected to be mediated by memory recall and cluster into the third group. These predictions were borne out of data.
Out of fifteen sentence types investigated in this study, seven types involved a combination of two or more mental objects and therefore, were expected to be mediated by the PFS mechanism. Sentences with spatial prepositions (Table 1, item 9) require juxtaposition of a subject and an object in a manner governed by spatial prepositions (on, under, behind, etc.) in order to understand them. Semantically-reversible sentences that change their meaning when the order of words is changed (i.e., ‘a cat ate a mouse’ vs. ‘a mouse ate a cat’; Table 1, item 14) also require juxtaposition a subject and an object in order to understand them. Simple stories and elaborate fairytales (Table 1, items 11 and 12) usually describe objects and events that do not exist, and a reader has to imagine them by juxtaposing various mental objects from memory. Similarly, understanding explanations about people, objects, or situations beyond the immediate surroundings (e.g., “Mom is walking the dog,” “The snow has turned to water,” Table 1, item 15) commonly require juxtaposition of a subject and an object. Understanding of sentences with possessive pronouns (Table 1, items 13) also requires a combination of two mental objects: e.g., “her apple” requires a combination of a female and an apple; “mom’s sock” requires a combination of mom and a sock. Finally, understanding verb tenses (i.e., I will eat an apple vs. I ate an apple, Table 1, items 10) also requires a combination of a subject and an object governed by the verb. All these sentence types require a combination of two or more mental objects (Table 1, items 9 to 15) mediated by the PFS mechanism and thus expected to cluster together. This prediction was borne out of data; all seven sentence types have been clustered into the most-advanced syntactic mechanism (Fig. 1). None of the sentence types that do not require combination of two or more mental objects (Table 1, items 1 to 8) have been clustered into the most-advanced syntactic mechanism. This suggests that the most-advanced syntactic mechanism is mediated by the PFS.
Sentence types that require modification of attributes of a single mental object include the following: understanding some simple modifiers (i.e., green apple vs. red apple or big apple vs. small apple, Table 1, item 5); understanding several modifiers in a sentence (i.e., small green apple, Table 1, item 6); and understanding size (can select the largest/smallest object out of a collection of objects, Table 1, item 7). Accordingly, these sentence types were predicted to be mediated by the PFM mechanism and thus expected to cluster together. This prediction was borne out of data (Fig. 1, the modifier mechanism). Additionally, understanding small numbers (Table 1, item 8; e.g., two apples vs. three apples,) clustered within the modifier mechanism, suggesting that small number interpretation is mediated by the PFM mechanism as well.
Sentences that do not require modification and juxtaposition of mental objects and therefore do not rely on the PFM and PFS mechanisms include the following: knowing own name (Table 1, item 1); responding to ‘No’ or ‘Stop’ (Table 1, item 2); responding to praise (Table 1, item 3); and following some commands (Table 1, item 4). These sentence types were predicted to cluster separately from those relying on syntactic and modifier mechanisms, the phenomenon borne out of the data. This finding suggests that the command-language-comprehension-mechanism requires neither the PFS nor the PFM mechanisms.
It is important to note that in individuals with the syntactic-language-comprehension-phenotype, PFS can accompany interpretation of command sentences: neurotypical adults can visualize a novel scenario of “taking the cup to daddy.” However, this does not imply that interpreting command sentences always requires PFS. Individuals with the command-language-comprehension-phenotype can accurately interpret commands, such as “take the ball outside” and “bring the cup to daddy” simply by recalling one object at a time, without PFM or PFS.
An additional mechanism of interpretation of syntactic sentences that does not rely on PFS is routinization. Once a sentence type has been routinized, an interlocutor only needs to recall the objects mentioned in the sentence and follow the algorithm. For instance, given the instruction “Put the red cup inside the green cup,” a routinized algorithm would involve (1) lifting the cup mentioned first and (2) inserting it into the cup mentioned second. This method achieves the correct result by relying solely on memory recall of the red cup and green cups; without requiring PFS.
Accordingly, routinized interpretation of syntactic sentences can be taught to individuals who do not acquire PFS. However, the routinized method is rigid and supports only canonical-word-order processing (i.e., “put the red cup inside the green cup”). It fails with non-canonical-word-order instructions (i.e., “inside the green cup, put the red cup”). This limitation is evident from our recent study that compared typical children (~5 years of age) with autistic adolescents (~18 years of age)49. Typical children understood both canonical and non-canonical instructions equally well. In contrast, autistic adolescents were able to complete the canonical stacking cups task but only half of them were able to complete the same task under the condition of non-canonical-word-order (participants were not time-limited and the instruction was repeated multiple times). Those who failed usually selected the correct cups, but assembled them incorrectly (Supplementary Movie 1). Autistic adolescents who failed to understand non-canonical sentences also failed to understand spatial prepositions and complex explanations, suggesting that they did not acquire PFS49. Their success with canonical instructions likely reflects routinization through extensive Applied Behavior Analysis (ABA) therapy, which often includes tasks like stacking cups.
Several limitations of the study should be acknowledged. Epidemiological studies leveraging app users as participants offer access to a substantial number of individuals, but they do have some evident drawbacks, such as reliance on parental reports and parent-provided diagnoses. On one hand, parents may be prone to wishful thinking and their reports may overestimate their children’s abilities50. On the other hand, parents possess a deep understanding of their children, which can be particularly valuable for assessing syntactic language skills that can be challenging to evaluate in a clinical setting. Furthermore, several previous studies have indicated that parent reports of language abilities align closely with direct assessments conducted by clinicians51,52,53. Our own database studies also support the consistency and reliability of parent reports54,55,56.
The diagnosis selection dialog was designed to be as easy as possible for caregivers. First, caregivers were permitted to select only one diagnosis. Second, the presence or absence of autism was emphasized. For example, the option for ADHD was labeled: “No autism. Diagnosed with ADHD or ADD.” Consequently, while this design simplified the selection process, it also means we cannot rule out the possibility that language impairments may be associated with participants diagnosed with ADHD.
Another limitation of the study is that the majority of participants were in the 4 to 8 years of age range. That is the nature of data collection procedure through language therapy app. Future studies should aim to include a broader age range to capture data from older participants as well.
The parent survey used in this study assessed the understanding of fifteen sentence types. It is theoretically possible that if more sentence types were used in the survey, these types could cluster into some additional mechanisms. However, based on the cognitive neuroscience interpretation, we predict that interpretation of all sentence types falls into the three mechanisms: syntactic, modifier, and command. For instance, although recursive sentence structures were not included in the survey, we anticipate they would fall under the syntactic mechanism. Chomsky explains recursion in spatial reasoning with examples such as “((((the hole) in the tree) in the glade) by the stream)” and suggests that there is no limit to such embeddings of place concepts within place concepts57. This process of recursion, as used in spatial reasoning, aligns with the syntactic mechanism. Neither the command-, nor modifier-language-comprehension-mechanisms enable recursive sentences, as colors, sizes, and numbers alone cannot facilitate construction of infinite recursive sentences. The syntactic-language-comprehension phenotype, however, does encompass the comprehension of recursion; e.g., comprehension of spatial prepositions, which is a part of the syntactic phenotype is sufficient for understanding Chomsky’s example of recursion in spatial reasoning.
This study focused on verbal participants, which reduced the sample size from 55,558 to 17,848 and skewed the selection towards individuals with higher linguistic abilities. Consequently, some clusters have shown an increased heterogeneity. For example, the syntactic cluster in Fig. 1b (based on the smaller sample of verbal participants) appears less homogeneous compared to the syntactic cluster in Supplementary Fig. 5 (derived from the full sample). Despite the smaller sample size, this approach offers significant advantages, particularly in minimizing potential confounding effects of speech difficulties on language comprehension.
The discovery of three mechanisms of language comprehension can help advance research in related fields. Our results underscore the limitations of broad terms, such as “symbolic thinking,” “abstract thinking,” “executive function,” “cognition,” “visual integration,” and “imagination” in adequately describing the mechanisms underlying syntactic language comprehension. Without clearly defining the subcomponents of symbolic thinking – PFS, PFM, and memory recall – we would not have been able to interpret the three empirically identified language-comprehension mechanisms (Fig. 1).
In linguistics, the idea that language comprehension consists of several components has been articulated by the Merge hypothesis58,59. Merge is defined as a cognitive operation that takes two linguistic units (e.g., ‘blue’ and ‘cat’) and forms a set (‘blue cat’), which can then be further combined with additional linguistic units. The field of linguistics has coalesced on the hypothesis that unbounded Merge is a uniquely human ability, and this operation is the generative engine underlying the ability of humans to communicate infinite expressions60,61. A key goal of the Merge hypothesis is to link linguistics with neurology by identifying a uniquely human neurological mechanism underlying Merge. Our clustering analysis of language abilities in individuals with language impairments provides a path to explore this connection. Unbounded Merge is able to mediate the comprehension of syntactic (Table 1, items 9–15) and modifier (Table 1, items 5–8) structures, but not command structures, which are limited by definition to comprehension of single words (Table 1, items 1–4). This distinction positions Merge as a separator between modifier and syntactic abilities on one hand, and command abilities on the other. Our finding of a large cluster of individuals limited to command abilities supports the Merge hypothesis by confirming that command abilities are mediated by a mechanism separate from Merge. Moreover, our results refine the hypothesis by distinguishing two subcomponents of Merge. 25% of participants in our study acquired the modifier mechanism without developing the syntactic mechanism. This suggests a division within the Merge operation: “simple Merge,” which in our study underlies the comprehension of color, size, and number modifiers and is mediated by PFM, and “complex Merge,” which underlies the comprehension of syntactic structures and is mediated by PFS.
Future studies should explore the genetic correlates of the three language comprehension phenotypes. Identifying genes that hinder the acquisition of syntactic language could pave the way for new treatment discoveries. Moreover, pinpointing language-critical genes may enhance our understanding of human language evolution57,62,63.
In the absence of pharmaceutical treatments capable of improving a language phenotype, language therapy remains the primary approach. Language therapists can benefit from a more precise characterization of language comprehension phenotypes. First, the current terminology primarily describes an individual’s communication level in terms of their use of spoken language, categorizing individuals as nonverbal, minimally verbal, or verbal. However, this study highlights the importance of assessing communication levels based on their language comprehension phenotype in addition to the spoken language level (all study participants were verbal and still fell into the three distinct language comprehension phenotypes). Second, with a clearer understanding of phenotypes, language interventions can be more effectively tailored to each individual. For instance, those in the command phenotype may benefit more from exercises focused on color, size, and number modifiers, as syntactic exercises might be too challenging for them. Third, improved language comprehension assessments can be developed (at the level reached by typical children aged 2 to 4 years), facilitating earlier recognition of comprehension difficulties and a better tracking of children’s progress. This, in turn, could lead to earlier interventions and more focused therapy, which over time may enable children to achieve higher language comprehension phenotypes.
Methods
Study participants
Participants were children and adolescents using a language therapy app that was made freely available at all major app stores in September 201548,64,65,66,67. The app provides various structured language comprehension therapy exercises and is primarily used by parents of children with language impairments. Once the app was downloaded, caregivers were asked to register and to provide demographic details, including the child’s diagnosis and age. Caregivers consented to pseudonymized data analysis and completed a 133-item questionnaire (77-item Autism Treatment Evaluation Checklist68, Supplementary Tables 1 to 4; 20-item Mental Synthesis Evaluation Checklist (MSEC)69, Supplementary Table 5; 10-item screen time checklist55; 25-item diet checklist70; and 1-item parent education survey) approximately every three months.
All fifteen available language comprehension items from the 133-item questionnaire were included in the cluster analysis (Table 1). Compared to the previous study1, item 3 “Responds to praise” was added to clustering analysis. Given that most praise is delivered verbally, this item assesses an individual’s language comprehension, justifying its inclusion in the cluster analysis. Answer choices were as follows: very true (0 points), somewhat true (1 point), and not true (2 points). A lower score indicates better language comprehension ability.
The inclusion criteria for this study remained consistent with those of the previous study1: absence of seizures (which commonly result in intermittent, unstable language comprehension deficits71), absence of serious and moderate sleep problems (which are also associated with intermittent, unstable language comprehension deficits56), age range of 4 to 22 years (the lower age cutoff was chosen to ensure that participants were exposed to complete set of sentence structures listed in Table 172; the upper age cutoff was chosen to avoid analysis of participants who may be linguistically declining due to aging). The previous study was limited to individuals diagnosed with Autism Spectrum Disorder (ASD). This study included all participants who submitted their assessments through the app (N = 55,510). Table 2 reports participants’ diagnoses as communicated by caregivers. Autism level (mild/Level 1, moderate/Level 2, or severe/Level 3) was reported by caregivers. Pervasive Developmental Disorder and Asperger Syndrome were combined with mild autism for analysis as recommended by DSM-573. A good reliability of such parent-reported diagnosis has been previously demonstrated54. As a control we have added 48 neurotypical children of 4 to 10 years of age, whose data were previously collected for a different study72 by approaching parents on a community online site and asking if they would be willing to complete a Google form. Participants identified by caregivers as ‘no autism, normally developed child’ are labeled in Table 2 as the ‘Not-diagnosed’ group in order to differentiate them from the 48 neurotypical controls.
The goal of this study was to focus on participants with normal speech, in order to eliminate potential confusion related to the impact of speech. Therefore, from these 55,558 individuals, we selected those who showed an ability to “use sentences with four or more words” (parents had to respond ‘very true’ to this question; the complete list of options was ‘very true,’ ‘somewhat true,’ and ‘not true’). Since normal speech is usually associated with mild ASD, this group is overrepresented of the three levels of ASD severity (Table 2).
When caregivers have completed several evaluations, the last evaluation was used for analysis. Thus, the study included a total of 17,848 participants, the average age was 6.3 ± 2.4 years (range of 4 to 21.9 years), 72.3% participants were males. The education level of participants’ parents was the following: 91.7% with at least a high school diploma, 71.4% with at least college education, 39.0% with at least a master’s, and 6.8% with a doctorate. All caregivers consented to pseudonymized data analysis and publication of results. The study was conducted in compliance with the Declaration of Helsinki74. Using the Department of Health and Human Services regulations found at 45 CFR 46.101(b)(4), the Biomedical Research Alliance of New York (BRANY) LLC Institutional Review Board (IRB) determined that this research project is exempt from IRB oversight (IRB File # 22-12-205-1120). The data were accessed on 5th of April, 2023. Supplementary Movie 1 is reprinted with permission from ref. 75. Authors have obtained written parental consent to publish the video.
Statistics and reproducibility
UHCA was performed using Ward’s agglomeration method with a Euclidean distance metric. Clustering analysis was data-driven without any design or hypothesis. Two-dimensional heatmap was generated using the “pheatmap” package of R, freely available language for statistical computing76. Code is available from the corresponding author upon reasonable request.
As a control we calculated UHCA and PCA of the 15 language comprehension abilities along with the “hyperactivity” item graded on the scale of four: ‘not a problem’, ‘mild problem,’ ‘moderate problem,’ and ‘severe problem.’ From a neurological perspective, hyperactivity is not related to any particular language ability and therefore should cluster into its own group. As expected, both UHCA and PCA clustered the hyperactivity item into its own group at a significant distance from the three language clusters, validating the effectiveness of both clustering techniques (Supplementary Fig. 19).
Responses