Latent circuit inference from heterogeneous neural responses during cognitive tasks

Main

Cognitive functions depend on higher cortical areas, which integrate diverse sensory and contextual signals to produce a coherent behavioral response. These computations result from interactions between excitatory and inhibitory neurons in cortical circuits. Traditionally, hand-crafted neural circuit models were used to pose specific mechanistic hypotheses about how excitation and inhibition between a few neural populations representing task variables control the flow of information from input to behavioral output^{1,2,3,4,5,6,7,8,9,10,11}. Because these circuit models usually assume a relatively simple connectivity structure, their connectivity can be directly related to a dynamical system description of computations supporting cognitive task execution^5,6. Thus, these models explicitly specify a circuit mechanism in the connectivity structure that gives rise to a dynamical mechanism controlling the flow of neural trajectories to implement task computations. By linking connectivity to neural dynamics and behavioral output, these models can predict changes in dynamics and behavioral performance under perturbations of the circuit structure (for example, changes in excitation–inhibition balance¹²) and thus can be causally validated in experiments^13,14,15. However, hand-crafted circuit models come short of capturing the complexity and heterogeneity of single-neuron responses in the cortex.

Single neurons in areas such as the prefrontal cortex (PFC) show complex heterogeneous tuning to multiple task variables^{16,17,18,19,20,21}, which presents a formidable challenge for identifying underlying circuit mechanisms. Similar heterogeneous responses emerge in high-dimensional recurrent neural network (RNN) models trained to perform cognitive tasks^19,22,23,24 or reproduce neural activity data^{25,26,27,28,29} by optimizing recurrent connectivity parameters. Although trained RNNs are a class of neural circuit models, the complexity of their high-dimensional activity and connectivity obscures the interpretation of circuit mechanisms in these networks³⁰. It has been possible to determine dynamical mechanisms of task computations by either characterizing fixed points and linearized dynamics around them in RNNs^19,31 or fitting a dynamical system directly to neural response data^{27,32,33,34,35}. These approaches, however, provide no insight into how particular features of the flow field in a dynamical system arise from the network connectivity. Thus, high-dimensional RNNs currently serve as intermediate mechanistic models of heterogeneous neural responses, which yield dynamical mechanisms but leave the underlying circuit mechanisms unknown.

Although single-neuron responses are complex and heterogeneous, their joint population activity is often low dimensional across many cognitive tasks and brain areas^30,36. Accordingly, dimensionality reduction methods are commonly used to reveal representations of low-dimensional latent variables in neural population activity, which reflect computations emerging at the population level. Because unsupervised methods do not explicitly model task inputs and behavioral outputs, the latent variables they infer may be unrelated to the cognitive task execution^{26,37,38,39,40}. Therefore, targeted dimensionality reduction methods directly model neural representations of task inputs and behavioral outputs by seeking low-dimensional projections of neural population activity that correlate with external task variables^{19,21,41,42,43} (Fig. 1a,b). However, unlike RNNs and circuit models, these correlation-based methods do not incorporate recurrent interactions among task variables, which implement computations necessary to solve the task. Hence, it remains uncertain whether neural representations uncovered by these methods bear any relevance for driving behavior and have the causal predictive power comparable to the dynamical and circuit mechanisms.

Latent circuit inference from heterogeneous neural responses during cognitive tasks — **Fig. 1: Latent circuit model of heterogeneous neural responses during cognitive tasks.**

The glaring gap between circuit mechanisms and correlation-based dimensionality reduction methods is apparent in studies of the PFC’s role in context-dependent decision-making. Context-dependent decision-making requires flexible trial-by-trial switching between alternative stimulus–response mappings. Most circuit models hypothesize a relatively simple mechanism based on inhibition of sensory representations irrelevant in the current context^{8,9,10,44,45,46}. By contrast, dimensionality reduction methods applied to PFC data or RNN activity show minimal suppression of irrelevant sensory responses^19,21, seemingly invalidating the inhibitory circuit mechanism. These results suggest that RNNs and conceivably the PFC implement qualitatively novel task solutions that do not exist in small circuits and emerge only in high-dimensional networks^19,47,48. However, because correlation-based dimensionality reduction methods bear no links to the connectivity and causal mechanisms, whether heterogeneous neural responses during cognitive tasks arise from low-dimensional circuit mechanisms remains an open question.

To bridge this gap, we develop the latent circuit model, a dimensionality reduction approach that jointly fits neural responses and task behavior and incorporates recurrent interactions among task variables to capture a causal mechanism of task execution. Our model fits neural responses with dynamics generated by a low-dimensional latent circuit, thereby directly testing the hypothesis that heterogeneous neural responses arise from a low-dimensional circuit mechanism. The model simultaneously infers a low-dimensional latent circuit connectivity generating task-relevant dynamics and heterogeneous mixing of these dynamics in single-neuron responses. We applied latent circuit inference to RNNs optimized on a context-dependent decision-making task and found a circuit mechanism based on the inhibition of irrelevant sensory representations. We validated this mechanism by confirming the behavioral effects of patterned perturbations of the RNN activity and connectivity predicted by the latent circuit model. Moreover, fitting the latent circuit model to neural recordings from the PFC of monkeys performing the same context-dependent decision-making task revealed a qualitatively similar suppression mechanism, in contrast to previous analyses of the same data with correlation-based methods^19,21. Using RNN perturbations, we show that dimensionality reduction methods that do not incorporate causal interactions among latent variables are biased toward uncovering behaviorally irrelevant representations. Our results show that high-dimensional networks use low-dimensional circuit mechanisms, establish the feasibility of inferring these mechanisms from neural response data and open new possibilities of causally validating circuit mechanisms in perturbation experiments.

Results

Latent circuit model

To bridge the gap between dimensionality reduction, circuit mechanisms and single-neuron heterogeneity, we develop a latent circuit model (Fig. 1c). Similar to other dimensionality reduction methods, we model high-dimensional neural responses (yin {{mathbb{R}}}^{N}) (N is the number of neurons) during a cognitive task using low-dimensional latent variables (xin {{mathbb{R}}}^{n}) as

$$y=Qx,$$

(1)

where (Qin {{mathbb{R}}}^{Ntimes n}) is an orthonormal embedding matrix and n ≪ N. The latent variables x are constrained to be nodes in a neural circuit with dynamics

$$dot{x}=-x+f({w}_{{textrm{rec}}}x+{w}_{{textrm{in}}}u),$$

(2)

where f is a rectified linear (ReLU) activation function. The latent nodes interact via the recurrent connectivity w_rec and receive external task inputs u through the input connectivity w_in. We also require the latent circuit to perform the task, that is, we can read out the task outputs z from circuit activity via the output connectivity w_out:

$$z={w}_{{rm{out}}}x.$$

(3)

The latent circuit model captures task-related neural activity in the low-dimensional subspace spanned by the columns of Q, with dynamics within this subspace generated by the neural circuit Eq. (2). We infer the latent circuit parameters (Q, w_rec, w_in and w_out) from neural activity y by minimizing the loss function L = ∑_k,t∥y − Qx∥₂ + ∥z − w_outx∥₂, where k and t index trials and time within a trial, respectively (Methods).

In the latent circuit model, the heterogeneity of single-neuron responses has three possible sources: mixing of task inputs to the latent circuit via w_in, recurrent interactions among latent nodes via w_rec and linear mixing of representations in single neurons via the embedding Q. The orthonormality constraint on Q implies that the projection defined by the transpose matrix Q^T is a dimensionality reduction in which projection onto the ith column of Q correlates with the activity of the ith node in the latent circuit. Conversely, the image of each latent node i is a high-dimensional activity pattern given by the column q_i of the matrix Q. Thus, the latent circuit provides a dimensionality reduction that incorporates an explicit mechanistic hypothesis for how the resulting low-dimensional dynamics are generated.

In general, it is not obvious under what circumstances we can satisfactorily fit a latent circuit model to the responses of a high-dimensional system. If, for example, solutions to cognitive tasks that emerge in large systems are qualitatively different from mechanisms operating in small circuits, then we should not be able to adequately fit task-related dynamics of the large system with a low-dimensional circuit model. However, the existence of a low-dimensional circuit solution that accurately captures dynamics of the large system would suggest that this circuit mechanism may be latent in the high-dimensional system.

Interpreting latent connectivity

The advantage of the mechanistic model for latent dynamics is that we can interpret the latent connectivity and relate it to the connectivity of the high-dimensional system. In this context, RNNs optimized to perform a cognitive task provide an ideal setting for testing and validating the latent circuit inference. RNNs mimic the heterogeneity and mixed selectivity of neural responses in the cortex during cognitive tasks while providing full access to each unit’s activity, network connectivity and behavioral outcomes.

To interpret the latent connectivity, we differentiate the embedding Eq. (1) to obtain the correspondence between vector fields of the high-dimensional and low-dimensional dynamical systems (Fig. 1d and Methods). We can then derive an explicit relationship between connectivity matrices of a high-dimensional RNN and a low-dimensional latent circuit. We consider RNNs with dynamics

$$dot{y}=-y+f({W}_{{rm{rec}}},y+{W}_{{rm{in}}}u),$$

(4)

where W_rec and W_in are the recurrent and input connectivity matrices, respectively (Methods). Using the fact that the vector fields of the RNN and latent circuit are piecewise-linear dynamical systems, we derive a relationship between their recurrent and input connectivity matrices (Methods):

$${Q}^{T}{W}_{{rm{rec}}}Q={w}_{{rm{rec}}},;{Q}^{T}{W}_{{rm{in}}}={w}_{{rm{in}}}.$$

(5)

The relation Eq. (5) shows that the latent circuit connectivity w_rec is a low-rank structure in the connectivity of a high-dimensional network, which captures interactions among the latent variables defined by the columns of Q. This relation does not necessarily imply that the full recurrent connectivity W_rec is low rank^47,49,50. Rather, it is a weaker condition that the linear subspace defined by Q is an invariant subspace of the high-dimensional recurrent connectivity matrix. Because, in practice, we search for the latent circuit by minimizing the loss function L, if L is not exactly equal to 0, then Eq. (5) holds only approximately.

The relation between connectivity matrices has the powerful consequence that we can validate the latent circuit mechanism directly in the RNN connectivity. First, if the latent circuit faithfully describes the mechanism operating in the RNN, by conjugating the RNN connectivity matrix with Q (Eq. (5)), we expect to find low-dimensional connectivity structure similar to the latent circuit connectivity. Such an agreement is nontrivial because the latent circuit inference uses only RNN activity without knowledge of the RNN connectivity. Second, Eq. (5) enables us to translate connectivity perturbations in the latent circuit onto the connectivity perturbations in the RNN. Specifically, a change in the connection δ_ij between nodes i and j in the latent circuit maps onto a rank-one perturbation of the RNN connectivity matrix (Fig. 1e and Methods),

$${delta }_{ij}to {q}_{i}{q}_{j}^{T},$$

(6)

where q_i is the ith column of Q. By translating latent connectivity perturbations onto the RNN, we can verify whether these connectivity perturbations affect RNN behavioral performance as predicted by the latent circuit model. The validation of inferred circuit mechanisms via RNN perturbations is critical because the fit quality alone does not guarantee that the inferred model captures the correct mechanism that generated data⁴⁰. Thus, confirming predicted behavioral effects of connectivity perturbations establishes the existence of the inferred low-dimensional circuit mechanism in the RNN.

Latent circuit for context-dependent decision-making

We applied our latent circuit inference to RNNs optimized to perform a context-dependent decision-making task, which requires the discrimination of either the color or motion feature of a sensory stimulus depending on the context cue¹⁹ (Fig. 2a,b and Methods). The RNN successfully learns the task; it makes choices according to the relevant stimulus and ignores the irrelevant stimulus in each context (Fig. 2c). After training, RNN units show heterogeneous mixed selectivity for multiple task variables (Supplementary Fig. 1). The structure in the RNN connectivity responsible for generating the correct behavioral outputs is not immediately obvious (Fig. 2d).

**Fig. 2: RNN model of context-dependent decision-making.**

We fitted the latent circuit model to the responses of RNN units and its output behavior during the task. The latent circuit model consisted of eight nodes corresponding to task variables: two context nodes, two sensory color nodes, two sensory motion nodes and two choice nodes. The identity of each node is derived from its input or output, facilitating the interpretation of the latent circuit mechanism. This choice of the latent circuit dimensionality agrees with the dimensionality of RNN responses after training, which is usually close to the total number of inputs and outputs (the first eight principal components accounted for 97.9% of the total variance in RNN responses; Supplementary Fig. 2). The fitted latent circuit model captured an overwhelming amount of variance in the RNN activity (coefficient of determination r² = 0.96 on test data) and accurately matched projected RNN trajectories (Supplementary Fig. 3).

The inferred recurrent connectivity w_rec of the latent circuit revealed an interpretable mechanism for context-dependent decision-making (Fig. 3a,b). In the latent circuit, sensory nodes representing stimuli associated with the left choice (left motion and green color) have excitatory connections to the left choice node, and sensory nodes representing stimuli associated with the right choice (right motion and red color) have excitatory connections to the right choice node. This pattern of connections from sensory to choice nodes implements two alternative stimulus–response mappings in the task. Further, the color context node has inhibitory connections to the sensory nodes representing motion, and the motion context node has inhibitory connections to sensory nodes representing color. This pattern of connections from the context nodes to the sensory nodes implements a suppression mechanism that inhibits the irrelevant stimulus–response mapping in each context. Because the irrelevant sensory representation is suppressed, it does not drive the decision output. This suppression mechanism based on inhibition of irrelevant representations is qualitatively similar to mechanisms for context-dependent decision-making hypothesized in previous hand-crafted neural circuit models^8,9.

**Fig. 3: Latent circuit mechanism in the RNN performing a context-dependent decision-making task.**

We verified that the suppression mechanism revealed in the latent circuit connectivity depended significantly on the RNN responses beyond the constraints imposed by the task alone (Extended Data Fig. 1 and Methods), suggesting that this mechanism reflects the dynamics of the RNN. We therefore proceeded to validate the inferred circuit mechanism directly in the RNN in three ways: in the RNN activity, in the RNN connectivity and by confirming behavioral effects of the RNN perturbations predicted by the latent circuit model.

First, we verified the signatures of the suppression mechanism in the RNN activity. We projected RNN responses onto the columns of Q, which represent RNN activity patterns that correlate with the activity of nodes in the latent circuit (Fig. 3c). By projecting RNN responses onto the difference of two columns of Q corresponding to the context nodes, we obtain a one-dimensional latent variable correlated with the activity difference of the motion context and color context nodes in the latent circuit. This projection shows RNN trajectories diverging into opposite directions in state space according to context. Next, the choice axis is the difference of two columns of Q corresponding to the left and right choice nodes in the latent circuit. Projecting RNN activity onto the choice axis reveals trajectories separating according to choice regardless of context. Further, the motion axis is the difference of columns of Q corresponding to the left and right motion nodes, and the color axis is the difference of columns of Q corresponding to the red and green color nodes. Projections of RNN activity onto the motion and color axes reveal representations of relevant sensory stimuli, whereas representations of irrelevant stimuli are suppressed. In particular, along the color axis, RNN trajectories separate according to color coherence only on color context trials, whereas on motion context trials, the activity along this axis is suppressed. Similarly, activity along the motion axis is suppressed on color context trials. The suppression of irrelevant sensory representations in RNN activity is consistent with the inhibitory mechanism revealed in the latent circuit connectivity w_rec.

Second, we used the connectivity relationships Eq. (5) to directly validate the latent circuit mechanism in the RNN connectivity. We conjugated the RNN connectivity matrices with the embedding matrix Q. The resulting matrices closely match the connectivity in the latent circuit (Fig. 3d; correlation coefficient r = 0.89). This agreement confirms that the latent connectivity structure indeed exists in the RNN.

Finally, to ultimately validate that this latent connectivity structure supports the behavioral task performance, we tested whether patterned perturbations of the RNN connectivity (Eq. (6)) produced the same behavioral effects as predicted by the latent circuit model. We consider two perturbations designed to test the inhibitory mechanism. The first perturbation ‘turns off’ the context mechanism by weakening the inhibitory connections from a context node to sensory nodes representing irrelevant stimuli in that context (Fig. 4a). In the RNN, this perturbation maps onto a rank-one change in the recurrent connectivity (Fig. 4b). The latent circuit mechanism predicts that weakening the inhibitory connections from the motion context node to sensory nodes representing color would make the circuit sensitive to the irrelevant color information on motion context trials. Indeed, weakening these connections in the latent circuit produced the predicted behavioral effect in the psychometric function, visible as a rotation of the decision boundary on motion context trials (Fig. 4a). Perturbations of the RNN connectivity along the corresponding pattern produced similar behavioral effects (Fig. 4b), confirming that this connectivity pattern implements suppression of irrelevant sensory representations in the RNN.

**Fig. 4: Validating the circuit mechanism via perturbations of RNN connectivity.**

The second perturbation ‘turns off’ one of the stimulus–response mappings by weakening the excitatory connection from a sensory node to a choice node. The latent circuit mechanism predicts that weakening the excitatory connection from the red color node to the right choice node (Fig. 4c) would impair the network’s ability to make right choices on the color context trials. Weakening this connection in the latent circuit indeed decreased the frequency of right choices on color context trials (Fig. 4c). This perturbation maps onto a rank-one connectivity perturbation in the RNN, which produced similar behavioral effects (Fig. 4d). This result confirms both the behavioral relevance of the latent sensory representation and the excitatory mechanism by which it drives choices in the RNN.

Together, these results confirm that the RNN uses the suppression mechanism in which context representations inhibit irrelevant sensory representations. This mechanism is reflected in the low-dimensional dynamics revealed by projecting RNN activity onto axes defined by the latent circuit embedding Q. We identified this mechanism as a latent low-dimensional structure in the RNN connectivity and ultimately validated it by confirming behavioral effects of the RNN connectivity perturbations.

Space of latent circuit mechanisms

We next asked whether RNNs trained on the same task arrive at different circuit solutions for context-dependent decision-making. The latent circuit inference enables us to determine whether different RNNs use the same circuit mechanism. Although two RNNs trained on the same task may have distinct high-dimensional connectivity, the latent circuit inference can reveal whether these RNNs use similar low-dimensional connectivity structure to generate task-relevant dynamics. Therefore, we assessed the similarity of task solutions in RNNs by comparing their low-dimensional latent connectivity.

To explore the space of circuit mechanisms, we trained an ensemble of 200 RNNs with randomly initialized connectivity to the same level of task performance (r² = 0.93 ± 0.01, coefficient of determination for the RNN and target outputs, mean ± s.d. across networks; Supplementary Fig. 4). For each of these RNNs, we fitted an ensemble of 100 latent circuit models starting with random initializations of the parameters for latent connectivity and embedding Q and selected 10 latent circuits with the highest fit quality on the test data, which formed the set of converged solutions (Methods). All converged latent circuits provided accurate fits of the RNN responses (Supplementary Fig. 4).

To visualize the space of latent circuit solutions, we applied a principal component analysis to the flattened connectivity matrices w_rec and projected the data onto the first two principal components, which accounted for 42% of total variance (Fig. 5a). The converged latent circuits fitted to the responses of a single RNN fell within a close proximity of each other (Fig. 5a), and their connectivity was highly correlated (r = 0.98 ± 0.02, mean correlation coefficient between connectivity weights of the best and other converged circuits, mean ± s.d. across RNNs; Methods), which indicates the uniqueness of the latent circuit mechanism in each particular network. The variability of task solutions was much greater across RNNs; the variance of latent connectivity across all RNNs was about four times the average variance in latent connectivity across multiple fits of a single RNN (variance ratio of 4.3).

**Fig. 5: Space of latent circuit mechanisms for context-dependent decision-making in RNNs.**

The latent circuit solutions from all RNNs formed three major clusters. RNNs in all clusters had similar task performance (Supplementary Fig. 5). The correlation coefficients between the latent connectivity of all RNNs within a cluster and the mean connectivity of that cluster was high (mean correlation coefficients: r = 0.94 cluster 1, r = 0.9 cluster 2 and r = 0.91 cluster 3; Extended Data Fig. 2), which indicates that the RNNs within each cluster had a similar circuit mechanism. To test whether circuit mechanisms varied between RNNs from different clusters, we sampled latent circuits randomly from the ensemble and fitted these circuits to responses of randomly sampled RNNs, optimizing only the embedding matrix Q while holding the latent circuit connectivity fixed. These latent models produced a significantly worse fit when the latent circuit and the target RNN were sampled from different clusters than from the same cluster (Extended Data Fig. 2m; one-sided Mann–Whitney U-test, U = 52, 593, P < 10⁻¹⁰), confirming differences in circuit mechanisms across clusters. This result further reinforces that task alone does not uniquely determine the circuit mechanism, and not any connectivity sufficient to perform the task can account for a specific set of neural responses in an RNN.

To understand how the circuit mechanisms vary across clusters, we examined the mean latent connectivity matrix of each cluster. The mean connectivity matrices revealed circuits that all showed signatures of the suppression mechanism, with context nodes inhibiting irrelevant sensory nodes (Fig. 5b). In the main cluster, the circuits were balanced and symmetric, with approximately equal strength of excitation or inhibition between nodes representing different contexts and stimulus–response mappings. This balance was reflected in the dynamics and representations of stimuli (Fig. 5c). In the two other clusters, the circuits showed asymmetry in connectivity, with stronger inhibition from context to some sensory nodes counterbalanced by stronger self-excitation for these sensory nodes. These asymmetries were consistently reflected in dynamics and the representations of stimuli, which showed a bias toward the left or right stimulus representations depending on the cluster. Although the circuit solutions in two of the clusters exploit asymmetries in the representations of sensory evidence, they still operate by an inhibitory mechanism in which irrelevant responses are suppressed (Extended Data Fig. 3).

The inhibitory suppression mechanism that we consistently found across RNNs may seem distinct from a dynamical selection vector mechanism previously identified in RNNs trained on a similar task, which apparently does not require suppression of irrelevant sensory responses¹⁹. We analyzed dynamics in our RNNs and found the same selection vector mechanism (Extended Data Fig. 4), indicating that the dynamical selection vector mechanism is a local linear description of the inhibitory circuit mechanism. Furthermore, we found the same inhibitory suppression mechanism in RNNs trained without constraining their inputs to be orthogonal (Supplementary Fig. 6) and in RNNs with different biologically plausible nonlinearities (Extended Data Figs. 5 and 6). Thus, the space of latent circuit solutions found by all our RNN models of context-dependent decision-making can be characterized by a common suppression mechanism.

Representations of irrelevant stimuli in the PFC

Our finding that RNNs use the inhibitory mechanism for context-dependent decision-making appears in conflict with previous work, which suggested that in both RNNs and the PFC, irrelevant sensory responses are not significantly suppressed^19,21,51. This conclusion was derived using dimensionality reduction methods that fit neural responses with regression models^{19,21,41,42,43} to find low-dimensional projections that best correlate with task variables (Fig. 1b). In these projections, task variables do not interact but are demixed in orthogonal dimensions. By contrast, representations of task variables in the latent circuit model interact via recurrent connectivity to implement the computations necessary to solve the task. We therefore sought to determine whether the latent circuit and regression models identify different task representations in the same PFC responses.

We fitted the latent circuit model to the same dataset of PFC recordings during context-dependent decision-making as in the previous studies^19,21. The dataset consists of several hundred PFC neurons (n = 727 and 574 for monkeys A and F, respectively) recorded from two rhesus monkeys performing a context-dependent decision-making task¹⁹ (Fig. 2a). We fitted latent circuit models to smoothed condition-averaged PFC responses during a 750-ms window starting 100 ms after the stimulus onset¹⁹ (Methods). The latent circuit model provided good fits of PFC responses projected onto the low-dimensional subspace spanned by the columns of the inferred embedding matrix Q for both monkeys (r² = 0.88 and 0.76 on test data for monkeys A and F, respectively; Extended Data Figs. 7a and 8a). The task subspace Q explained a smaller fraction of total variance in the PFC responses (11.0% and 7.0% for monkeys A and F, respectively) than in RNNs (Supplementary Fig. 2), which is comparable to previous reports of task-relevant variance in the PFC⁴² and reflects the high dimensionality of PFC activity¹⁹.

Similar to the RNNs, projecting PFC responses onto the axes identified by the latent circuit model revealed significant suppression of stimulus representations when they were irrelevant (Fig. 6a and Extended Data Figs. 9a and 10; one-sided Mann–Whitney U-test; monkey A: motion P = 1.4 × 10⁻⁹ and U = 120, color P = 10⁻¹⁰ and U = 83; monkey F: motion P = 1.5 × 10⁻⁵ and U = 277, color P = 1.6 × 10⁻⁷ and U = 194; n = 36). This suppression was not due to correlation of activity along the motion and color axes with choice (Extended Data Figs. 7b and 8b). Consistent with the suppression seen in projected PFC activity, the inferred latent circuit connectivity showed inhibitory connections from context nodes to sensory nodes representing irrelevant stimuli in each context (Fig. 6b and Extended Data Fig. 9b). We confirmed that PFC responses significantly constrained the inferred latent connectivity above the effect of the task (Extended Data Figs. 7c and 8c and Methods), indicating that the suppression mechanism reflects dynamics in the PFC data. The fact that the latent circuit model correctly performs the task and accurately fits projected PFC responses indicates that the suppression seen in the PFC data is sufficient to produce the context-dependent decision-making behavior.

**Fig. 6: Representations of irrelevant stimuli in the PFC during context-dependent decision-making.**

The latent circuit model identified a subspace in PFC activity in which representations of irrelevant stimuli were suppressed, whereas regression methods uncovered subspaces in the same PFC data in which stimuli were nearly equally represented across contexts^19,21. To confirm that this difference results from recurrent interactions among task variables in the latent circuit model, we fitted PFC data with a modified latent circuit model in which the latent recurrent connectivity was constrained to be 0. In this model, like in regression models, the latent variables do not causally influence each other. Indeed, this model found low-dimensional subspaces in which irrelevant sensory representations were less suppressed in both monkeys (Fig. 6c and Extended Data Fig. 9c), consistent with previous studies^19,21. These subspaces explained a smaller amount of total variance in PFC responses (0.6% and 1.1% for monkeys A and F, respectively) than the latent circuit model. Thus, it is possible to find both types of representations in the PFC, and the question arises about which of these possible representations are causally linked to behavior.

Although we cannot directly assess the behavioral relevance of different representations in the PFC via perturbations, we can test the inhibitory suppression mechanism using neural activity on error trials. Specifically, the suppression mechanism predicts that the representations of irrelevant stimuli (along the motion and color axes identified by the latent circuit model) should be less suppressed on error than correct trials. We tested this prediction on incongruent trials on which the relevant and irrelevant stimuli point to opposite choices (Methods) because the errors on incongruent trials are more likely to result from a failure of the contextual mechanism than other factors (for example, attention lapse). The prediction was clearly borne out by the data; in both contexts for both monkeys, the irrelevant representations were significantly less suppressed on error than correct trials (Extended Data Figs. 7d and 8d; combined condition Mann–Whitney U-test; monkey A: P = 0.0086, U = 0.313 and n = 24 on color context trials and P = 0.0002, U = 0.354 and n = 24 on motion context trials; monkey F: P = 0.0002, U = 0.333 and n = 26 on color context trials and P < 10⁻¹⁰, U = 0.375 and n = 30 on motion context trials), suggesting that the representations identified by the latent circuit model in PFC activity are related to the behavioral task execution.

Behavioral relevance of low-dimensional representations

To further test for behavioral relevance of representations identified by different dimensionality reduction methods, we again turned to RNNs in which we can measure behavioral effects of arbitrary activity perturbations. We compared projections of RNN activity onto axes obtained from the latent circuit model and a linear decoder. We trained a linear decoder to predict the signed motion coherence on each trial from RNN activity (Methods). The decoding weights provide an axis in the RNN state space such that a projection onto this axis correlates with the motion coherence. By projecting RNN responses onto the decoder axis, we find a strong representation of irrelevant motion stimulus on color context trials without noticeable suppression (Fig. 7a). Thus, as in the PFC, irrelevant sensory representations in our RNN appear not suppressed along the decoder axis, whereas they appear suppressed along the axis obtained from the latent circuit model (Fig. 7b).

**Fig. 7: Representations of irrelevant stimuli in the RNN do not drive behavior.**

How can we reconcile these qualitatively distinct perspectives on representations of irrelevant stimuli within the same RNN? We hypothesized that the appearance of irrelevant stimulus representations is possible because the linear decoder compromises behavioral relevance for decoding accuracy. To test this idea, we stimulated RNN units with activity patterns aligned with the axes obtained from the linear decoder and the latent circuit model. If the corresponding activity patterns are behaviorally relevant, we expect the stimulation to have a substantial effect on psychometric functions. Specifically, stimulating the representation of the right motion stimulus should increase the proportion of right choices, shifting the decision boundary on motion context trials. As expected, driving RNN activity along the motion axis of the latent circuit model shifted the decision boundary on motion context trials (Fig. 7d). By contrast, stimulation of the same magnitude along the decoder axis had little effect on the psychometric function (Fig. 7c). The irrelevant stimulus representations exist along the decoder axis but do not drive the behavioral output. For context-dependent decision-making, we conclude that the dynamics revealed by the decoder have little behavioral relevance and thus do not invalidate the inhibitory mechanism identified by the latent circuit model. Our results indicate that ‘demixing’ representations of task variables^19,21,42 may not be the right objective for identifying behaviorally relevant patterns in neural activity and may provide a misleading picture of computation.

Discussion

Single neurons in higher cortical areas show complex heterogeneous responses during cognitive tasks, posing a challenge for identifying mechanisms of cognitive functions. Our latent circuit model accounts for single-neuron heterogeneity via dimensionality reduction that incorporates low-dimensional circuit dynamics in its latent variables. We show that low-dimensional circuit mechanisms can explain task-relevant dynamics in high-dimensional networks and establish feasibility of inferring these mechanisms from neural response data. Our theory for interpreting the latent circuit connectivity as a low-rank connectivity in RNNs enables causally validating low-dimensional circuit mechanisms via activity and connectivity perturbations in high-dimensional networks. The latent circuit inference can be broadly applied to identify circuit mechanisms for different cognitive tasks from neural response data (Supplementary Fig. 7) and opens new possibilities for causally testing these mechanisms in future experiments.

Although previous studies extensively modeled neural responses using various latent dynamical systems, our work demonstrates feasibility of fitting neural responses with low-dimensional recurrent circuits. Most methods fitting low-dimensional dynamical systems to neural data do not explicitly model task inputs and behavior and do not ground the inferred dynamics in the underlying network connectivity^{32,33,40,52,53}. On the other hand, high-dimensional RNN architectures serve as intermediate mechanistic models of neural responses, which generate latent dynamics from high-dimensional recurrent connectivity^{25,26,27,28,29}. Although these models can incorporate task inputs and behavioral outputs^27,28,29,51, their high-dimensional connectivity is not uniquely constrained by low-dimensional data, limiting possible insights into circuit mechanisms transforming sensory inputs to behavior. Our results motivate future work incorporating low-dimensional recurrent circuits as latent dynamics generators within these model architectures, which will enable combining their ability of fitting single-trial neural activity^26,29 with the uniqueness and interpretability furnished by the latent circuit model.

By relating connectivity to neural dynamics and behavior, the latent circuit model extends the causal predictive power of classical neural circuit models to the study of mechanisms of cognitive functions in high-dimensional networks. Although dynamical mechanisms have been studied in RNNs by linearizing the RNN flow field around fixed points^19,31, these dynamical mechanisms do not specify how a particular fixed-point configuration arises from the RNN connectivity. Although dynamical mechanisms can predict changes in activity under perturbations of the dynamical system’s state or inputs, circuit mechanisms provide additional leverage as they can also predict how the dynamical system itself will change under perturbations of activity or connectivity within the network. The causal predictive power of the latent circuit model is further supported by its ability to predict RNN dynamics and behavior for out-of-distribution inputs (Supplementary Fig. 8).

We find that RNNs do not necessarily find qualitatively distinct solutions to cognitive tasks from mechanisms in low-dimensional neural circuit models. We show that low-dimensional mechanisms can be found in large RNNs if connectivity is viewed in the appropriate basis. In other words, just as dynamics can be understood in terms of latent variables⁵⁴, connectivity can be understood in terms of interactions between these latent variables. This perspective is qualitatively similar to previous work on engineering low-dimensional task solutions in RNNs with low-rank connectivity^47,49,55. However, it was unclear whether low-rank RNNs use mechanisms similar to classical circuit models or implement truly novel solutions that emerge only in high-dimensional nonlinear systems. Our work explicitly relates low-dimensional recurrent circuits and low-rank connectivity in RNNs and enables inferring latent circuit structure in generic RNNs trained without low-rank connectivity constraints. We find that these generic RNNs also use low-rank connectivity to perform the task, although their full high-dimensional connectivity is not necessarily low rank.

We found that RNNs trained on a context-dependent decision-making task use a suppression mechanism in which context nodes inhibit irrelevant sensory nodes. The inhibitory suppression mechanism revealed by the latent circuit model is qualitatively similar to previous neural circuit models of how PFC flexibly switches between alternative stimulus–response mappings^8,9,10,44,45. Thus, RNNs do not find qualitatively distinct solutions to this task, and complex selectivity of single neurons has a simple explanation as a linear mixing of the low-dimensional latent inhibitory circuit mechanism. Although we found a variety of task solutions across RNNs trained with the same hyperparameters to the same performance level, all these solutions were based on the same suppression mechanism. In general, a solution to which an RNN converges may depend on numerous hyperparameters (for example, for initialization and regularization), and the latent circuit inference offers a quantitative tool for characterizing how these hyperparameters influence the space of task solutions.

The latent circuit model revealed a suppression of irrelevant sensory representations in PFC responses of monkeys performing a context-dependent decision-making task. By contrast, correlation-based dimensionality reduction methods found no significant suppression of irrelevant stimuli in the same data^19,21. This difference results from recurrent interactions among task variables in the latent circuit model, and dimensionality reduction methods that do not incorporate these interactions are biased toward uncovering sensory representations that are not modulated by context. Considering interactions among task variables is crucial as they implement the computations necessary to solve the task, and omitting them provides a misleading picture of computation. We show that representations of irrelevant stimuli also exist in RNNs that provably implement an inhibitory suppression mechanism, but these representations do not causally drive choices. Thus, inhibitory mechanisms for cognitive flexibility are compatible with the existence of irrelevant stimulus representations in the PFC.

The latent circuit model opens a route for interpreting circuit mechanisms in high-dimensional networks. A prerequisite for a model to be interpretable is its uniqueness⁴⁰. Whereas low-dimensional data do not uniquely constrain the full high-dimensional connectivity in RNNs, we show that we can uniquely recover the latent low-rank connectivity within a high-dimensional network, which therefore can be reliably interpreted. We operationally define interpretation of the circuit as the ability to achieve prescribed behavioral effects through perturbations of specific nodes or connections (Figs. 4 and 7). In general, small recurrent circuits can generate complex dynamics that are difficult to intuit from connectivity alone⁵⁶. In such cases, we can dissect the circuit function by analyzing the effects of connectivity perturbations on the dynamics and behavior⁵⁷. Our work translates such circuit dissection methods^56,57 to high-dimensional networks in which, unlike in small circuits, individual connections do not carry specific functions, but instead these functions arise from distributed connectivity patterns.

Interpreting circuit mechanisms is more accessible in low-dimensional tasks, which have been extensively used to probe functions of higher cortical areas. Cumulative findings suggest that similar dynamical and circuit motifs are used across many tasks, and, moreover, solutions to more complex tasks can arise by composing simple motifs^1,24,58. Consistent with this idea, we find that RNNs trained on different tasks use a similar inhibitory control mechanism (Supplementary Fig. 7), which may generalize to other cognitive control tasks as well. Therefore, low-dimensional circuit mechanisms may provide the primitives for building more complex cognitive functions, and our work suggests a path forward for interpreting these mechanisms in neural data.

Methods

Fitting a latent circuit model

We fit the latent circuit model Eqs. (1)–(3) to neural response data y by minimizing the mean squared error loss function,

$$L=sum _{k}sum _{t}parallel {y}_{tk}-Q{x}_{tk}{parallel }_{2}+parallel {z}_{tk}-{w}_{{rm{out}}}{x}_{tk}{parallel }_{2},$$

(7)

using custom Python code⁵⁹. Here, k indexes the trials, t indexes the time within a trial, and Q is an orthonormal embedding matrix. Because the variable x depends implicitly on the latent circuit parameters w_rec and w_in, the minimization of L is a nonlinear least squares optimization problem⁶⁰ in which we simultaneously search for a behaviorally relevant projection of the high-dimensional activity and a low-dimensional neural circuit that generates dynamics in this projection. Because orthonormal matrices define a nonlinear submanifold within the space of all matrices, minimizing L corresponds to solving a constrained optimization problem over this submanifold. To transform it into an unconstrained problem, we use the Cayley transform to parameterize orthonormal matrices by the linear space of skew symmetric matrices⁶¹,

$$Q=(I+A){(I-A)}^{-1}{pi }_{n},$$

(8)

where π_n represents projection onto the first n columns, and A is skew symmetric. We parameterize A by an arbitrary square N × N matrix B,

$$A=B-{B}^{T}.$$

(9)

With these reparameterizations, we can minimize L over the vector space of square matrices B. The parameterization of a skew symmetric matrix A with the auxiliary matrix B has a degeneracy because A has only N(N − 1) / 2 distinct elements. We did not attempt to eliminate this degeneracy because B is an auxiliary matrix, and we did not observe any degeneracy arising in matrix Q during fitting.

At each step of the optimization, we generate a set of trajectories x from the latent circuit dynamics and embed these trajectories using the matrix Q. The parameters B, w_rec, w_in and w_out are then updated to minimize L. We perform this minimization using PyTorch and the Adam optimizer with default values 0.9 and 0.999 for the decay rate of the first and second moment estimates, respectively, a learning rate of 0.02 and a weight decay of 0.001. We use a minibatch size of 128 trials. We stop the optimization when the loss has not improved by a threshold of 0.001 after a patience of 25 epochs. We used the Python software package Seaborn for visualizing model parameters and responses after training.

We initialize the recurrent matrix w_rec from a uniform distribution centered on 0 with a standard deviation of 1/n. We initialize w_in with zeros except for positive entries along the diagonal on connections from inputs u to their corresponding nodes and w_out with zeros except for positive entries on connections from choice nodes to their corresponding outputs z. We initialize the entries of matrix B from a uniform distribution on [0, 1].

When fitting the latent circuit model, we found some amount of variability in solutions across multiple optimization runs with different initialization. To control for this variability, we fitted a large ensemble of latent circuit models (n = 100) with different initialization of the parameters for latent connectivity and embedding Q for each RNN model. This ensemble of latent circuit models for a single RNN has variable fit quality because many optimization runs do not converge to the optimal solution (which is typical for nonconvex optimization). Therefore, we selected the best ten latent circuit models from this ensemble in terms of fit quality on held out test data, which formed a set of converged solutions (Fig. 5a). To quantify the uniqueness of the latent circuit solution in each RNN, we computed the correlation coefficients between the recurrent connectivity weights of the best model and the remaining nine converged models. We can use the correlation coefficient because the identity of each node in the latent circuit is defined by its input and output connectivity, eliminating any permutation symmetries.

Testing the dependence of latent connectivity on neural responses

To determine whether the inferred latent circuit connectivity significantly depends on neural response data beyond the constraints imposed by the task alone, we performed a permutation test (Extended Data Figs. 1, 7c and 8c), which proceeds in three steps. First, we fit N latent circuit models to neural responses and select the best model in terms of fit quality on held out test data. We then compute the correlation coefficients between recurrent connectivity of the best model and all other models. The distribution of these correlation coefficients estimates how variable the latent connectivity is across models fitted to the original neural responses. Second, we shuffle neural responses N times and fit a latent circuit model to each shuffle, resulting in N latent circuit models. Our shuffling procedure randomly permutes neural responses with respect to trial conditions while preserving the input–output relationship on each trial so that the fitted latent circuit models can still perform the task. We confirm that the latent circuit models fitted to the shuffled RNN responses perform the task at high accuracy (Extended Data Fig. 1). These latent models serve as a control to assess whether the inferred latent connectivity emerges merely from the task constraints alone, and they should not be viewed as models of any specific high-dimensional network. We then compute the correlation coefficients between the connectivity of all models fitted to the shuffled data and the best model from the original data fit. Third, we use a Mann–Whitney U-test to determine whether the correlation coefficients are significantly smaller for models fitted to the shuffled responses than original neural responses. This outcome would indicate that models fitted to shuffled neural responses use more diverse connectivity to perform the task than models fitted to the original data; thus, neural responses significantly constrain the inferred connectivity above the effects of the task. We used the same test for both RNN (N = 500; Extended Data Fig. 1) and PFC data (N = 1,000; Extended Data Figs. 7c and 8c).

Relationship between connectivity of the RNN and latent circuit

We consider RNNs of the form

$$tau dot{y}=-y+{left[{W}_{{rm{rec}}},y+{W}_{{rm{in}}}uright]}_{+}.$$

(10)

Here, [⋅]₊ is a rectified linear (ReLU) activation function, τ is a time constant, and u are external task inputs. W_rec and W_in are the recurrent and input connectivity matrices, respectively. We read out a set of task outputs z from the network activity via the output connectivity matrix W_out,

$$z={W}_{{rm{out}}},y.$$

(11)

We derive a relationship between the connectivity matrices of the RNN and latent circuit, which allows us to interpret the latent circuit connectivity as a latent connectivity structure in the RNN. To derive this relationship, we differentiate the embedding Eq. (1) with respect to time and obtain the relationship between the vector fields of the RNN and latent circuit,

$$V(,y)=Qv(x).$$

Here, the vector fields (dot{y}=V(,y)) of the RNN and (dot{x}=v(x)) of the latent circuit are given by Eqs. (10) and (2), respectively. This equation states that the subspace spanned by the columns of Q is an invariant subspace of the high-dimensional system; that is, the vector field at any point in this subspace lies entirely in this subspace. We then use the orthonormality condition Q^TQ = I to obtain

$${Q}^{T}V(Qx)=v(x).$$

Substituting the vector fields Eq. (10) and Eq. (2) in this relation gives us the equality

$${Q}^{T}{[{W}_{{rm{rec}}}Qx+{W}_{{rm{in}}}u]}_{+}={[{w}_{{rm{rec}}}x+{w}_{{rm{in}}}u]}_{+}.$$

(12)

Because this is an equality of two piecewise-linear systems, it holds for each local linear piece individually. In particular, assuming that both the inputs u and W_in are positive, we take x sufficiently near 0, where the argument of the nonlinearity is positive for all units. In this local linear piece, we have the equality of linear systems

$${Q}^{T}{W}_{{rm{rec}}}Qx+{Q}^{T}{W}_{{rm{in}}}u={w}_{{rm{rec}}}x+{w}_{{rm{in}}}u.$$

(13)

If we assume that this equality holds in some open set, then we can equate the terms to obtain an equality of connectivity matrices:

$${Q}^{T}{W}_{{rm{rec}}}Q={w}_{{rm{rec}}},$$

(14)

$${Q}^{T}{W}_{{rm{in}}}={w}_{{rm{in}}}.$$

(15)

This assumption is likely not fully satisfied in the setting of cognitive tasks because the sets of inputs u and latent states x are typically low dimensional. Therefore, the above equalities may hold only approximately. In addition, the equality of piecewise-linear dynamical systems Eq. (12) depends on the correspondence between trajectories of the RNN and latent circuit Eq. (1). Because, in practice, we search for the latent circuit by minimizing the loss function L, if L is not exactly equal to 0, then Eq. (1) and consequently Eqs. (14) and (15) hold only approximately.

We derived the analytical relations between connectivity in the latent circuit and RNN (Eqs. (14) and (15)) assuming that the latent circuit provides a good fit of RNN responses and that their dynamical equations (Eqs. (2) and (10)) have the same nonlinearity. In general, it is unclear whether a latent circuit model can satisfactorily fit responses of a high-dimensional network that has a different nonlinearity and to what extent the relation between their connectivity will hold in this case. To test whether our results extend to networks with a different biologically plausible nonlinearity, we trained RNNs that had a Softplus activation function (f(x)=frac{g}{beta }log (1+{e}^{beta x})) for a range of parameter β and also with varying gain g across units. We fitted responses of these RNNs with our latent circuit model that had a rectified linear (ReLU) activation function and found that this architecture mismatch did not significantly affect the fit quality and the relationship between connectivity (Extended Data Figs. 5 and 6).

To understand how perturbations of connectivity in the latent circuit map onto the RNN, we view perturbations as vectors in the space of matrices. We denote A ⋅ B the dot product between the matrices A and B represented as vectors in the space of matrices; that is, A ⋅ B = ∑_i∑_jA_ijB_ij. Using Eqs. (14) and (15), we then translate connectivity perturbations from the latent circuit to the RNN:

$${w}_{ji}=wcdot {delta }_{ji}={({Q}^{T}WQ)}_{ji}$$

(16)

$$=mathop{sum }limits_{k=1}^{N}{Q}_{kj}left(mathop{sum }limits_{l=1}^{N}{W}_{kl}{Q}_{li}right)$$

(17)

$$=mathop{sum }limits_{k=1}^{N}mathop{sum }limits_{l=1}^{N}{W}_{kl}{Q}_{kj}{Q}_{li}$$

(18)

$$=mathop{sum }limits_{k=1}^{N}mathop{sum }limits_{l=1}^{N}{W}_{kl}{({q}_{j}{q}_{i}^{T})}_{kl}$$

(19)

$$=Wcdot {q}_{j}{q}_{i}^{T}$$

(20)

$$=Wcdot Q{e}_{j}{(Q{e}_{i})}^{T}$$

(21)

$$=Wcdot Q{delta }_{ji}{Q}^{T},$$

(22)

where q_i is the ith column of Q, and e_i is the ith standard unit vector. This chain of equalities shows how to translate perturbations of the latent circuit connectivity in the direction δ_ji onto rank-one connectivity perturbations in the RNN,

$$wcdot {delta }_{ji}=Wcdot Q{delta }_{ji}{Q}^{T}.$$

(23)

Thus, to perturb the latent connection w_ji, we perturb the matrix W in the direction Qδ_jiQ^T. In other words, to increase the dot product between W and Qδ_jiQ^T in the space of matrices, we add multiples of Qδ_jiQ^T to W. Any perturbation orthogonal to Qδ_jiQ^T does not change the dot product and hence has no effect on the latent connection w_ji.

RNN simulations

We simulate dynamics of time-discretized RNNs using the general framework for modeling cognitive tasks²². We consider RNNs with positive activity and N = 50 recurrent units. We obtained the same results with networks consisting of N = 150 units. We discretize the RNN dynamics Eq. (10) using the first-order Euler scheme with a time step Δt and add a noise term to obtain

$${y}_{t}=(1-alpha ){y}_{t-1}+alpha {left[{W}_{{rm{rec}}}{,y}_{t-1}+{W}_{{rm{in}}}{u}_{t}+sqrt{frac{2}{alpha }}{sigma }_{{rm{rec}}}{xi }_{t}right]}_{+}.$$

(24)

Here, α = Δt/τ and ({xi }_{t} sim {mathcal{N}}(0,1)) is a random variable sampled from the standard normal distribution. We set the time constant τ = 200 ms, the discretization time step Δt = 40 ms, and the noise magnitude σ_rec = 0.15. When fitting RNN responses with the latent circuit model, we discretize the latent circuit dynamics Eq. (2) using the same hyperparameter α and the same noise magnitude as was used when training the RNN. The input and output matrices are constrained to have positive entries. The recurrent matrix is constrained to satisfy Dale’s law with 80% excitatory units and 20% inhibitory units. For RNNs shown in the main text, the concatenation of input and output matrices is constrained to be orthogonal. However, our conclusions do not depend on this constraint, and we find similar latent circuit fits and the inhibitory mechanism in RNNs trained with unconstrained inputs (Supplementary Fig. 6). The RNN simulation and training were implemented in Python using the software package PyTorch.

Context-dependent decision-making task

In the context-dependent decision-making task, at the beginning of each trial, a context cue briefly appears to indicate either the color or motion context for the current trial. After a short delay, a sensory stimulus appears that consists of motion and color features. The right motion and red color are associated with the right choice, and the left motion and green color are associated with the left choice. The strength of motion and color stimuli varies from trial to trial as quantified by the motion and color coherence. In the color context, the choice should be made according to the color, ignoring the motion stimulus, and vice versa in the motion context.

To model the context-dependent decision-making task, the network receives six inputs u corresponding to two context cues (u_m: motion context; u_c: color context) and sensory evidence streams for motion (u_m,L: motion left; u_m,R: motion right) and color (u_c,R: color red; u_c,G: color green). The network has two outputs, z₁ and z₂, for which we define two targets z_target,1 and z_target,2. Each trial begins with a presentation of a context cue from t = 320 to t = 1, 000 ms. On motion context trials, the cue input is set to u_m = 1.2 and u_c = 0.2 and vice versa on color context trials. During this epoch, we require that the network does not respond on the outputs by setting z_target = 0.2. After a delay of 200 ms, so that the network must maintain a memory of the context cue, the inputs corresponding to motion and color sensory evidence are presented at t = 1,200 ms for the remaining duration of the trial. From 2,250 ms after the start of the trial and extending to the end of the trial, the targets are defined by z_target,1 = 1.2 and z_target,2 = 0.2 for right choices and vice versa for left choices. The strength of sensory evidence for motion and color varies randomly from trial to trial controlled by the stimulus coherence. We use motion coherence m_c and color coherence c_c ranging from −0.2 to 0.2 chosen from the set {−0.2, −0.12, −0.04, 0.04, 0.12, 0.2}. For each coherence level, the motion and color inputs are given by

$$begin{array}{rcl}{u}_{{rm{m}},{rm{L}}}=frac{1-{m}_{{rm{c}}}}{2},&&;;{u}_{{rm{m}},{rm{R}}}=frac{1+{m}_{{rm{c}}}}{2},\ {u}_{{rm{c}},{rm{G}}}=frac{1-{c}_{{rm{c}}}}{2},&&;;{u}_{{rm{c}},{rm{R}}}=frac{1+{c}_{{rm{c}}}}{2}.end{array}$$

With these definitions, positive motion and color coherence provide evidence for the right choice, and negative motion and color coherence provide evidence for the left choice. At each simulation time step, we add an independent noise term to each of the inputs ({u}_{{rm{noise}}}=sqrt{2{alpha }^{-1}}{sigma }_{{rm{in}}}{eta }_{t}), where ({eta }_{t} sim {mathcal{N}}(0,1)) is a random variable sampled from the standard normal distribution. The input noise strength is σ_in = 0.01. A baseline input u₀ = 0.2 is added to each of the inputs at each time step.

RNN training

To train the RNN, we minimize the mean squared error between the output z(t) and the target z_target(t):

$${mathcal{L}}:= sum _{ikt}{({z}_{ikt}-{z}_{{rm{target}},ikt})}^{2}+{lambda }_{r}sum _{ikt}{y}_{ikt}^{2}.$$

(25)

Here, k is the trial number, t is the time step within a trial, z_ikt is the ith output on trial k and time t, and y_ikt is the response of the ith RNN unit on trial k at time t. The first term is the task error, and the second term serves to regularize by penalizing the magnitude of the firing rates. To encourage the network to integrate sensory evidence over time and to not output responses during the context cue, these task errors are only penalized in the last 750 ms of each trial and during the presentation of the contextual cue. The training is performed with the Adam algorithm. We used the default values 0.9 and 0.999 for the decay rate of the first and second moment estimates, respectively. We used a learning rate of 0.01 and a weight decay of 0.001 and set the hyperparameter λ_r = 0.05.

We control the degree of correlation between the input and output vectors in the RNN by adding an L₂ penalty

$${lambda }_{{rm{orth}}}parallel {B}^{T}B-,text{diag},({B}^{T}B){parallel }_{2}$$

(26)

to the loss function in Eq. (25) during training. Here, B is the matrix corresponding to the concatenation of W_in and ({W}_{{rm{out}}}^{T}) along their second dimension, with columns normalized to unit length. The hyperparameter λ_orth controls the penalty weight. For RNNs in the main text, we set λ_orth = 1, which results in nearly orthogonal input vectors (Supplementary Fig. 6). We fit responses of these RNNs with latent circuit models in which the matrix B is constrained to be diagonal during fitting by setting off-diagonal elements to 0 after each gradient update. By setting λ_orth to a smaller value during RNN training, the input vectors in the trained RNN become slightly correlated (Supplementary Fig. 6). To test the effect of these correlations in the latent circuit model, we add the penalty Eq. (26) to the loss function Eq. (7) during latent circuit fitting (Supplementary Fig. 6). These correlations can be captured in the latent circuit model fitted with smaller values of the corresponding λ_orth hyperparameter. Allowing for these input correlations in RNNs and the latent circuit does not have a strong effect on either fits or the underlying circuit mechanism (Supplementary Fig. 6).

The recurrent connection matrix W_rec is initialized so that excitatory connections are independent Gaussian random variables with mean (1/sqrt{N}) and variance 1/N. Inhibitory connections are initialized with mean (4/sqrt{N}) and variance 1/N. The matrix is then scaled so that its spectral radius is 1.5. To implement Dale’s law, connections are clipped to 0 after each training step if they change sign. During training, we used minibatches of 128 trials with 1,800 trials total.

To assess performance, a choice for the RNN was defined as the sign of the difference between output units at the end of the trial. Psychometric functions were then computed as the percentage of choices to the right for each combination of context, motion coherence and color coherence.

Linear decoding

To decode motion coherence from RNN responses, we fit a linear regression model

$$c=beta y+b,$$

(27)

where (beta in {{mathbb{R}}}^{1times N}) is the vector of regression coefficients, (cin {{mathbb{R}}}^{1times Kcdot T}) is the motion coherence on each trial, (bin {mathbb{R}}) is a bias term, and (yin {{mathbb{R}}}^{Ntimes Kcdot T}) is the RNN responses at each time step during the stimulus epoch of each trial. Here, K is the number of trials, and T is the number of time points within a trial. We split the data into training and test sets and fit the model on the training set. There was no large difference between training and test scores (r² = 0.535 and r² = 0.531), suggesting that the model did not overfit. After fitting, we used the vector of regression coefficients β to define the decoder axis on which we project RNN responses.

Analysis of PFC data

We analyzed a publicly available dataset of neural activity recordings from the PFC (in and around the frontal eye field) from two monkeys performing a context-dependent decision-making task¹⁹. This dataset consisted of 762 units from monkey A and 640 units from monkey F (including single neurons and multiunits). To facilitate comparison with previous studies analyzing the same dataset^19,21, we used identical initial preprocessing of the neural data (using the publicly available code at https://www.ini.uzh.ch/en/research/groups/mante/data.html). Because stimulus coherence levels varied across monkeys and days, to equate performance in the motion and color contexts, we replaced the coherences on each trial with their average values for each stimulus difficulty (average motion coherences: 0.05, 0.15 and 0.50 in monkey A and 0.07, 0.19 and 0.54 in monkey F; average color coherences: 0.06, 0.18 and 0.50 in monkey A and 0.12, 0.30 and 0.75 in monkey F). Monkeys reported their choice with a saccade to one of two targets presented shortly after fixation for the entire trial duration. The monkeys were rewarded for saccades to the target location corresponding to the motion direction in the motion context and to the target whose color matched the dominant color of the dots in the color context. The stimulus coherence was assigned a sign (positive or negative) according to the target location indicated by the stimulus. Because the color of the targets was randomized between locations on each trial, the sign of the color coherence reflects both the dominant color of the dots and the location of the red and green targets. The task therefore had 72 unique stimulus conditions defined by all combinations of six motion coherence levels, six color coherence levels and two contexts.

We fitted the latent circuit model to trial-averaged neural responses on correct trials. In our analyses, we included neurons that had at least four correct trials for each of the 72 unique trial conditions, which produced 483 neurons for monkey A and 323 neurons for monkey F. For cross-validation, we then split the trials into two equal disjoint sets and computed the trial-averaged response of each neuron for each trial condition within each set. We used the training set for model fitting and the validation set for visualizing projections of neural responses and quantifying the fit quality. For the analysis of error trials (Extended Data Figs. 7d and 8d), we considered the set of error trial conditions for which all analyzed neurons had at least one trial, which resulted in 16 conditions for monkey A and 26 conditions for monkey F. We then computed the trial-averaged response of each neuron for each trial condition within this set of error trials.

We analyzed neural responses during the presentation of the random dots stimulus because the available data consisted of neural responses starting at 100 ms after stimulus onset for a duration of 750 ms. For each trial, we computed time-varying firing rates by counting spikes in a 50-ms sliding square window (50-ms steps). The first window was centered at 100 ms after the onset of the stimulus, and the last window was centered at 100 ms after stimulus offset. Within the training and test sets, we z scored and smoothed (Gaussian kernel, σ = 40 ms) the response of each unit. Following previous studies¹⁹, from activity of each unit we subtracted a condition-independent term corresponding to the mean response at each time across trial conditions. To construct population responses, we combined the single-neuron responses for each trial condition. This resulted in 72 neural trajectories for each combination of context, motion coherence and color coherence. Last, to denoise these trajectories, we projected them onto the principal components explaining 50% of their total variance (corresponding to the first 40 and 31 principal components for monkeys A and F, respectively).

We fitted latent circuit models to the PFC data following similar procedures as for RNNs. For each of the 72 conditions, we constructed input to the latent circuit from the context, motion and color coherence corresponding to that condition. In the experimental task, the stimulus is presented 650 ms after the context cue for a duration of 750 ms. Neural recordings correspond to 100 ms after stimulus onset to 100 ms after stimulus offset. We thus modeled the task with 150 time steps (10 ms in length) extending from the initial presentation of the contextual cue to 100 ms after stimulus offset. Contextual input was given to the model from t = 0 to t = 1,500 ms. Stimulus input was given to the model from t = 750 ms to t = 1,500 ms. We constructed two target outputs (z₁ and z₂) for each trial such that on trials for which the monkey chose the right target, the first target output was high (z₁ = 1.2) and the second target output was low (z₂ = 0.2) and vice versa for the left choice trials. We penalized errors between target and model outputs only in the last 250 ms of each trial. Responses of the latent circuit were fitted to the PFC data only on the last 15 time steps of each trajectory for which there were available PFC data. The latent circuit model was fitted with hyperparameter α = 0.2. The latent circuit model was fitted with a recurrent noise term of magnitude σ_rec = 0.15, which was added to each unit at each time step (Eq. (24)). Because neural responses were centered, we additionally fit an intercept term b so that the resulting model for PFC data was

$$y=Qx+b,$$

(28)

$$tau dot{x}=-x+{[{w}_{{rm{rec}}}x+{w}_{{rm{in}}}u]}_{+},,{rm{and}}$$

(29)

$$z={w}_{{rm{out}}}x.$$

(30)

Because of high dimensionality of PFC responses (40 and 31 principal components are required to account for ~50% of the total variance in PFC activity for monkeys A and F, respectively), we find a notable tradeoff between the task fit and data fit when fitting the low-dimensional latent circuit model to the PFC data. To control this tradeoff, we used a modified loss function when fitting PFC data,

$$L=sum _{k}sum _{t}lambda frac{parallel y-Q{Q}^{T}y{parallel }_{2}}{parallel y{parallel }_{2}}+frac{parallel {Q}^{T}(,y-b)-x{parallel }_{2}}{parallel {Q}^{T}y{parallel }_{2}}+frac{parallel z-{w}_{{rm{out}}}x{parallel }_{2}}{parallel z{parallel }_{2}}$$

(31)

$$=lambda {r}_{Q}^{2}+{r}_{x}^{2}+{r}_{z}^{2},$$

(32)

designed to balance variance explained by the task-relevant subspace ({r}_{Q}^{2}), the fits between projected PFC responses and the latent circuit trajectories in this subspace ({r}_{x}^{2}) and the performance of the latent circuit on the task ({r}_{z}^{2}). The hyperparameter λ = 0.5 was chosen via a grid search over the range λ ∈ [0, 1.5]. We found that near this value, the metrics ({r}_{x}^{2}) and ({r}_{Q}^{2}) were maximized under the constraint that the latent model still performed well on the task (Extended Data Figs. 7a and 8a).

Statistics and reproducibility

We analyzed data from 200 RNN models trained with random initializations. Results were consistent across networks; therefore, we found this sample size to be sufficient for our study. No statistical method was used to predetermine sample size. For each of these networks, we trained 100 latent circuit models. This sample size was chosen so that the top ten latent models converged to a high fit accuracy. For PFC data, we fitted 200 latent circuit models to neural responses from each monkey. Neural recording data were previously described in Mante et al.¹⁹; no randomization or blinding was performed because there was only one experimental group. All recorded units that had at least four correct trials in each task condition were included in the analysis.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.