Projected runoff declines from plant physiological effects on precipitation

Main
Climate warming will alter runoff, generating hazards for people and ecosystems. Major uncertainties in the future of the hydrologic cycle under warming are thus one of the most important challenges in devising sound adaptation strategies1,2,3. A key contributor to hydrologic uncertainty is how plants will respond to higher atmospheric CO2 concentrations2,4,5,6 (CO2 physiological effects). More water-efficient carbon uptake by plants will probably alter leaf area and canopy conductance, which together affect water fluxes from the land to the atmosphere (evapotranspiration). Because those fluxes link the land and atmosphere, CO2 physiological effects may influence precipitation patterns7,8,9 and the surface water balance and runoff5.
A lineage of studies has used Earth system models (ESMs) to suggest that reduced transpiration from stomatal closure in response to high CO2 may leave more water on the land surface, thereby increasing runoff5,9,10,11,12,13,14,15,16,17,18. The expectation is that this plant-induced wetting will offset some of the drying caused by warmer air temperatures over land and modulate the impact of radiatively driven precipitation change. Yet the mechanism of this transpiration–runoff effect is poorly constrained in models, leading to substantial inter-model uncertainty in runoff. The physiological impacts of CO2 on hydrology are often inferred from idealized simulations using a single ESM12,15 or from multi-model means across an ESM ensemble16,18,19. However, each ESM represents the land surface, the atmosphere and their interactions somewhat differently, implying a diversity of potential mechanisms underpinning plant-driven hydrologic responses. Research on other climate drivers and impacts has widely leveraged the diversity of ESM responses to rising CO2 as a measure of uncertainty in ESM-projected future climates20,21,22 and as a window into culpable mechanisms driving that uncertainty23,24,25. Yet, such diversity is underexplored for CO2 physiological effects on future runoff14,17,26, with two critical and related implications for hydroclimate science, impacts and adaptation.
Firstly, underexplored variation across ESM simulations translates to poorly constrained uncertainty over potential runoff increases from CO2 physiological effects alone. For instance, large but opposing signals in key hydrologic variables across the ensemble may cancel in the ensemble mean, yielding an inaccurate impression of low change and concealing model divergence. Secondly, to the extent that such uncertainties arise from differing representations of key processes, understanding the drivers of ESM spread requires a mechanistic examination of individual model realizations. Crucially, an ensemble mean of ESM runs is not itself a physically consistent run: optimizations and budget closures that are enforced within individual ESMs may be obscured in analyses of ensemble means. Perhaps as a result, the mechanisms by which CO2-induced transpiration responses translate into runoff changes remains obscure5,12,14.
In this study, we clarify the mechanism and robustness of runoff change in a set of 12 ESM simulations that isolate plant physiological and biogeochemical responses to increased CO2 (Supplementary Table 1). We centre our analysis on the biogeochemical (BGC) simulations from the Coupled Climate–Carbon Cycle Model Intercomparison Project or C4MIP27. These BGC runs capture carbon cycle response to increasing atmospheric CO2 (1% per year for 140 years, to a quadrupling over pre-industrial levels) while keeping CO2 constant in model’s radiation code. We focus on these BGC experiments (rather than the other experimental treatments within the C4MIP, namely the fully coupled control and radiative runs) for two reasons. First, the BGC runs are useful to diagnose hydrologic uncertainties specifically owing to plant responses to enhanced CO2 without the confounding effects of warming. Second, the BGC experiments are increasingly used to interpret fully coupled ESM projections of future water availability and to inform long-term water management decisions. As such, diagnosing the model diversity in ecohydrologic responses to increased CO2 absent warming is essential to benchmark the models, evaluate their robustness and identify paths to model improvement. We calculate changes in 30-year mean runoff, precipitation and transpiration (aggregated to water-year totals and denoted by Δ) as the difference between the last and first 30 years of each simulation. We apply a modest definition of ensemble agreement for a change to qualify as ‘robust’ within the BGC ensemble: at least two-thirds of the models must agree on the sign of change that is at least ¼ a standard deviation in magnitude.
Through an analytic approach that diagnoses model differences in these BGC simulations, we present three key findings. First, we show that plant-driven runoff increases are robust only over a small portion of global land area, despite robust and widespread declines in transpiration. An examination of surface water balance changes reveals that within individual BGC models, increases and decreases in runoff are equally likely given reduced transpiration, with precipitation changes largely determining the direction of runoff change. Secondly, we use pre-industrial control simulations to show that these BGC-only precipitation changes reflect a mix of internal climate variability and significant forced responses to CO2 physiological effects. Lastly, using a typology that interprets CO2 physiological effects on runoff in the context of precipitation changes, we demonstrate that plant-induced reductions in runoff are at least as widespread as plant-induced enhancements, even absent the runoff impacts from climate warming. Together, these results imply a heterogenous and uncertain role for plant stomatal responses to CO2 in shaping surface hydrology and future water availability.
ESMs disagree on transpiration-driven runoff increases
The spatial pattern of ensemble agreement across hydrologic variables indicates large uncertainty in the consequences of transpiration changes for runoff under CO2 physiological effects (Fig. 1 and Extended Data Figs. 1–3). Robust transpiration declines are ubiquitous in the BGC runs, with ensemble agreement on reductions over 57% of land area (Fig. 1a and Extended Data Fig. 1). Model consensus on declining transpiration indicates that the effect of stricter stomatal regulation outweighs that of increased leaf areas resulting from CO2 fertilization, which is robust across the vast majority of land areas (Extended Data Fig. 2). However, the BGC ensemble disagrees on how these robust physiologically driven transpiration declines translate to plant-driven runoff changes. For example, the simulations disagree on the sign of runoff change over a large majority of land area (83%), including across 90% of the area with ensemble agreement on transpiration decreases (Fig. 1a,c,d). In fact, the ensemble agrees on runoff increases over only 5% of global land area, comprising minorities of the Andes (7%), East Asia (16%), Southeast Asia (17%) and central and western Africa (34%) regions (Fig. 1c,d and Extended Data Fig. 4). Whereas these regions have been identified previously as hotspots of runoff enhancement under CO2 physiological effects in earlier model generations12,15,18,19, our analysis reveals prevailing ensemble disagreement on these runoff responses, both globally and locally within these hotspot regions.

a–c, Agreement on sign of changes in water-year total transpiration (ΔTr) (a), precipitation (ΔP) (b) and runoff (ΔQ) (c) under a quadrupling of atmospheric CO2 concentration under the BGC experiments. These experiments isolate the hydrologic impacts of vegetation responses to CO2 and do not include the effects of warming. Across the ensemble of 12 ESMs, grid cells are classified on the criterion that at least 2/3 of models agree on the sign change. The threshold for change is set at one-quarter of a standard deviation of interannual variability under recent atmospheric CO2 concentrations (first 30 years of the simulation; Methods). Numerical annotations indicate the fraction of non-barren land area falling into ensemble agreement classifications (agreement on a decrease in brown; on an increasing in blue; and total agreement including low change in black). Barren areas are defined as places where water-year mean leaf area index is <0.1 and are shaded in pale grey. d, Ensemble agreement among ΔTr, ΔP and ΔQ, expressed as a fraction of non-barren land area, globally and within selected illustrative regions.
Precipitation changes obviously shape the extent to which transpiration declines translate into runoff enhancement11,19,28. However, we find widespread ensemble disagreement over both precipitation changes and their impact on runoff under CO2 physiological effects. The BGC simulations disagree on the sign of precipitation change over 70% of land areas (Fig. 1b and Extended Data Fig. 3), suggesting potentially meaningful but highly uncertain precipitation responses. Notably, the ensemble also disagrees on the location of runoff changes even where models agree on declining precipitation. For example, whereas the ensemble agrees on precipitation declines over 57% of the Amazon region (Fig. 1b,d), it only agrees on attendant runoff declines over 8% and otherwise disagrees on the direction of runoff change (Fig. 1c,d). We find a similar pattern of divergent runoff responses despite robust precipitation and transpiration declines in much of northwestern Eurasia. Further, the ensemble does not agree on compensating reductions in transpiration and precipitation, evidenced by minimal agreement on low runoff change. In other regions, such as northeastern Eurasia, central North America and southeastern Africa, models agree on declining transpiration but disagree on both precipitation and runoff responses (Fig. 1). There is thus substantial inter-model uncertainty in the impact of transpiration changes on precipitation and runoff under CO2 physiological effects.
In contrast to BGC runs, there is far more ensemble agreement across hydrologic variables for the radiative (RAD) and fully coupled control (CTL) counterparts in the C4MIP (Extended Data Fig. 5). For RAD, rising CO2 is presented only to the atmosphere and not the land surface, whereas for CTL, CO2 increases impact all model components. In both the RAD and CTL experiments, the ensemble agrees on the sign of hydrologic changes over 47–66% of land area (across variables and experiments). The sign of agreement is generally collocated across the three variables for CTL and RAD (increases in northern high latitudes, most of Asia, central Africa and southeast South America; decreases in the Mediterranean, southern Africa and the Amazon). These results point to an ensemble consensus that radiatively driven precipitation changes largely determine the sign of runoff changes under fully coupled warming29. They further highlight the comparatively large uncertainty in the hydrologic impacts of CO2 physiological effects in the BGC.
In contrast to earlier work that has emphasized the runoff increases from stomatal closure under high CO2, our results suggest there is little ensemble consensus on whether and where stomatal-driven transpiration declines translate into runoff or precipitation changes under rising CO2. Runoff is robustly enhanced by reduced transpiration in small portions of some tropical regions12,15,18,19, representing a small fraction of global land area. However, the overall global picture is of uncertain hydrologic responses to CO2-driven transpiration changes across models. The pattern of discordant transpiration, precipitation and runoff changes across the BGC ensemble implies that inter-model runoff uncertainty arises not from the transpiration response itself but rather from the mechanism by which it does (or does not) translate into runoff changes. Because long-term runoff changes approximate the difference between precipitation and evapotranspiration (ET) changes30, two causal paths may explain why reduced transpiration does not generally correspond to increased runoff. First, decreases in precipitation may compensate transpiration-driven increases in runoff7,8,31. Second, increases in other components of ET (notably, bare-soil, Es, and leaf evaporation, EL) may compensate decreases in transpiration, breaking the transpiration–runoff response32,33.
Precipitation drives CO2 physiological runoff changes
To examine which of these causal paths explains the weak ensemble relationship between transpiration declines and runoff responses from CO2 physiological effects, we develop a typology to classify grid cells as consistent or inconsistent with a transpiration–decrease/runoff–increase hypothesis in the BGC experiments (Fig. 2 and Extended Data Fig. 6). For each ESM, we find that increases and decreases in runoff are equally widespread across land areas with declining transpiration (ensemble averages of 39 versus 36%, dark blue and light brown bars; Fig. 2a). However, the ensemble largely disagrees on where these changes occur, with a robust shared response of collocated transpiration decreases and runoff increases over only 3% of land area, predominantly tropical (Fig. 2b and Extended Data Fig. 6). Runoff increases and decreases are also roughly equally widespread over regions where transpiration increases (ensemble mean 24%, 8–39% across ESMs, light blue and dark brown bars; Fig. 2a). Yet there is virtually no ensemble agreement on the location of these joint changes (Fig. 2b).

a, Fraction of global land area within each ESM with concordant and discordant increases and decreases in water-year runoff (Q) and transpiration (Tr) under a quadrupling of CO2 in the physiology-only (BGC) experiment. b, Ensemble agreement on the joint sign of transpiration and runoff responses (≥ 2/3 model agreement on changes ≥σ/4 relative to recent CO2). Annotations denote fractions of land area with ensemble agreement on joint changes. Areas with ensemble disagreement are shown in dark grey; areas with agreement on low Q change (< σ/4 are shown in pale yellow); barren areas are masked in light grey. c–f, Water budget changes for each ESM averaged across pooled non-barren model grid cells falling within each class (points). The horizontal dash indicates the ensemble mean. P, EL and ES denote precipitation, leaf evaporation and soil evaporation. g, Spatial pattern correlation between water-year ΔQ and changes in four water balance terms for each ESM (points). All changes are expressed as a fraction of recent CO2 mean total P to minimize the influence of baseline P gradients on the correlations. The dash indicates the ensemble mean of ESM-specific correlations. Pattern correlations for ΔTr (negative for all models) are presented with reversed sign to ease visualization.
As such, the BGC ensemble disagrees on the location of joint transpiration and runoff changes over 87% of land areas. Yet individual models consistently simulate runoff increases and decreases with about equal likelihood, irrespective of the sign of transpiration changes. We find that these runoff increases or decreases from CO2 physiological effects are largely explained by precipitation changes within each simulation (Fig. 2c–f). Regardless of the sign or magnitude of the transpiration response in the BGC simulations, the principal difference in water budget between runoff-increasing and runoff-decreasing areas is the sign of the precipitation change (Fig. 2c,d), even if inter-model spread in water budget changes is large. For regions in which transpiration declines, the magnitude of the decline is similar across both runoff increasing and decreasing regions, implying that the magnitude of transpiration declines does not determine either the magnitude or sign of runoff change. In contrast, precipitation increases (−5 to 85 mm yr−1, ensemble range) where runoff increases, and where precipitation decreases (−10 to −100 mm yr−1), so too does runoff. The sign of runoff change similarly conforms with the sign of precipitation change where transpiration increases, but with smaller magnitudes of water budget changes (Fig. 2e,f).
Spatial pattern correlations among hydrologic variables further clarify the relative importance of transpiration versus precipitation changes for determining runoff responses in the BGC experiment. The strength of these correlations indicates the spatial dependence of runoff responses on changes in precipitation minus evapotranspiration (ΔP − ΔET), precipitation minus transpiration (ΔP − ΔTr), precipitation and transpiration. ΔP − ΔET, a typical diagnostic of climatological runoff response, is quite strongly correlated with prognostic runoff change (ensemble mean correlation of 0.82), reflecting surface water budget closure within the models30 (dark blue points; Fig. 2g). Pattern correlation between runoff change and ΔP − ΔTr is nearly as strong (ensemble mean of 0.66, light blue points; Fig. 2g), reflecting the dominant contribution of transpiration to total evapotranspiration in non-barren areas34,35,36. However, whereas runoff change depends nearly as strongly on precipitation change as ΔP − ΔTr (ensemble mean of 0.53), transpiration change itself is very weakly correlated with runoff change (ensemble mean of 0.23, purple and green points; Fig. 2g). This tendency is reflected across the BGC ensemble: the runoff–precipitation correlation exceeds 0.5 in nine of 12 models whereas the runoff–transpiration correlation falls below 0.5 in all 12 models. The spatial variation in runoff change explained by precipitation change exceeds that explained by transpiration change by over a factor of 5 (ensemble mean R2 of 0.28 vs 0.05). As such, even in an ensemble where only plants are responsive to CO2, there is little consilience on whether and where stomatal closure should lead to runoff change.
Moreover, we do not find that combined changes in bare-soil and leaf evaporation consistently explain the correspondence, or lack thereof, between transpiration and runoff changes in the BGC ensemble. Where transpiration decreases, evaporation change is insufficient to alter the sign of evapotranspiration change, even if it somewhat limits transpiration-driven evapotranspiration declines in some models (Fig. 2c,d and Extended Data Fig. 7). Models are generally split on the sign of evaporation change (Fig. 2c–f), reflecting considerable uncertainty in the responses of these evapotranspiration components to the physiological effects of high CO232,33, which may partly stem from uncertain soil moisture responses (Extended Data Fig. 8a). Nevertheless, the ultimate influence of evaporation changes on total evapotranspiration is moderate. We also do not find evidence for a causal effect of soil moisture changes on runoff. Soil moisture tends to respond in the same direction as runoff in response to transpiration changes (Extended Data Fig. 8a). However, this pattern probably reflects the fact that soil moisture and runoff responses simply follow the precipitation response in the BGC, without a clear path for direct enhancement of land hydrology from stomatal effects (Supplementary Information and Extended Data Fig. 8b).
Instead, these results indicate a leading role for precipitation in generating runoff responses, even in simulations where plants alone respond to CO2 change. Transpiration-driven changes in total evapotranspiration do modulate the degree to which precipitation change translates into runoff change. For example, where transpiration declines in the BGC experiments, runoff increases more than precipitation increases, or decreases by less than precipitation decreases5 (Fig. 2c,d). However, whereas transpiration reductions shift the water balance relatively towards runoff, absolute runoff declines are equally likely given a reduction in transpiration. The sign of runoff responses within individual models is largely explained by precipitation changes, but the sign and location of these precipitation responses are highly uncertain across models (Fig. 1b). Together, these results tell us that runoff uncertainties arise because of precipitation uncertainties under CO2 physiological effects. In other words, the discrepancy between transpiration and runoff changes (Fig. 1a,c) is explained by uncertainty in precipitation itself (Fig. 1b).
Interpreting vegetation-forced precipitation changes
The role of precipitation in determining runoff responses to CO2 physiological effects in individual models raises the question of whether precipitation change is itself an atmospheric signal forced by vegetation responses to CO27,8,9 or simply noise arising from simulated internal variability. Addressing this question is central to interpreting the importance of stomatal effects for future water availability. Where precipitation change in the BGC simulations reflects internal variability, precipitation-induced runoff changes would reflect noise rather than a meaningful signal of vegetation responses to CO2. But where precipitation change is itself a significant forced signal, precipitation-induced runoff change actually reflects influences of plant physiological changes on precipitation via land–atmosphere feedbacks. To answer this question, we estimate the significance of precipitation change in each BGC simulation relative to the distribution of each model’s 30-year mean precipitation anomalies in the pre-industrial control experiment (Methods). We interpret BGC precipitation responses as ‘plant forced’ where they are greater than the 95th percentile or less than the 5th percentile of the distribution of unforced 30-year precipitation variability, as sampled from each model’s pre-industrial simulations. This approach is akin to a two-tailed non-parametric significance test at α = 0.1.
The precipitation changes that strongly influence runoff under CO2 physiological effects (Fig. 2) reflect a highly uncertain mix of plant-forced responses and internal variability (Fig. 3). Across the BGC ensemble, precipitation change is significantly different from pre-industrial variability over 31–57% of land area (two-tailed test, p < 0.1, ensemble mean of 44%), with precipitation decreases about three times more extensive than increases (Fig. 3a and Extended Data Fig. 8). Yet, there remains little agreement across models on where these plant-forced precipitation changes should occur. We find no agreement on the location of plant-forced precipitation increases (Fig. 3b), despite their occurrence over 14% of land on average within any given model (Fig. 3a). Similarly, the ensemble simulates forced precipitation decreases over 30% of land on average but agrees on their location for only 7% of land areas, mainly in the Amazon and northwestern Eurasia (Fig. 3b and Fig. 1b). Plant-forced precipitation declines have been highlighted previously as a consistent response in the Amazon8,37. However, we find that model agreement on significant precipitation changes, but disagreement on sign, is even more widespread, highlighting model uncertainty in forced precipitation responses. Such BGC ensemble consilience on plant-forced precipitation change, but uncertain direction of change, occurs over 11% of land area, mainly in northern Eurasia, central Africa and South America (pink areas, Fig. 3b). Despite relatively widespread forced precipitation signals, precipitation change is not significant over a majority of land area in 10 out of 12 ESMs (Fig. 3a), with ensemble agreement on no plant-forced changes over 38% of land area (yellow areas, Fig. 3b). Nevertheless, most terrestrial regions exhibit disagreement on the significance and/or sign of precipitation change (55%, Fig. 3b and Extended Data Fig. 8).

a, Fraction of global land area within each ESM with significant decreases (orange), significant increases (blue) or insignificant change (yellow) in precipitation (ΔP) under CO2 physiological effects (BGC experiments). The vertical dotted line shows the ensemble mean fraction of land area with a significant precipitation response. Significance is defined as two-tailed p-value of precipitation change (pΔP) of less than 0.1 relative to the bootstrapped distribution of 30-year mean P variability in each ESM’s pre-industrial control simulation. b, Ensemble agreement on P-change significance (≥ 2/3 model agreement). Annotations denote fractions of land area with ensemble agreement on significance or lack thereof. Areas with ensemble agreement on significance but disagreement on sign are shown in pink, whereas areas with ensemble disagreement on significance are shown in dark grey. Barren areas are masked in light grey.
Although transpiration–runoff pattern correlations are generally weak in each BGC run (Fig. 2g), we find that they are strongly negatively related to transpiration–precipitation pattern correlations (Fig. 4). The transpiration–precipitation relationships are also generally weak (ensemble mean r = 0.27, vertical dotted line; Fig. 4a), probably because forced precipitation changes cover a minority of land area within most models and may not be collocated with (but occur downwind of) transpiration changes (Fig. 3). Nevertheless, the strong link between the two (r = −0.82, slope = −1.0, p = 0.001; Fig. 4a) indicates that the scope for direct transpiration influences on runoff via water savings is mechanistically limited in models with stronger precipitation responses to transpiration change. Further, runoff–precipitation dependence does not scale with either transpiration–runoff (p = 0.55; Fig. 4a, axis annotations) or transpiration–precipitation dependence (p = 0.41; Fig. 4a, axis annotations).

a, Spatial pattern correlation between water-year changes in transpiration (ΔTr) and precipitation (ΔP, x axis), ΔTr and runoff (ΔQ, y axis) and ΔQ and ΔP (colour axis), for each ESM in the BGC experiments (points). Multiple regression coefficients (slopes β and p values) are annotated for each axis, and the correlation (r) is indicated for the sole significant relationship (ΔTr,ΔQ versus ΔTr,ΔP, annotations at upper right). All hydrologic changes are expressed as a fraction of recent CO2 mean total P to minimize the influence of baseline P gradients on the correlations. The dashed grey lines along each axis and the colour bar indicate the ensemble means of ESM-specific correlations. Pattern correlations for ΔTr (negative for each model) are presented with reversed sign to ease visualization. b, Schematic illustrating the effects presented in a. Thicknesses of arrows and font weight of relationship sign annotations are proportional to correlation strength. The dotted arrow denotes ensemble scaling of relationships across the BGC ensemble. c–e, Dependence of pattern correlations between ΔTr and ΔP, ΔTr and ΔQ and ΔQ and ΔP on global mean transpiration fraction of evapotranspiration (T/ET) across the 12 ESMs, with correlation coefficients and two-sided p values of the ensuing relationships annotated at upper right.
Thus, under CO2 physiological effects alone, precipitation most strongly determines runoff responses independent of any direct path from transpiration declines (Fig. 4a,b). Meanwhile, transpiration only meaningfully affects runoff in models with especially weak atmospheric responses to transpiration (that is, models with low transpiration–precipitation pattern correlations; Fig. 4a,b). Moreover, we do not find that the transpiration fraction of total evapotranspiration (Tr/ET) explains the strength of relationships among transpiration, runoff and precipitation across the ensemble (Fig. 4c–e). This Tr/ET ratio has been previously suggested as a key control on the potential impact of vegetation changes on the water cycle in ESMs17. The model spread in global mean leaf area growth due to CO2 fertilization also does not explain these relationships, nor does it explain global mean transpiration changes or the fraction of global land area with declining transpiration (Extended Data Fig. 2). Instead, our results implicate model differences in precipitation sensitivity to CO2-driven transpiration changes (that is, an atmospheric path rather than a direct land-surface path) as the major source of model disagreement over CO2 physiological effects on runoff. That said, we do not find more similar hydrologic responses among models that share related general circulation models (Supplementary Information and Extended Data Fig. 10). This may be the case because common responses among more closely related models are diluted by internal precipitation variability or because land-surface schemes are very sensitive to model parameter choices38 (Fig. 3).
Given their strong dependence on precipitation changes, ESM runoff responses under CO2 physiological effects are most accurately interpreted relative to precipitation changes and their significance. To this end, we present a typology of six precipitation–runoff effects that characterizes the diverse hydrologic consequences of plant CO2 responses. For plant stomatal responses to unequivocally boost runoff in any one BGC experiment, runoff must increase more than precipitation change times the low-CO2 (or baseline) runoff efficiency (ΔP*; Fig. 5, dark blue bars; Methods). This case implies that plant responses to CO2 increase runoff efficiency, either by enhancing precipitation via land–atmosphere feedbacks or offsetting the drying from stochastic precipitation decreases. Where runoff decreases, but less so than ΔP*, plants partly limit precipitation-driven drying. This case qualifies as relative plant-driven wetting only where the precipitation-driven drying is purely stochastic (Fig. 5, light blue bars). Where precipitation declines are plant forced (Fig. 3), plant responses to higher CO2 instead diminish runoff: enhanced stomatal regulation reduces precipitation and only partly alleviates the ensuing runoff impacts (Fig. 5, hatched tan bars). Where runoff increases but less so than ΔP*, plants limit precipitation-driven wetting (Fig. 5, dark brown bars) and where runoff decreases but more so than ΔP*, plants amplify forced or unforced precipitation-driven drying (Fig. 5, hatched and unhatched light brown bars).

Fraction of global land area within each ESM falling into six cases of joint precipitation (P*) and runoff (Q) change under CO2 physiological forcing alone. P* is defined as precipitation change times the low-CO2 (baseline) runoff efficiency (Methods). Inequality notation for the six cases is qualitative: for instance, Q ↓ > P*↓ denotes runoff decreasing but more so (more negative) than P*. Dark brown outlines mark cases where vegetation physiological responses to CO2 reduce Q, whereas blue outlines mark the opposite. Grey hatching indicates significant forced P declines (one-tailed pΔP < 0.1 relative to the bootstrapped distribution of 30-year mean P variability in each ESM’s pre-industrial control simulation). The vertical brown line indicates ensemble mean of land area falling into cases where plants responses to CO2 reduce runoff.
We find that plant-induced runoff drying in response to CO2 physiological effects (brown outlined bars, Fig. 5) is as common across the BGC experiments as plant-induced wetting (blue outlined bars), covering 36–59% of land area (ensemble mean of 50%) and representing the dominant response in five out of 12 models (Fig. 5). Further, about half of this plant-induced runoff drying is attributable to forced precipitation declines (10–37% of land area, ensemble mean of 27%; hatched bars, Fig. 5). Thus, when runoff responses are accurately contextualized with their associated precipitation changes in the BGC, ESMs exhibit little preference for plant-induced wetting under CO2 physiological effects alone.
Conclusions and implications
Through an analytic approach that assesses model-by-model responses, we show that ESMs disagree on how vegetation physiological responses to higher CO2 affect runoff across most of the world (Figs. 1c,d and 2a,b). Within individual ESMs where physiological responses to CO2 are isolated, precipitation changes influence runoff more strongly than transpiration changes (Fig. 2g). These precipitation changes are often induced by vegetation responses themselves, but there is little consensus in the BGC ensemble on where these plant-forced precipitation changes occur (Figs. 1b and 3b). Our core conclusion is that widespread model disagreement on plant-induced runoff responses is largely attributable to uncertain precipitation responses in the models (Figs. 2c–g, 3 and 4). These findings implicate atmospheric responses to stomatal limitation, rather than direct land-surface response of precipitation partitioning, as the key mechanistic driver of uncertain vegetation effects on future freshwater availability in ESMs.
Runoff effects of plant adjustments to high CO2 as reflected in the BGC experiments are often characterized as having the potential to increase runoff or soil moisture efficiency, thereby offsetting (or reducing) the impacts of radiatively driven drying5,12,13 (Extended Data Fig. 5). However, by properly contextualizing plant-driven runoff changes relative to their associated precipitation changes, we reveal that vegetation responses to CO2 are as likely to reduce runoff as they are to increase it globally, absent any impacts from radiative warming (Fig. 5). Plant-driven runoff-reducing responses include both direct impingement from amplified water consumption (Figs. 2a and 5) and impacts of transpiration changes on the atmosphere that reduce runoff via precipitation (Fig. 3). Our findings illustrate that whether CO2 physiological effects offset radiatively driven drying depends far more on how stomatal responses shape precipitation than on direct surface water sparing. In sum, we find limited evidence for generalizable plant-induced runoff boosts relatively or absolutely.
The fully coupled ESM simulations that inform climate policy and serve as inputs to regional hydrologic and impacts models for stakeholders combine uncertainties related to CO2 radiative and physiological effects. Whereas real-world climate change involves both CO2 effects simultaneously, uncertainties related to each effect have different underlying causes and are most directly understood and reduced individually. A key limitation of our approach based on the BGC experiments is that we assess physiological effects near present-day temperatures when, in reality, they will occur along with warming (Extended Data Figs. 5 and 8d). However, the strength of our approach lies in using the idealized BGC experiments to diagnose hydrologic uncertainty due to CO2 physiological effects. In doing so, our results suggest that the CO2 radiative and physiological uncertainties share a key commonality in that they hinge on the atmospheric drivers of precipitation.
Global vegetation31 and its simulated response to higher CO27,9,19 have been shown to affect global precipitation patterns through moisture recycling39 and surface influences on atmospheric circulation8. In our study, Amazonia and northwestern Eurasia stand out as two regions with strong ensemble consensus on transpiration, precipitation and runoff declines from CO2 physiological effects. In the Amazon, these changes have been attributed to altered land–atmosphere interactions that alter the broader circulation. Reduced transpiration and ensuing increases in sensible heating dry and deepen the boundary layer in some ESMs, suppressing convective recycling of precipitation across the basin37,40. Our results confirm the robustness of these Amazonian hydrologic responses but also find similarly robust drying in northwestern Eurasia (Figs. 1–3), the mechanism of which requires further investigation. Beyond these regions, we find that CO2 physiological impacts on precipitation may be widespread, affecting up to half of global land within models but with little consilience on where these responses should occur across models (Fig. 3 and Extended Data Fig. 9)26. This lack of consilience may be due to differing magnitudes and positions of surface impacts on tropospheric humidity and convection in regions such as Southeast Asia and central Africa8 or differing interactions with regional circulation41.
In contrast to our findings, many studies into CO2 physiological effects on continental hydrology conclude that precipitation effects on runoff are minor compared to transpiration effects12,16,17,18. We show that within individual ESMs participating in BGC, precipitation change is on the order of runoff change (Fig. 2c–f) and reflects a mixture of forced responses and internal variability (Fig. 3a). However, these changes are not apparent in the ensemble mean outside the tropics (Extended Data Fig. 1a,d) as there is little agreement on the location of forced precipitation responses across models and stochastic precipitation variability tends towards zero when averaged across ensemble members (Fig. 3b and Extended Data Fig. 1). Beyond demonstrating an important role for precipitation, our results highlight the limitations of ensemble means for assessing mechanisms of and confidence in the hydrologic impacts of global change.
Responses of vegetation physiology to increasing CO2 remain uncertain in ESMs14,18, reflected here by the widely distributed magnitude and pattern of transpiration change across the ensemble (Fig. 2c–f and Extended Data Fig. 1). These inter-model vegetation uncertainties have been linked to differing model representations of stomatal physiology and soil hydrology14,42. Furthermore, the extent to which ESM-simulated transpiration changes4,43,44,45,46 and their influence on runoff14,29,47,48 are consistent with observational, experimental and theoretical understanding remains unclear. Debates persist as to whether reductions in transpiration with higher CO2 are ecophysiologically plausible, and ESMs generally do not yet simulate key processes such as plant mortality and hydraulics. Despite deep uncertainty regarding the fate of transpiration under rising CO2, our analysis shows that precipitation responses are as large and widely distributed as transpiration responses but explain the pattern of runoff change within ESMs five times more strongly (Fig. 2g)49.
Together, these findings suggest that the uncertainty in runoff responses to CO2 physiological effects may depend importantly on model differences in processes that govern atmospheric responses to land-surface changes. Culpable atmospheric processes may include boundary layer and convective parameterizations50,51, whose role in determining precipitation responses under CO2 physiological effects warrants deeper attention. Whereas we focus on collocated effects in this study, spatial moisture tracking methods may help shed light on the remote influence of transpiration on runoff changes as plants adjust to higher CO252. Further, we find a strong compensation between transpiration’s influence on precipitation versus runoff—in other words, transpiration plays a weaker role in runoff responses in models with stronger transpiration impacts on precipitation. This scaling further suggests that variability in atmospheric sensitivity to transpiration responses across the ensemble partly determines the ultimate runoff consequences of transpiration changes (Fig. 4a,b). Spatial differences in leaf area initialization, leaf area change and the magnitude of transpiration responses probably also influence the divergent precipitation responses to CO2 physiological effects across models (Extended Data Fig. 1).
Our results highlight the prevailing ensemble uncertainty and diverse nuanced cases of relative hydrologic changes under CO2 physiological effects, which further complicate adapting water management to climate change. Our analysis affirms the robustness of plant-induced runoff boosts only over small portions of southeastern China, the Maritime Continent, western central Africa and South America outside the Amazon. Although people in these regions endure insufficient or inequitable access to water, these generally humid tropical regions are not particularly at risk of long-term physical water scarcity53 (Fig. 1c–e). For most of the rest of the globe, including many hotspots of current and projected future water scarcity, the ensemble of ESMs provides little clarity on the impact of CO2 physiological forcing on runoff (Fig. 1c and Fig. 2b). In these regions, the assumption that plant responses to higher CO2 will robustly favour runoff enhancements could lead to maladaptive management of water resources. Instead, our results warrant a cautious approach to interpreting the role of vegetation physiology when planning for future hydrologic change.
Methods
Earth system model data
To assess ensemble robustness and the mechanism of runoff (Q) changes under CO2 physiological effects, we focus on the biogeochemistry experiment from the Coupled Climate–Carbon Cycle Model Intercomparison Project (C4MIP)27. These experiments were conducted as part of the Coupled Model Intercomparison Project, phase 6 (CMIP6)54. In this experiment (technical identifier ‘1pctCO2-bgc’, abbreviated here as BGC), rising atmospheric CO2 concentrations are only presented to the land-surface component of each Earth system model (ESM). Our main analysis presents only results from this BGC ensemble, but our focus on BGC is partly motivated by comparison to its radiative counterpart (‘1pctCO2-rad’, abbreviated here as RAD), in which the rise in CO2 is presented only to the atmospheric radiation component of each ESM, and the fully coupled ‘control’ simulation (‘1pctCO2’, abbreviated here as CTL, a standard CMIP6 diagnostic experiment), in which rising CO2 is presented to both the atmospheric and land-surface components. Although we note slight warming over land in BGC, we interpret this experiment as mainly reflective of physiological effects near present-day temperatures, as the warming is only around 0.25 °C (Extended Data Fig. 8d). In each simulation, atmospheric CO2 concentrations increase by 1% per year relative to pre-industrial levels for 140 simulation years, yielding a quadrupling of CO2. These simulations thus reflect transient rather than equilibrium responses to CO2. Whereas BGC is intended as a carbon cycle diagnostic experiment, simulated carbon cycle feedbacks do not affect atmospheric CO2 evolution, which is prescribed as a 1% per year increase. These C4MIP simulations were conducted as part of the Coupled Model Intercomparison Project, phase 6 (CMIP6).
We obtained ESM simulation data for transpiration (‘tran’, Tr), precipitation (‘pr’, P) and total surface and subsurface runoff (‘mrro’, Q) from the Earth System Grid Federation (https://esgf-node.llnl.gov/search/cmip6/) for the 12 out of 16 ESMs participating in C4MIP that had all required variables for all experiments (Supplementary Table 1). We also acquired leaf evaporation (‘evspsblveg’, EL), soil evaporation (‘evspsblsoi’, ES) and leaf area index (LAI) data for portions of the analysis. Finally, we downloaded the monthly precipitation variable from each model’s pre-industrial control simulation (‘piControl’, a standard CMIP6 control experiment). To aggregate the data to the water-year scale, which most accurately captures long-term water availability, we summed the monthly time-scale model outputs for each variable from October–September in the Northern Hemisphere and from July–June in the Southern Hemisphere (except for LAI, which was averaged).
Hydrologic changes and ensemble agreement
Changes in the hydrologic variables are calculated as the difference between the initial and final 30-year means of each simulation (denoted as Δ). These changes thus reflect long-term hydrologic responses to an approximate tripling of CO2 relative to late-twentieth-century levels (330 ppm). We calculate the changes at the original spatial resolution of each model before re-gridding them to a common 1.25° × 1° resolution (the native resolution of CESM2) using a conservative nearest-neighbour method. This method facilitates grid-cell-based comparisons across the ensemble while conserving the underlying data without necessitating parametric interpolation assumptions. For the remaining steps of the analysis, we omit non-vegetated areas by masking the gridded hydrologic change data for all grid cells where annual average LAI < 0.1 m2 m−2.
We apply a moderately stringent definition for ensemble agreement on changes in hydrologic variables, composed of two criteria. First, for a model grid cell to qualify either a positive or negative change in a given variable, the change must exceed at least one-quarter of a standard deviation ( ≥ σ/4) of interannual variability under recent atmospheric CO2 concentrations (that is, the first 30 years of the simulation). This tolerance about zero accounts for the fact that any grid cell with essentially no change for a given variable and model is unlikely to equal exactly zero and will thus misleadingly contribute to a false impression of ensemble agreement on change of a given sign. Second, for the ensemble to agree on changes for a given grid cell, at least 2/3 of models must agree on the sign of change (‘agree positive’ or ‘agree negative’). Changes that do not exceed the σ/4 level are classified as ‘low change’. The 2/3 model criterion is consistent with the lower threshold for an assessed outcome to be deemed ‘likely’ under the Intergovernmental Panel on Climate Change’s nomenclature55. For comparison, the equivalent threshold for ‘very likely’ is 90%, whereas outcomes falling between 1/3 and 2/3 are classified as ‘about as likely as not’. Because our definition minimally satisfies this standard for ‘likely’, we consider it to be moderately stringent and treat it as our foremost measure of ensemble agreement. We use the word ‘robust’ to refer to hydrologic changes where this two-part definition is satisfied.
The difference between precipitation and evapotranspiration, P-ET, and its change with forcing, Δ(P-ET), are widely used diagnostics of changes in water availability and runoff. However, Δ(P-ET) conflates changes in P and ET, both of which may respond to CO2 physiological effects. Further, precipitation and evapotranspiration changes may be bidirectionally causal and are represented by fundamentally different physical process code in ESMs. Because the goal of our analysis is to diagnose the origins of uncertainty in runoff changes as plants physiologically adjust to higher CO2, we generally avoid analysing P-ET changes and instead focus on the individual variables that the metric summarizes. This approach enables us to trace runoff response uncertainty differentially to transpiration, evaporation and precipitation changes and thereby shed light on culpable model mechanisms.
Mechanistic classifications and hydrologic change analysis
To investigate the mechanism of runoff responses in the BGC experiment, which reflects the response to CO2 physiological effects, we classify model grid cells into four cases of joint change in transpiration and runoff: concordant increases (transpiration and runoff increasing), concordant decreases (transpiration and runoff decreasing) or discordant change (transpiration increasing and runoff decreasing or transpiration decreasing and runoff increasing). Due to prevailing uncertainty in runoff projections, this classification serves as a diagnostic tool, rather than definitive prognostic regional runoff projections. We present the total areal extent of each joint change case for each model, accounting for the dependence of grid cell area on latitude, as a measure of the prevalence of the cases across models. Our criteria for ensemble agreement on this classification is identical to those used above for univariate hydrologic changes (at least 2/3 of models agreeing on the sign of changes of at least σ/4 in magnitude).
For each joint transpiration–runoff change case, we compute water budget changes within each ESM in mm by averaging across model grid cells classified into each case. We track these mean changes for precipitation (water inputs), runoff (freshwater availability), transpiration (vegetation physiological responses), the sum of EL and ES (as an estimate of total evaporation) and ET (here approximated as the sum of Tr, EL and ES). We focus on these variables as changes in soil moisture are a negligible component of long-term hydrologic change compared to changes in P and ET components30. Bare-soil and leaf evaporation are examined in aggregate to assess their combined potential to meaningfully counteract changes in transpiration and thereby influence runoff relative to precipitation. We estimate the ensemble mean of these water budget changes for each variable as the average of individual model changes, rather than the change in the ensemble mean of each variable, to avoid potentially misleading signal dilution. We further compute pattern Pearson correlation coefficients of ΔQ with ΔP − ΔET, ΔP − ΔTr, ΔP and ΔTr (correlation among changes across pooled non-barren land grid cells, within each ESM). To control for the strong dependence of the magnitude of these changes on baseline climatological wetness, all changes are expressed as a fraction of 30-year mean total precipitation under recent CO2. Pattern correlations afford a continuous and globally pooled quantification of statistical dependence of ΔQ on ΔP and ΔTr, complementing the categorical water budget analysis.
For our analysis of the dependence of ΔQ,ΔP correlation on ΔTr,ΔP and ΔTr,ΔQ correlation across the ensemble, we compute pattern correlations between ΔTr and ΔP as above and assess dependence using multiple regression of the form:
where (varepsilon) represents regression residuals, subscript m denotes the ensemble dimension of ESMs and r is pattern Pearson correlation. We also separately present the bivariate Pearson correlation and slope coefficient between ({{{r}}left(Delta {rm{Tr}},Delta {{P}}right)}_{m}) and ({{{r}}left(Delta {rm{Tr}},Delta {{Q}}right)}_{m}).
Significance of precipitation changes
To determine whether ΔP is probably a significant forced response to CO2 physiological forcing, we employ the pre-industrial control simulations for each ESM. These simulations are run for at least 500 simulation years at pre-industrial CO2, and we first compute annual anomalies by subtracting the mean P across the full unforced simulation. We then estimate the distribution of 30-year mean unforced P anomalies by bootstrap resampling 500 random samples of 30-year periods with replacement for each model. Finally, we define ΔP as a significant forced increase if it exceeds the 25th largest positive anomaly and as significant forced decrease if it falls below the 25th largest negative anomaly. Otherwise, ΔP is not significant and unlikely to be a forced response. This approach is essentially a non-parametric test of difference in means at the two-tailed α = 0.1 and affords a more internally consistent and robust test than assessing ΔP relative to interannual variability in the initial 30 simulations years. We assess ensemble agreement on the significance of ΔP following the same 2/3 of ESMs criterion. Because we conduct this test on individual grid cells, significant forced signals may be driven by remote transpiration changes, and this remote influence may be assessed in future studies using spatial methods such as Lagrangian particle tracking52.
To characterize model runoff responses relative to precipitation change, we first derive a benchmark measure of rainfall change accounting for runoff efficiency (Q/P). Runoff change can be expressed as:
in which subscripts 0 denote means of the first 30 simulation years (‘baseline’) and (Delta) is defined as above. We interpret the first term on the right-hand side as expected rainfall impacts on runoff without changes in runoff efficiency (Q/P), which we denote ΔP*. The remaining right-hand-side terms include changes in runoff efficiency, reflecting the changes in surface precipitation partitioning in response to CO2 physiological effects. Thus, comparing runoff changes to ΔP* instead of absolute precipitation change accounts for variable baseline runoff efficiency in space and across models, while isolating the portion of runoff change attributable to precipitation change alone. For instance, if runoff increases more than ΔP* increases, then runoff is clearly enhanced by plant-induced changes in runoff efficiency.
Using ΔP* as a more meaningful benchmark than precipitation change, we classify model grid cells into six cases. For this classification, we also relax the σ/4 threshold for change to ease interpretation and ensure comparability of results across models. The first cases a priori involve runoff boosts by plant physiological responses to CO2: runoff increasing more than ΔP*, implying plant-induced runoff efficiency gains. Following the same logic, we also interpret runoff declining but less so than ΔP* as plants relatively boosting runoff, so long as the precipitation decline is unforced (second case). Otherwise, plants only partially limit the runoff effects of a precipitation decline induced by stomatal responses, which we classify as plants reducing runoff (third case). Plants also reduce runoff where runoff declines more than ΔP* declines, whether precipitation responses are forced (fourth case) or unforced (fifth case). Here we assess the significance of precipitation change as described above, except applying a one-tailed α = 0.1 to account for the directionality of the hypothesis within these cases. Finally, plants reduce runoff where runoff increases, but less so than ΔP*, suggesting decreased runoff efficiency as plants adjust to high CO2. We present the results of this classification as fractional areal extents per ESM, as in other aspects of our analysis.
Responses