Scale-dependent cloud enhancement from land restoration in West African drylands
Introduction
Similar to other drylands, the Sahel and West Sudanian savanna regions in West Africa are particularly vulnerable to the effects of land degradation and climate change. Although the main cause remains debated, high rainfall variability, water scarcity and anthropogenic pressures can drive further degradation in the region1. Ambitious restoration projects like the Great Green Wall initiative2 or the African Forest Landscape Restoration Initiative (AFR100) try to combat degradation, increase biodiversity or enhance carbon sequestration, while improving the livelihood of the local population3. Such restoration practices can include, for example, active tree planting, natural regeneration, farmer-managed natural regeneration and area protection4.
Because land restoration may cause changes in vegetation cover, it affects not only biogeochemical processes but also the biophysical properties of the Earth’s surface (e.g. albedo and surface roughness)5,6, especially in regions of strong land-atmosphere coupling such as West Africa7,8. As a result, restoration has the potential to alter the surface energy balance, land-atmosphere interactions and water availability9. Several projects have proposed to utilise these so-called biophysical climate effects to regulate the microclimate by decreasing temperature or enhancing soil moisture10,11,12,13,14 to improve human well-being15 or to provide adaptation benefits16. Yet, projects may alter properties such as turbulent fluxes and evapotranspiration through vegetation changes, which gives rise to the question of whether they can also change boundary layer properties and cloud development in the restored area or elsewhere. This is especially relevant in the context of land restoration in dryland regions, as increased cloud development may also affect precipitation and water availability. Combined with the uncertainty about the ecological and socioeconomic benefits of land restoration in dryland regions17,18 and the increased project implementation over the years19, this highlights the need to make accurate predictions on the expected changes in cloud development (as a first step towards precipitation) due to land restoration.
Predicting the net effect of changes in vegetation on cloud development in a certain area is not trivial due to the different mechanisms at play. Clouds can form when the boundary layer grows and the moisture in the atmosphere is lifted to the lifting condensation level. Vegetation affects this process by increasing atmospheric moisture through enhanced evapotranspiration, and by altering boundary layer growth through changes in albedo and available radiation, surface roughness and allocation of the available radiation to the latent and sensible heat flux. In addition, certain degrees of vegetation cover may increase water availability by increasing soil infiltration and reducing overland flow20,21,22. Previous research suggests that it is more likely that clouds are enhanced over forests when the sensible heat flux is larger than surrounding regions23, caused by a growing boundary layer. This mechanism has been observed in, for example, case studies in France24 and the United States25. In the Amazon, deforestation is suggested to increase shallow cloud formation, although this is likely to be driven by mesoscale circulation (the so-called forest breeze) rather than by increased convection26,27. This effect is especially strong with large vegetation heterogeneities, with increased precipitation on the non-forested side of sharp vegetation boundaries28. Unlike in the Amazon, where it is observed that extensive deforestation can again inhibit cloud formation29, studies in moist tropical West Africa have shown enhanced convective initiation over larger patches of deforestation caused by mesoscale circulation30. In addition, precipitation is enhanced over negative (dry) soil moisture anomalies in the Sahel31, especially on boundaries of dry and wet patches32.
Due to the different mechanisms proposed in previous research, it remains unclear how vegetation cover, and heterogeneities therein, affect cloud formation in dryland West Africa, despite these numerous studies on vegetation-cloud interactions. In addition, it is currently unknown how large a restoration project must be to affect cloud cover and precipitation, both in general and in West Africa33. Although recent developments in convection-permitting models now allow studying the land-atmosphere interactions at a higher resolution34,35, observational evidence for these interactions is needed to validate model results. These observations usually come from satellite data, providing several decades of cloud cover and vegetation data. Instruments such as the Moderate Resolution Imaging Spectroradiometer (MODIS) can be used for global vegetation-cloud interaction studies at a high spatial resolution23,36. However, the daily time scale of these higher-resolution datasets provides limited information on cloud development throughout the day.
An alternative, at least over Africa, is provided by the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) on board the Meteosat Second Generation (MSG) satellites37. Due to its geostationary orbit, MSG provides data at a 15-min temporal resolution, allowing for the analysis of vegetation-cloud relationships on diurnal, seasonal and multiyear time scales. As the 3 km spatial resolution of the conventional MSG cloud products is relatively coarse compared to the size of most restoration projects, we apply a data-driven cloud detection algorithm, originally developed for a case study in France24 and similar to ref. 38, to detect cloud cover from the High-Resolution Visible (HRV) broadband channel with a 1 km resolution. This methodology is applied to a case study region in West Africa of ~300 × 1300 km (10–13°N, 0–12°E), containing regions within Nigeria, Niger, Benin, Burkina Faso, Togo and Ghana (Supplementary Fig. 1). This area is selected to meet computational limitations of processing the 321,200 HRV images available between 2004 and 2024, as well as to fall within the shifting scan modes of HRV SEVIRI for most of the day (07:00–15:45 UTC). This high number of images allows the detection of vegetation-cloud relationships, even if land-atmosphere coupling appears to exist on only a limited number of days.
We compare the high-resolution (1 km) cloud cover fraction to (1) spatiotemporal patterns in vegetation greenness and (2) areas within the World Database on Protected Areas (WDPA)39. As a complete database of the extent of land restoration projects is lacking, we combine the analysis of vegetation greenness with area protection, to provide a proxy for the expected effects of land restoration. In addition, the size of the WDPA regions and two subregions with different spatial patterns in vegetation greenness are used to determine the scale-dependence of the vegetation-cloud relationships. As a last step, locations of convective initiations based on MSG cloud-top temperature data (following ref. 40) are used to study the relationship between land restoration and the initiation of deep convection. Hereby, we aim to determine to what extent land restoration can enhance cloud cover in West Africa, offering relevant insights into the biophysical benefits of land restoration in West Africa, as well as providing observational support for policymaking and planning of land restoration projects across dryland regions.
Results
Robustness of HRV cloud detection
The algorithm classifies each HRV image into either cloud or clear, hereafter referred to as the HRV cloud mask, and is validated against two other cloud products. The Vertical Feature Mask from the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) instrument41,42,43 uses active lidar to provide vertical transects of cloud types. Comparing these cloud types with the HRV cloud mask at the corresponding scanning lines shows a high similarity of the cloud mask to the location of opaque or thick clouds such as cumulus and altostratus (Fig. 1a–c), with an overall accuracy (i.e. the fraction of correct classifications) of 80.0%. HRV clouds are classified as opaque cloud by CALIPSO with a success ratio of 89.2% and a false alarm ratio of 10.8%. Thin or transparent cloud layers like cirrus and altocumulus are often not detected in the HRV cloud mask (Supplementary Tables 1 and 2). This is mainly a result of the threshold set in the HRV algorithm, where transparent clouds were preferably ignored, as they are not assumed to be affected by vegetation but rather by atmospheric conditions. In addition, only 34.0% of the CALIPSO opaque clouds are classified as clouded by HRV; the missed clouds are often classified as opaque by CALIPSO but visually appear relatively thin (Supplementary Fig. 2). Combined with the high success ratio, this suggests that HRV generally has a higher threshold for cloud detection than CALIPSO.
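The validation scores quoted above follow from a standard two-by-two contingency table. The sketch below is a minimal illustration, not the authors' validation code: the function name and the pixel counts are hypothetical, and CALIPSO opaque cloud is assumed to be the reference class.

```python
def cloud_mask_scores(hits, false_alarms, misses, correct_negatives):
    """Contingency-table scores for a binary cloud mask against a reference.

    hits: cloudy in both; false_alarms: cloudy in the mask only;
    misses: cloudy in the reference only; correct_negatives: clear in both.
    """
    total = hits + false_alarms + misses + correct_negatives
    detected = hits + false_alarms
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (hits + correct_negatives) / total,
        # Fraction of mask clouds confirmed by the reference.
        "success_ratio": hits / detected,
        # Fraction of mask clouds not confirmed (1 - success ratio).
        "false_alarm_ratio": false_alarms / detected,
        # Fraction of reference clouds found by the mask.
        "hit_rate": hits / (hits + misses),
    }


# Hypothetical counts, chosen for illustration only (not data from the study).
scores = cloud_mask_scores(hits=34, false_alarms=4, misses=66, correct_negatives=246)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

A high success ratio combined with a low hit rate, as reported above, is the signature of a conservative mask: what it flags is usually cloud, but it leaves many reference clouds unflagged.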

The validation is based on three CALIPSO Vertical Feature Mask (VFM) cloud type profiles on 10/03/2015, 10/07/2007 and 29/06/2009 (a–c). The colours indicate different cloud types, where 1 = low overcast (transparent), 2 = low overcast (opaque), 3 = transition stratocumulus, 4 = low, broken cumulus, 5 = altocumulus (transparent), 6 = altostratus (opaque), 7 = cirrus (transparent) and 8 = deep convective (opaque). The white feature in the background indicates the cloud fraction according to the High Resolution Visible (HRV) algorithm within the scanning line on the same days at 13:30. The corresponding scanning line (red lines), HRV cloud mask (red shading) and HRV image are shown in (d–f). The standard 3 km MSG cloud mask (CLMK) (red shading) is shown in (g–i). The overall cloud cover frequency (CCF) is computed with the HRV cloud mask (j) and MSG cloud mask (k), based on roughly 321,200 images. Note that the longitude and latitude are switched in (d–i) to allow comparison to (a–c).
In addition, the HRV cloud mask is compared to the standard cloud mask provided by MSG44, hereafter called the MSG cloud mask. The MSG cloud mask is computed using a threshold method based on most of the MSG channels, including thermal infrared data, and has a 3 km spatial resolution. The MSG cloud mask generally shows clouds at a similar location as the HRV cloud mask, although it often includes more transparent clouds (Fig. 1d–i). The overall cloud cover frequency (CCF), defined as the frequency of cloud cover occurrence at a certain location, calculated with the HRV cloud mask shows similar spatial patterns to the cloud cover frequency obtained from the MSG cloud mask (Fig. 1j, k). Both show a clear north-south gradient in cloud cover corresponding to the increasing aridity in the northern part of the study area. In addition, both products show enhanced cloud occurrence at orographic regions such as the Atakora Mountains (Benin) and the Jos Plateau (Nigeria). Although the overall cloud cover frequency computed with the HRV cloud mask is generally lower (typically 67%) than that calculated with the MSG cloud mask, the spatial pattern is similar, with a coefficient of determination (r2) of 0.87 (Supplementary Fig. 3). The lower HRV frequency is possibly due not only to the lower threshold for cloud detection of the MSG cloud mask, which detects both transparent and opaque clouds, but also to its lower spatial resolution, as mixed pixels result in more detections at a coarser resolution.
Spatial and diurnal variations of vegetation-cloud relationships
Because we expect that land restoration will result in an increased amount of green vegetation45, we first study the relationship between vegetation greenness, represented by the Normalized Difference Vegetation Index (NDVI)46, and cloud occurrence, obtained from the HRV images. We focus on two subregions within the study area to study vegetation-cloud relationships in more detail. Selecting smaller subregions reduces the influence of variations in aridity and elevation within the study area (Supplementary Fig. 1). Subregion I is located on the border of Benin, Niger, and Burkina Faso, on the northern edge of the transnational W-Arly-Pendjari Complex (12.2–12.7°N, 1.8–3.3°E), one of the largest protected areas in West Africa. The vegetation inside the protected area mainly consists of natural vegetation such as grasslands, savannah shrublands and gallery forests, while the regions outside the protected areas contain an increasing amount of cropland. The climate is dry, with a wet season from July to September47. The total annual precipitation ranges between 654 mm in the north to 792 mm in the south, with an average of 728 mm, based on CHIRPS data48. Previous research has shown that vegetation heterogeneities can affect boundary layer dynamics and cloud formation in this region49. Subregion II is located in Nigeria and contains several small protected areas (10.8–11.3°N, 10.0–11.5°E). This region mainly consists of Sudanian savanna vegetation and cropland. Subregion II receives slightly more precipitation than Subregion I (between 767 and 1073 mm per year, with an average of 848 mm). Both subregions are roughly 55 × 160 km in size and contain sharp vegetation boundaries. However, the areas differ in the scale and heterogeneity of the vegetated areas (Fig. 2a), providing information on the scale-dependence of vegetation-cloud relationships. 
To study the effect of vegetation on cloud occurrence, we divide both subregions into green areas, with a mean NDVI higher than 0.38, and less green areas, with a mean NDVI smaller than 0.38. These boundaries roughly correspond to the land cover boundaries. In addition, we distinguish between CCF, where the cloud cover is averaged over time, and fractional cloud cover (FCC), where the cloud cover is averaged over space.
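The distinction between CCF and FCC comes down to the axis over which a stack of binary cloud masks is averaged. The sketch below is a minimal illustration with a made-up stack of three tiny masks; the function names are hypothetical and the data are not from the study.

```python
def cloud_cover_frequency(masks):
    """CCF: per-pixel fraction of time steps flagged as cloudy (average over time)."""
    n_steps = len(masks)
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(mask[r][c] for mask in masks) / n_steps for c in range(n_cols)]
        for r in range(n_rows)
    ]


def fractional_cloud_cover(mask):
    """FCC: fraction of pixels flagged as cloudy in one image (average over space)."""
    pixels = [value for row in mask for value in row]
    return sum(pixels) / len(pixels)


# Toy stack of three 2 x 4 cloud masks (1 = cloud, 0 = clear); illustrative only.
masks = [
    [[1, 1, 0, 0], [1, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0]],
    [[1, 1, 1, 0], [1, 1, 0, 0]],
]

ccf = cloud_cover_frequency(masks)                   # one map for the whole period
fcc = [fractional_cloud_cover(m) for m in masks]     # one value per image
print(ccf)
print(fcc)
```

CCF therefore yields a map that can be compared with the NDVI map, while FCC yields a time series for a chosen area.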

Yearly mean Normalized Difference Vegetation Index (NDVI) between 2004/01/01 and 2024/01/01 in the study region (a). MSG HRV snapshots for subregion I (b–e) and subregion II (f–i) as indicated by the red boxes in (a). Animations of cloud development for the corresponding days are included in Supplementary Movies 1–8. The images are selected based on time steps that show a high difference in cloud cover fraction between regions with a high and low vegetation. They serve as illustration of days with a high connection between vegetation and cloud cover.
Starting with a visual inspection of the original HRV images in Subregion I, multiple days exist that illustrate an apparent connection between cloud occurrence and vegetation greenness, with pronounced cloud development over the green area of the W-Arly-Pendjari Complex (Fig. 2b–e). Similar, but less pronounced results can be seen in Subregion II (Fig. 2f–i). During these days, the clouds often develop at the beginning of the afternoon around the edge of the green region, and quickly move to the less green areas (Supplementary Movies 1–8). Later in the afternoon, clouds are also present above the less green regions, lowering the clear difference in cloud cover between the areas with low and high vegetation greenness. Days with higher cloud cover over the less green areas are also observed (Supplementary Fig. 4).
Considering the whole 20-year period, the difference in fractional cloud cover between the green and less green regions (ΔFCC) is often small and positive, suggesting there is a tendency for more clouds above the green area. In Subregion I, ΔFCC is positive for 30.1% of the time, and negative for 20.1% of the time. For 49.7% of the time, there is no difference in cloud cover between the green and less green areas, which usually occurs during times when clouds are absent. In Subregion II, the results are similar to Subregion I, with a positive and negative ΔFCC for, respectively, 31.3% and 21.4% of the time (Supplementary Fig. 5). High positive values of ΔFCC, when the clouds are mostly present over the green area, appear mainly between July and September, in the afternoon and early morning (Supplementary Fig. 6). Interestingly, in Subregion II, the number of positive values of ΔFCC decreases in the late afternoon, while the occurrence of negative values of ΔFCC increases.
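The shares of time with positive, negative and zero ΔFCC can be tallied as in the sketch below. It assumes that one FCC value per image is already available for the green and the less green area; the function name and the four-image series are hypothetical.

```python
def delta_fcc_shares(fcc_green, fcc_less_green):
    """Share of images with positive, negative and zero dFCC (green minus less green)."""
    deltas = [g - l for g, l in zip(fcc_green, fcc_less_green)]
    n = len(deltas)
    return {
        "positive": sum(d > 0 for d in deltas) / n,
        "negative": sum(d < 0 for d in deltas) / n,
        # Typically cloud-free images, where both areas have FCC = 0.
        "zero": sum(d == 0 for d in deltas) / n,
    }


# Illustrative FCC series for four images (not data from the study).
fcc_green = [0.75, 0.0, 1.0, 0.25]
fcc_less_green = [0.0, 0.0, 0.25, 0.5]
shares = delta_fcc_shares(fcc_green, fcc_less_green)
print(shares)
```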
Over the whole study period (2004–2024), the cloud cover frequency (CCF) is higher over the green areas, with a positive spatial correlation (r2) between NDVI and CCF for both Subregion I (0.69) and Subregion II (0.46). Including all months and hours of the day, the absolute (relative) difference between the green and less green areas is 0.01 (8%) in Subregion I and 0.01 (5%) in Subregion II. The difference in CCF is significant (p < 0.05) over all months, but is largest (in absolute terms) during the wetter and cloudier months (April–September) (Fig. 3a–f). The smaller difference in cloud cover in the dry months partly reflects the large number of cloud-free days. The relative difference in CCF, however, is highest in April and October. Between April and September, the overall mean CCF in the green area of Subregion I is 0.03 higher than outside the green area, a relative difference of 16%. Both subregions show this enhanced CCF over green areas during these months, although the difference is higher in Subregion I (0.03, 16%) than in Subregion II (0.02, 9%), even though the difference in mean NDVI is similar (0.09 in Subregion I and 0.10 in Subregion II). The higher CCF is consistent (and statistically significant) over the day, but generally more pronounced in the early afternoon (Fig. 3g, h). Interestingly, the CCF is especially high in the early morning and decreases towards noon. A similar diurnal trend is seen in the CCF calculated with the 3 km MSG cloud mask (Supplementary Fig. 7) and has been identified as nocturnal low-level stratus clouds that persist throughout the following day50,51,52.

April–September 12:00–15:45 UTC cloud cover frequency (CCF) in subregion I (a) and subregion II (b), based on ~58,560 individual HRV images. The subregions differ in the spatial scale of the green areas. The seasonal (e, f) and diurnal (g, h) evolution in CCF (boxes) and Normalized Difference Vegetation Index (NDVI) (lines) between areas with a high NDVI (green line, red boxes) and low NDVI (grey line and boxes) are shown for subregion I (c) and subregion II (d). The high NDVI regions are defined as having a yearly mean NDVI higher than 0.38. The dashed line shows the mean relative difference in CCF (ΔCCF). Seasonal CCF variations are calculated for 12:00–15:45 UTC, the diurnal variations for April–September. Boxes show the median (line), interquartile range (box) and 1.5 times the interquartile range (whiskers) of the data. Note that one map of CCF is calculated for each month (based on ~12,200 images) (e, f) or hour (based on ~14,640 images) (g, h) first and that the boxes only represent the spatial variation. Stars (*) on the x-axis in (e–h) indicate a significant difference in mean CCF between the high NDVI and low NDVI areas (p < 0.05, using the Mann–Whitney U test). All hours and seasons show a significant difference.
Scale-dependent cloud cover enhancement over protected areas
As a complete database of regions that have experienced land restoration is lacking for this region, we use the World Database on Protected Areas (WDPA) (Fig. 4a) as a substitute to study cloud enhancement from land restoration. Although we acknowledge that land restoration includes a wider range of practices than area protection alone, the protected areas show a consistent increase in NDVI over the past years that is larger than in areas that are not under protection (Supplementary Fig. 8). Such an increase is also expected under land restoration, which justifies the use of the WDPA data in this study. However, it should be noted that this does not necessarily imply that all the increases in NDVI are directly caused by land restoration or area protection, as processes such as woody encroachment may also contribute53.

Location of World Database on Protected Areas (WDPA) (opaque) and reference areas (semi-transparent) in the study region (transparent) (a). Difference in mean annual Normalized Difference Vegetation Index (NDVI) (b) and elevation (c) between the WDPA and reference areas. The difference in average April–September 07:00–15:45 cloud cover frequency (ΔCCF) between the WDPA and the corresponding reference area, per size of the WDPA area (d, e). The reference area consists of a 10 km buffer around the WDPA area, where overlapping WDPA areas are not considered. Points in (d) show ΔCCF for the individual areas. The green line in (d) shows a linear least-squares regression between ΔCCF and log(WDPA size), including the Pearson’s correlation coefficient (r2) and statistical significance (p) calculated with the Wald test. Boxes in (e) show the median (line/point), the interquartile range (box) and 1.5 times the interquartile range (whiskers) of the data grouped per 10-percentile of WDPA size. Each box contains 32 WDPA areas. The width of the boxes represents the range of WDPA sizes within the 10-percentile. Note that CCF is first averaged over time (based on ~131,760 individual images) and within the WDPA area and the reference areas, after which the ΔCCF is calculated. The boxes represent the variation in ΔCCF across WDPA areas only. Green boxes indicate that the median is significantly different from zero (p < 0.05, using the Wilcoxon signed-rank test). Results for a 5 km and 15 km buffer are respectively shown in Supplementary Figs. 10 and 11, illustrating a larger variation in ΔCCF with a larger buffer.
Across the study region, the protected areas have a generally higher NDVI than surrounding regions (Fig. 4b) and a slightly lower elevation (Fig. 4c). The April–September cloud cover frequency is enhanced over protected areas, although some of the smaller projects show a lower cloud cover frequency inside the protected areas than outside. On average, the April–September cloud cover frequency inside the protected areas is 0.02 (10.8%) higher than in surrounding areas. Interestingly, there is a significantly positive relationship (p = 0.002) between the project size and the degree of cloud cover enhancement (Fig. 4d), although the spread is large. The strongest enhancement of clouds is observed over larger protected areas, and the difference in cloud cover between the protected area and the surrounding area is significant for the 20% largest projects, with an area larger than 121 km2 (Fig. 4e). Although the level of spatial heterogeneity in itself affects cloud formation through mesoscale circulations, it should be noted that the NDVI difference between the protected and reference areas also increases with size (Supplementary Fig. 9), which may contribute to this size-dependent relationship.
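The size dependence described above rests on a least-squares fit of ΔCCF against the logarithm of protected-area size. The sketch below is a minimal illustration with made-up values; the function name is hypothetical and base-10 logarithms are an assumption, since the base is not stated. It does not compute the p value, which in practice could come from a routine such as `scipy.stats.linregress`.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with r squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical protected-area sizes (km2) and dCCF values; illustrative only.
sizes_km2 = [1, 10, 100, 1000]
delta_ccf = [-0.005, 0.012, 0.018, 0.035]

log_size = [math.log10(s) for s in sizes_km2]   # assumption: base-10 logarithm
slope, intercept, r_squared = fit_line(log_size, delta_ccf)
print(f"slope per decade of size: {slope:.4f}, r2: {r_squared:.2f}")
```

With a log-transformed size axis, the slope reads as the change in ΔCCF per tenfold increase in area, which suits a sample of areas whose sizes span several orders of magnitude.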
Convective initiation
To further explore the potential of green areas to create these mesoscale circulations, we extend the analysis to consider the climatology of where deep convection is initiated. Triggering deep convection is important for any rainfall enhancement from land restoration, as precipitation totals in this region are dominated by deep convective systems54. The locations of convective initiations are identified as rapidly cooling MSG pixels which reach a temperature threshold of −40 °C (see Methods)40. This results in 40,169 point locations of convective initiations between 10:00 and 16:30 UTC over the whole study area, corresponding to 11:00 and 17:30 local time in Nigeria, Niger and Benin. In Subregion I, there is a pronounced difference in the total number of convective initiations between green areas (with an NDVI higher than 0.38) and less green areas (Fig. 5a, c, e). In Subregion II, convection is initiated above the larger green areas as well, although a considerable number of convective initiations occur at the boundaries of the smaller green areas at the centre of the subregion (Fig. 5b, d, f).
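The idea of flagging a rapidly cooling pixel that reaches −40 °C can be sketched for a single pixel time series as follows. Only the −40 °C threshold comes from the text; the minimum cooling per time step, the function name and the example series are hypothetical placeholders, and the actual criteria are those of the Methods and ref. 40.

```python
CI_THRESHOLD_C = -40.0          # temperature threshold stated in the text
MIN_COOLING_C_PER_STEP = 4.0    # hypothetical placeholder, not taken from the study


def first_initiation_index(temps_c, min_cooling=MIN_COOLING_C_PER_STEP,
                           threshold=CI_THRESHOLD_C):
    """Index of the first time step at which the series crosses the threshold
    from above while cooling by at least min_cooling since the previous step.
    Returns None if no such step exists."""
    for i in range(1, len(temps_c)):
        previous, current = temps_c[i - 1], temps_c[i]
        crossed = previous > threshold >= current
        if crossed and (previous - current) >= min_cooling:
            return i
    return None


# Illustrative cloud-top temperature series (deg C) at successive time steps.
fast_cooling = [-5.0, -12.0, -25.0, -38.0, -46.0, -55.0]
slow_cooling = [-30.0, -36.0, -39.0, -41.0, -42.0]

fast_index = first_initiation_index(fast_cooling)
slow_index = first_initiation_index(slow_cooling)
print(fast_index, slow_index)
```

The second series also reaches −40 °C but cools too slowly under the assumed rate, so it is not flagged; this separates newly growing convective cells from cold cloud that drifts or thickens gradually.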

Mean annual Normalized Difference Vegetation Index (NDVI) (a, b), locations of individual convective initiations (c, d) and total number of convective initiations gridded with a resolution of 0.11 degree (~12.3 km) (e, f) between 2004 and 2023 in Subregion I (left) and Subregion II (right). Black lines indicate the contours where the mean NDVI is 0.38, to distinguish between high and low vegetation areas. The colours in (c, d) show the distance to this contour. g, h show the number of convective initiations per distance to the NDVI contour boundary, relative to the total area with the same distance to the boundary. The lighter shades of grey in (h) include points of convective initiation over regions with topographical differences higher than 250 m over a distance of 25 km in all directions. The histograms in (g) and (h) are based on 2389 and 3952 moments of convective initiation, respectively. Negative (positive) distances indicate that the NDVI is higher (lower) than 0.38. The data include convective initiations between 10:00 and 16:30 UTC over all months of the year.
To reconcile with previous research, we studied the distance of the convective initiation to the boundaries in vegetation greenness (where NDVI = 0.38). Both in Subregion I and Subregion II, the relative number of convective initiations in this dataset does decrease further from the boundary, at least on the less green side (Fig. 5g, h). On the green side of the boundary, the relative number of initiations is highest around 10 km from the boundary, but decreases towards the boundary and further inside the green area. It should be noted, however, that vegetation often co-varies with topography55. In Subregion II, for example, a number of convective initiations are located over regions that have both a high vegetation greenness and elevational differences, making it difficult to separate the effect of these variables on convective initiation. In Subregion I, topography is expected to have a limited effect on convective initiation due to the lower variations in elevation (Supplementary Fig. 12).
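Because the area available at each distance from the NDVI boundary differs, initiation counts are expressed relative to that area. The sketch below is a minimal illustration of such a normalisation; the 5 km bin width, the use of pixel counts as a proxy for area, the function name and all distances are assumptions made for the example.

```python
import math
from collections import Counter


def initiations_per_area(ci_distances_km, pixel_distances_km, bin_width_km=5.0):
    """Initiations per pixel in each signed-distance bin.

    Negative distances lie on the green side of the NDVI boundary, positive
    distances on the less green side. Keys are the left edges of the bins.
    """
    def bin_of(distance):
        return math.floor(distance / bin_width_km)

    ci_counts = Counter(bin_of(d) for d in ci_distances_km)
    area_counts = Counter(bin_of(d) for d in pixel_distances_km)
    return {
        b * bin_width_km: ci_counts.get(b, 0) / area_counts[b]
        for b in sorted(area_counts)
    }


# Illustrative signed distances (km); not data from the study.
pixel_distances = [-12, -8, -7, -3, -2, 2, 3, 7, 8, 12]   # every pixel in the subregion
ci_distances = [-9, -8, -7, -2, 3]                        # pixels with an initiation

density = initiations_per_area(ci_distances, pixel_distances)
print(density)
```

Without this normalisation, distance classes that cover more land would show more initiations simply because they are larger.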
Discussion
In this study, data from the MSG High-Resolution Visible broadband channel is used to study the effect of land restoration on cloud formation in West Africa at a 1 km spatial resolution. Although the applied algorithm only uses information from the visible range of the spectrum, the results show a high similarity to both the standard MSG cloud product and opaque and thick cloud types derived from CALIPSO scanning lines. Zooming in to two subregions in the protected W-Arly-Pendjari Complex (Subregion I) and smaller protected areas in northern Nigeria (Subregion II), we observe enhanced cloud formation above green areas with a high NDVI, especially between April and September. Although the absolute difference is highest in August, the relative difference is especially high, up to 25%, in April and October, just before and after the wet season. This is confirmed by visual evidence of days where the cloud cover shows a high spatial resemblance to the vegetation greenness. The increased influence of the surface properties on cloud formation at the beginning and end of the wet season has also been observed in previous studies32,56. During the dry season, the atmosphere is too dry for clouds to form, independent of surface conditions, while during the wet season, clouds form relatively easily over both green and less green areas57. In addition, during the core of the wet period in July and August, the connection between the surface and the atmosphere is expected to be slightly weaker compared to the other months in the wet season because evaporation is less limited by water availability58,59.
Several mechanisms could contribute to cloud enhancement or inhibition over green areas. A relatively low albedo and high surface roughness in the green area (Supplementary Fig. 13c, d) will increase the net radiation and the sensible and latent heat fluxes, consistent with estimated long-term averages of these fluxes provided by Land Surface Analysis data based on MSG SEVIRI60,61 (Supplementary Fig. 13e–j), which promotes boundary layer growth and cloud formation24,62,63. At the same time, the expected higher evapotranspiration within the green area27 can provide an extra input of atmospheric moisture and lower the lifting condensation level, favouring cloud formation over regions with high evaporation57,64 on the condition that the planetary boundary layer growth driven by the sensible heat flux is sufficiently large. On shorter time scales, variations in soil moisture also contribute to conditions with strong land-atmosphere coupling, both by providing a source of atmospheric moisture and by affecting the magnitude of the sensible heat flux. In West Africa, enhanced cloud formation is often observed over regions with a negative soil moisture anomaly due to the enhanced sensible heat flux9,31.
Inhibition of clouds over green areas through heterogeneities in vegetation has been observed in other areas, when differences in turbulent fluxes and surface roughness between the green areas and their surroundings trigger convection through mesoscale circulation and convergence65. If the sensible heat flux is lower over green areas than the neighbouring less green areas (due to a higher share of net radiation going to the latent heat flux), a forest breeze develops where moist air from the green areas is lifted by the higher sensible heat flux above the less green areas27. Combined with convergence due to differences in surface roughness, these thermally driven mesoscale circulations have been shown to enhance convective initiation above deforested patches in closed-canopy tropical forests where the sensible heat flux is higher over deforested patches30,49. The strength and occurrence of these circulations depend on atmospheric conditions and the scale of the deforested patches66. Also in the Sahel, thermally-driven circulations have been shown to have a pronounced impact on convective initiation, with enhanced cloud development over areas with high sensible heat flux32.
Whether clouds are enhanced over green or less green areas depends on the relative contribution of the above processes. On a global scale, cloud enhancement over green areas is most likely when the green area has a higher sensible heat flux than neighbouring areas, and vice versa23. Also in this study, we observe a significant cloud cover enhancement over the largest protected areas, which could be caused by differences in sensible heat flux and spatial heterogeneity. In contrast to previous research, however, we find convective initiation mainly on the green side of the boundary. Garcia-Carreras et al.49, for example, showed through aircraft measurements over the southern edge of the W-Arly-Pendjari Complex a tendency for convection on the non-forested and warmer side of vegetation heterogeneity. Yet, as many studies suggest enhanced convection over the warm side of the boundary, conditions with a higher sensible heat flux over the greener area, either due to the low albedo or high surface roughness, could explain our results23 and the apparent discrepancies with previous research. Global data suggest that the average sensible heat flux is indeed higher over woody savanna and savanna regions than over grasslands and croplands67, although comparative measurements of surface fluxes between vegetation types in West Africa are limited. Regarding circulations induced by differences in surface roughness, convergence is expected to be largest on the upwind side of the green area32,68 on the south side of the W-Arly-Pendjari Complex (Supplementary Fig. 13m, n). However, topography complicates the analysis of the link between vegetation and convective initiation in that region.
Although all the above mechanisms likely contribute to cloud formation to some degree, we are unable to quantify the relative contribution of these mechanisms from observations, due to a lack of reliable information on the effect of vegetation types and greenness on the sensible and latent heat fluxes in West Africa. More in-depth modelling studies or field measurements are needed to provide more insight into the precise mechanism of cloud enhancement in this study region during these specific days of cloud development, because it remains uncertain how the sensible heat flux responds to changes in vegetation cover in West Africa or similar climate zones. Yet modelling also comes with uncertainty in data-scarce regions.
Although this study is mainly a spatial comparison between regions with low and high vegetation greenness, the results suggest that land restoration can affect cloud formation in West Africa if the vegetation cover, or heterogeneity therein, is increased. However, as the differences in NDVI over space may be larger than the attainable increase in NDVI over time due to restoration, we expect the cloud cover effect of land restoration (i.e. a change in vegetation over time) to be smaller than the effects of spatial differences found in this study. Unfortunately, climate change and variability between years make analysing trends in cloud cover challenging. Running land restoration scenarios with weather or climate models is therefore needed to address these uncertainties.
It is estimated that mesoscale convective systems provide as much as 90% of the rainfall in the Sahel69. In addition, a considerable amount of rainfall in the Sahel (10–40%)70 and Africa (50%)71 originates from vegetation-based evaporation. Yet, more research is needed to determine to what extent the enhanced cloud formation results in deep convection and rainfall, within a region or elsewhere, and to what extent protected areas may have an effect on the total rainfall and water availability in a larger region9. On top of that, it remains unclear how the changing climate will affect the land-atmosphere feedbacks in West Africa in the future8,72,73,74. Nevertheless, this research provides observational evidence that land restoration, especially in larger projects, can impact cloud formation in dryland regions in West Africa, which is especially relevant given the current implementation of projects within this region and worldwide.
Methods
Input data
In this study, several datasets are used (Table 1). The main analysis is based on the Spinning Enhanced Visible and Infrared Imager (SEVIRI) instrument on board the Meteosat Second Generation (MSG) satellites. Due to MSG’s geostationary position, it has a relatively high temporal resolution of 15 min, allowing for both a diurnal and seasonal analysis of cloud development. The cloud detection algorithm is based on the broadband high-resolution visible (HRV) channel, providing a single reflectance value for 0.4–1.1 μm at 1 km spatial resolution and 15-min temporal resolution. Daytime images between 2004/01/19 and 2024/01/01, 06:00 UTC to 16:45 UTC (07:00 to 17:45 local time in Nigeria, Niger and Benin), were selected, resulting in roughly 321,200 images.
Two cloud products are used for validation. The first product is the standard MSG Cloud Mask44 with the same temporal resolution and period as the HRV product, but with a 3 km spatial resolution. Each pixel is classified as either clear sky over water, clear sky over land, cloud, or no data, based on a threshold algorithm using the visible, near-infrared and infrared channels. The other cloud product is derived from the Vertical Feature Mask (VFM) data product42 produced from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) instrument onboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite41,43.
Vegetation data is obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI) product46, providing information on vegetation greenness. Although MODIS has a 250 m spatial resolution, we used the 1 km product here to match the spatial resolution of the HRV data. Shuttle Radar Topography Mission (SRTM) data75 is used as elevation data. Lastly, the World Database on Protected Areas polygons provide information on the location of protected areas within the study area39.
HRV cloud detection algorithm
For each 15-min time step, the HRV image is converted to a cloud mask using an algorithm based on Teuling et al.24. The basic principle of the algorithm is that clouds have a higher HRV reflectance (are brighter) than the Earth's surface, so that reflectance values above a certain threshold are classified as cloud and values below the threshold as clear sky. However, to correct for temporal variations in illumination across the study area, as well as mixed pixels and transparent clouds, the difference between the HRV reflectance and the clear-sky surface reflectance is compared to a threshold rather than the HRV reflectance itself. Thus, clouds are detected if:
$$r_{\mathrm{HRV}}(x,t) - r_{\mathrm{cs}}(x,m,h) > T$$
where rHRV is the reflectance of the HRV image at each location x and 15-min timestep t, rcs is the clear-sky reflectance of the surface at a specific location (x), month (m) and hour of the day (h), and T is the threshold.
The clear-sky surface reflectance is computed separately for each hour of the day and time of the year. Each month is therefore divided into three periods: two of 10 days and a final one of 8–11 days, depending on the month. This results in 396 time slices (12 months, 3 slices per month, 11 h per day) for which the clear-sky reflectance is computed. For each time slice, all images over the years are retrieved (~800 images), after which the smoothed empirical cumulative distribution function of each pixel is computed. The clear-sky surface reflectance corresponds to the reflectance value at which this function has the steepest slope (the most common or typical reflectance value). This method assumes that the clear-sky reflectance corresponds to the most common or typical reflectance value rather than the lowest reflectance value, to account for cloud shadows and variations in land cover. Because some areas in the south of the study region are mostly clouded between June and September, the typical reflectance maps are smoothed by taking the moving minimum value of five periods before and after each time step. The result is 396 images of typical HRV reflectance values, which are similar to the surface albedo (Supplementary Fig. 14).
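The clear-sky step described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the 0.01 bin width and the tie-breaking rule are hypothetical, and a binned histogram stands in for the smoothed empirical cumulative distribution function (the densest bin approximates the point of steepest slope). The moving-minimum smoothing over neighbouring periods is omitted.

```python
from collections import Counter

FIRST_HOUR_UTC = 6   # first hourly slot (06:00 UTC)
HOURS_PER_DAY = 11   # hourly slots from 06:00 to 16:00 UTC


def slice_index(month, day, hour_utc):
    """Map a date and hour to one of 12 * 3 * 11 = 396 time slices."""
    # Days 1-10, 11-20 and 21-end form the three periods of a month.
    period = min((day - 1) // 10, 2)
    return ((month - 1) * 3 + period) * HOURS_PER_DAY + (hour_utc - FIRST_HOUR_UTC)


def typical_reflectance(values, bin_width=0.01):
    """Return the centre of the most populated reflectance bin.

    ASSUMPTION: a histogram mode is used here in place of the steepest
    slope of the smoothed empirical cumulative distribution function.
    """
    if not values:
        raise ValueError("need at least one reflectance value")
    counts = Counter(int(v // bin_width) for v in values)
    # Ties are resolved towards the darker bin (hypothetical choice).
    best_bin = min(counts, key=lambda b: (-counts[b], b))
    return (best_bin + 0.5) * bin_width
```

In this sketch, a few bright (cloudy) or dark (shadowed) observations do not shift the result, because only the most frequently occupied reflectance bin is returned.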
Next, the threshold value is determined. As the overall reflectance of the image depends on the hour of the day, it is expected that the difference between cloud reflectance and clear-sky reflectance varies over the day. Therefore, the threshold T(h) depends on the average clear-sky reflectance at a certain hour of the day (h):
where s is a scaling parameter that can be calibrated. Higher values of s result in a lower number of clouds detected. To calibrate s, 264 random HRV images are selected (with 2 images for each month and hour of the day), for which the cloud masks with different values of s are computed. Visual comparison to the original HRV image is used to choose s in such a way that thick (e.g. cumulus) clouds are detected as clouds, but thin or transparent (e.g. cirrus) clouds are detected as clear. The selected value is validated against another 264 random HRV images. An s-value of 0.7 is used throughout this manuscript, but a comparison to values of 0.5 and 0.9 is included in Supplementary Fig. 15. Although using these values results in, respectively, higher and lower overall cloud cover frequencies, the spatial patterns remain fairly similar.
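To make the role of s concrete, the following sketch applies the detection rule to a handful of pixels. The exact form of T(h) is not recoverable from the text here, so the code assumes, purely for illustration, that the threshold is s multiplied by the mean clear-sky reflectance at that hour; this is consistent with higher s giving fewer detected clouds, but it may differ from the form used in this study. All function names are hypothetical.

```python
def hourly_threshold(clear_sky_reflectances, s=0.7):
    """Threshold for one hour of the day.

    ASSUMPTION: T(h) = s * mean clear-sky reflectance at hour h.
    The functional form is illustrative, not taken from the study.
    """
    if not clear_sky_reflectances:
        raise ValueError("need at least one clear-sky reflectance value")
    mean_clear_sky = sum(clear_sky_reflectances) / len(clear_sky_reflectances)
    return s * mean_clear_sky


def is_cloud(r_hrv, r_clear_sky, threshold):
    """A pixel is cloud if it is brighter than clear sky by more than the threshold."""
    return (r_hrv - r_clear_sky) > threshold


def cloud_mask(hrv_pixels, clear_sky_pixels, s=0.7):
    """Classify each pixel of one image (flattened to a list) as cloud or clear."""
    t = hourly_threshold(clear_sky_pixels, s)
    return [is_cloud(r, cs, t) for r, cs in zip(hrv_pixels, clear_sky_pixels)]
```

Under this assumption, running `cloud_mask` with s = 0.5, 0.7 and 0.9 on the same image gives progressively fewer cloudy pixels, which mirrors the sensitivity comparison described above.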
After determining the cloud threshold values, all 321,200 HRV images are converted to an HRV cloud mask, where each location is classified as either cloud or clear sky. Although images from 06:00 UTC to 16:45 UTC were initially obtained, visual inspection determined that the images between 06:00 and 06:45 and between 16:00 and 16:45 are too dark, with too little contrast between clouds and clear-sky reflectance, to accurately determine the cloud mask with this algorithm (Supplementary Fig. 16). For this reason, these images are not included in further analysis. The remaining cloud masks are used to calculate the CCF, which can be defined as the fraction of cloud occurrence over time, at a certain location. FCC is used for the cloud occurrence at a certain time step, averaged over space.
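The two summary quantities defined above differ only in the axis over which the cloud masks are averaged. A minimal sketch, assuming each cloud mask is a small nested list of booleans (rows by columns) and using hypothetical function names:

```python
def ccf(masks):
    """CCF: per-pixel fraction of time steps classified as cloud."""
    if not masks:
        raise ValueError("need at least one cloud mask")
    n_steps = len(masks)
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(mask[i][j] for mask in masks) / n_steps for j in range(n_cols)]
        for i in range(n_rows)
    ]


def fcc(mask):
    """FCC: fraction of pixels classified as cloud in one time step."""
    cells = [cell for row in mask for cell in row]
    return sum(cells) / len(cells)
```

Averaging over time yields a map (one value per location), whereas averaging over space yields a time series (one value per image).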
Validation of the HRV cloud mask
The algorithm is validated in two ways. To validate large-scale patterns and seasonal variations in cloud cover frequency, the HRV cloud mask is compared to the standard MSG cloud mask. This cloud mask is readily available for the same temporal resolution, study period and study area as the HRV cloud mask created in this study, but at a 3 km resolution, and is derived from both visible and thermal infrared bands of SEVIRI MSG. The two cloud masks are compared visually, both for individual scenes and for the overall cloud cover frequency across the study area.
In addition, the created HRV cloud mask is compared to the VFM of CALIPSO. The VFM data describes vertical distributions of cloud and aerosol types along a scanning line. The scanning lines are located roughly between 1.5 and 2.5°E, running from south to north across the study area. The scanning lines have a return period of 16 days at around 13:30 UTC and are available from 2006 to 2023, resulting in 272 time steps. The cloud types at each scanning line are compared to the HRV cloud mask at 13:30 by calculating the cloud fraction of 20 grid cells around the scan line, accounting for potential differences between the observation times of CALIPSO and HRV.
Relationships to vegetation and protected areas
To evaluate the effect of land restoration on cloud formation, we compare the computed cloud masks to spatial and seasonal changes in vegetation. The study area is separated into green areas (with a mean annual NDVI higher than 0.38), and less green areas (with a mean annual NDVI lower than 0.38). This value is chosen to roughly correspond to land cover boundaries in the study area. In addition, we compare cloud cover frequency to locations of protected areas from WDPA. Before calculating the size of the protected areas, adjacent areas are merged. A reference area is created for each merged protected area by creating a 10 km buffer around the boundary of the protected areas. If the buffer overlaps with another protected area, this region is removed from the reference area. Next, the mean cloud cover frequency within each protected area and reference area is calculated, to determine the effect of area protection on cloud occurrence.
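The comparison between a protected area and its reference area can be sketched on a raster as follows. This is a simplified illustration rather than the procedure used in this study: it assumes a hypothetical label grid on the 1 km pixel grid (0 for unprotected land, a positive integer for each merged protected area) and approximates the 10 km polygon buffer by the distance between pixel centres.

```python
import math


def cells_of(labels, area_id):
    """Grid cells (row, col) belonging to one merged protected area."""
    return [
        (i, j)
        for i, row in enumerate(labels)
        for j, value in enumerate(row)
        if value == area_id
    ]


def reference_cells(labels, area_id, buffer_cells=10):
    """Unprotected cells within `buffer_cells` of the given protected area."""
    inside = cells_of(labels, area_id)
    reference = set()
    for i, row in enumerate(labels):
        for j, value in enumerate(row):
            if value != 0:
                continue  # cells of this or any other protected area are excluded
            if any(math.hypot(i - a, j - b) <= buffer_cells for a, b in inside):
                reference.add((i, j))
    return reference


def mean_over(grid, cells):
    """Mean of grid values over a collection of (row, col) cells."""
    cells = list(cells)
    return sum(grid[i][j] for i, j in cells) / len(cells)


def protection_effect(ccf_map, labels, area_id, buffer_cells=10):
    """Mean CCF inside the protected area minus mean CCF in its reference area."""
    inside = cells_of(labels, area_id)
    outside = reference_cells(labels, area_id, buffer_cells)
    return mean_over(ccf_map, inside) - mean_over(ccf_map, outside)
```

A positive value from `protection_effect` would indicate more frequent cloud cover over the protected area than over its surroundings. The brute-force distance search is only practical for small grids; a distance transform or a vector buffer would be the more usual choice at full resolution.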
Relationships to convective initiation
Lastly, the vegetated areas are compared to the location of convective initiations, to determine if convection is more likely to occur over vegetated areas. Locations of convective initiation are obtained following the approach from Taylor40, with some minor adjustments. The MSG 10.8 μm channel is used to identify the emergence of pixels with a brightness temperature of −40 °C or less every 15 min. A minimum cooling rate for the coldest nearby pixel within a radius of 30 km is applied to images over the preceding hour to ensure that the initiation is due to a rapidly deepening cloud, and to remove cases where cold clouds propagate into an area. A cooling rate of 10 °C per h, applied to pixels within 30 km of the initiation, is sufficient to create a large dataset of independent initiations. Regions with strong topography are defined as having an elevation difference larger than 250 m within a circle with a diameter of 50 km40 (a sensitivity analysis with elevation differences larger than 100 m is shown in Supplementary Fig. 17). No filtering is applied for large water bodies, as they are largely absent from the studied region. This results in a list of points representing the location of convective initiation. For each point, the distance to the contour line where the NDVI is equal to 0.38 is calculated. Because a larger area is in principle more likely to have a higher number of convective initiations than a smaller area, the total number of points within a certain 5 km bin is divided by the total area that is located within this distance bin. This accounts for potential differences in surface area between the distance bins. This way, it can be determined if convection is more or less likely to initiate over greener areas, or at the boundary.
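The area normalisation in the last step can be sketched as follows. The sketch assumes that distances to the NDVI = 0.38 contour have already been computed, both for the initiation points and for every pixel of the study area, and that they are signed (positive on the greener side); the sign convention, the 1 km² pixel area and the function name are assumptions made for illustration.

```python
import math
from collections import Counter


def initiation_density(initiation_km, pixel_km, pixel_area_km2=1.0, bin_km=5.0):
    """Initiations per km2 in each distance bin from the NDVI contour.

    initiation_km: signed distance of each initiation point to the contour.
    pixel_km: signed distance of each study-area pixel to the contour.
    ASSUMPTION: positive distances lie on the greener side of the contour.
    Returns {bin_start_km: initiations per km2}, ordered by distance.
    """
    def bin_start(distance):
        return math.floor(distance / bin_km) * bin_km

    counts = Counter(bin_start(d) for d in initiation_km)
    areas = Counter()
    for d in pixel_km:
        areas[bin_start(d)] += pixel_area_km2
    # Bins without any pixel have no area and are therefore not reported.
    return {b: counts[b] / area for b, area in sorted(areas.items())}
```

Two bins with the same density can thus hold different raw counts: in this sketch, a bin with two initiations over 10 km² and a bin with one initiation over 5 km² both yield 0.2 initiations per km².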
Responses