Changes in global terrestrial live biomass over the 21st century
Natural climate solutions (NCSs) have become a major focus of the climate mitigation goals recognized by the science community and a wide range of stakeholders and governments under the Paris Agreement (1, 2). The underlying justification of NCS is based on the fact that terrestrial ecosystems, especially forests, are shown to be a substantial and growing sink of atmospheric CO2 for decades (3, 4). Current estimates suggest that terrestrial ecosystems release 10 to 20% of the total global CO2 to the atmosphere and sequester about 30% annually (5–8), and both terms having large spatial and temporal uncertainties. These uncertainties are associated with the complexity of estimating two main land fluxes: first, the flux of carbon from land use, land-use change, and forestry (LULUCF) (∆CLuc) that is estimated to be a net source of carbon [+1.6 ± 0.7 petagrams carbon (PgC) year−1] to the atmosphere, and second, the flux of carbon driven by environmental effects on terrestrial ecosystems (∆CEnv) that is estimated to be a net carbon sink (−3.1 ± 1.2 PgC year−1) (9).
Many processes affect these carbon fluxes that occur over multiple spatial and temporal scales. In the absence of any anthropogenic disturbance, the net carbon budget of forests is due to a balance between the carbon uptake (e.g., photosynthesis, tree growth, and accumulation in soils) and carbon release (e.g., respiration of living biomass and decay of necromass) (10). However, carbon loss (source) and gain (sink) due to direct human (e.g., deforestation, degradation and wood harvesting, and secondary forest regeneration or afforestation) and indirect environmental (e.g., droughts, storms, changes from CO2 fertilization, and N deposition) factors change this balance globally and even more substantially regionally and temporally (11–13). Inventory data suggest that the carbon gain in the terrestrial biosphere is mainly in forest ecosystems, but the partitioning of fluxes between the northern ecosystems (14) and the tropics (15) remains uncertain. Process-based models that include the effect of the CO2 fertilization equally distribute the net carbon gain between tropical and northern ecosystems (8, 16). Interhemispheric atmospheric CO2 analyses indicate that the tropics is either carbon neutral or a small net source (17), while the northern ecosystems are an increasing net sink (3, 18–20). Resolving these discrepancies from multiple approaches requires reducing the uncertainty in estimates of carbon loss and gain regionally.
The Global Carbon Project (GCP) provides annual estimates of the ∆CLuc from three book-keeping models (9) and the ∆CEnv as a residual of a global carbon balance approach or from several dynamic global vegetation models (DGVMs) that simulate effects of environmental changes (9). These estimates are used widely by the science communities but still remain uncertain because of poor spatial information for capturing regional variations of carbon stocks and land-use change and the coupling of both fluxes (environmental factors influence ∆CLuc) (3, 21). The unresolved uncertainties point to the need for improved spatial and temporal greenhouse gas (GHG) inventory of terrestrial carbon (14, 21).
There are two approaches for GHG inventory, namely, the gain loss and the stock change (1). The gain-loss approach is often used in countries without national forest inventory (NFI) data and emissions, and removals are calculated by multiplying the areas of classes of land-use change (activity data) with the changes of carbon stocks for those classes (emission factors). The stock-change approach is adopted by countries with NFI and provides estimates of net fluxes from the difference of carbon stocks between two points in time. Both approaches must produce comparable results, with stock change having considerably less uncertainty (22). Remote sensing observations have substantially improved these approaches by providing spatial information on forest disturbance (23, 24) and improving local or regional carbon loss of live biomass (5, 25, 26). However, quantifying carbon gain is more difficult because of the slow recovery of forests and the lack of systematic biomass inventories in all vegetation types (27, 28). A recent study based on the gain-loss approach combined with maps of forest carbon stocks and activity data has provided estimates of temporally averaged global forest carbon emissions and removals from 2000 to 2019 (29). This approach shows notable improvements over previous estimates by integrating spatially explicit datasets but still lacks temporal fluxes and has large uncertainty in estimates of carbon removal.
Here, we present a spatial bottom-up framework to estimate the annually resolved live biomass of global terrestrial ecosystems for this century (2000–2019) (fig. S1). The framework is based on a self-improving machine-learning (ML) algorithm that reduces estimation uncertainty of the stock change as inventory data on forest biomass from ground and remote sensing observations improve over time (see Materials and Methods). We used extensive forest inventory (>100,000 plots) distributed mostly in boreal and temperate regions, airborne laser scanning (ALS) data across global tropical forests (>1 million ha), and satellite lidar inventory of global vegetation height structure (>8 million sample footprints). The individual forest inventory and lidar measurements were converted to above and below-ground biomass from allometric models and used to estimate mean and variance of live biomass carbon density at an optimum grid cell of 100 km2 (10 km × 10 km). These grid cell biomass values were used as training data in the ML algorithm to estimate biomass carbon density from systematic time series observations from microwave and optical satellite imagery from 2000 to 2019 (see Materials and Methods). The training data were matched with the date of satellite observations to develop temporally consistent ML models that were then applied to the entire time series imagery. Following the stock-change approach, the difference of maps for every 2 years was used to calculate time series of stock changes (∆C = Cyear1 − Cyear2), which is considered negative when carbon stock increases (sink) and positive when it declines (source). Our approach only includes live biomass carbon stock (above and below) and ignores other carbon pools (soil, dead organic matter) but remains consistent with an Intergovernmental Panel on Climate Change (IPCC) Tier 3 approach by estimating carbon stock change through spatiotemporal biomass sampling. We lastly combine the activity data from satellite-based estimates to report annual emissions and removals globally.
Regional patterns of carbon stock changes
The carbon (C) stock in the world’s live biomass is estimated to be 381 ± 2 PgC (averaged over 2000–2019) (Fig. 1) across all terrestrial ecosystems (forests and nonforests). The uncertainty represents one SE and includes pixel-level prediction uncertainty, spatial autocorrelation (30), and the modeling uncertainty. The uncertainty increases to approximately 10% of the mean (30 to 40 PgC) when including errors associated with the reference ground data used in developing the models (see Materials and Methods). Total carbon stocks varied about 8 PgC (377 to 385) globally (relatively ~2% of total) from 2000 to 2019 but with larger relative variations at local and regional scales from contemporaneous disturbance and recovery processes. The largest year-to-year changes relative to the long-term mean, represented by the coefficient of variation (CV), are in regions of low biomass density (≲100 Mg ha−1) where land-use activities and natural disturbances are frequent and widespread (Fig. 1B).
Globally, woody carbon stocks are increasing slowly with an average annual gain of 0.23 ± 0.09 PgC year−1 (Fig. 2). The long-term estimated net gain of carbon in live biomass from our method is larger than the results from inventory data (31) and within the range of variability of estimates from other methods including process-based models (16), regional carbon balance from accounting all emissions and uptakes (32–34), and top-down estimations (17, 20). Our estimates of carbon gain and loss across all vegetation types (fig. S2 and table S1) indicate that pixels with significantly net gains are 1.4 times more numerous than pixels with net losses. Regionally, net carbon gains dominate landscapes of western conifers and boreal forests of North America, tropical forests in Africa, including subtropical forests in eastern China, and the boreal forests of eastern Siberia (Fig. 2, A and B). Because of the slow-in-fast-out characteristic of the biomass carbon pool, the losses are instantaneous and can be estimated at smaller scales but gains, especially in intact forests, are slow and can only be detected on decadal time scales and at larger areas due to the pixel level biomass dynamics and the estimation uncertainty (8). Between 2000 and 2019, carbon accumulation in terrestrial ecosystems is largely reflected in the increase of the carbon density of the remaining forests rather than the total carbon storage (Table 1).
Among global ecosystems, moist tropical forests store the most biomass carbon (~154 ± 1 PgC), about 40% of the global total (Fig. 1, A and C, and table S2), and their 20-year carbon stock trend remains approximately neutral with a small net loss (0.05 ± 0.03 PgC year−1) (Fig. 2C). This trend is in agreement with inventory estimates of 0.07 PgC year−1 net loss in all carbon pools from 1990 to 2007 (14). Tropical moist forests in South America are a net carbon loss of 0.05 ± 0.02 PgC year−1 (Fig. 2B and fig. S3), losing carbon at a rate of about 0.6% per decade, likely attributable to deforestation, degradation, and recent droughts (35–37). About 18% of areas of intact forests in tropical Americas (77 million ha) are gaining carbon at a rate of 0.19 megagrams carbon (MgC) ha−1 year−1, and 20% (86 million ha) are losing carbon at a rate of 0.18 MgC ha−1 year−1; the remaining area has no significant trend. In contrast, African moist tropical forests are a net gain of carbon (0.02 ± 0.01 PgC year−1) with close to 40% of areas of intact forests showing a net gain at the rate of 0.40 MgC ha−1 year−1, and only 7% with a net loss with an average loss rate of 0.36 MgC ha−1 year−1. Similarly, in Asia, 25% of areas of intact forests accumulated carbon at a rate of 0.35 MgC ha−1 year−1, and 15% of areas showed net carbon loss at a rate of 0.34 MgC ha−1 year−1.
Tropical and subtropical dry forest and savanna shrublands (~54.5 ± 0.6 PgC) are accumulating carbon at a rate of 0.09 ± 0.04 PgC year−1, with relatively larger contributions from Asia and Africa (38). Similarly, woody vegetation in mixed grassland and cropland regions is extensive in area in each continent and despite small biomass density exhibit a net carbon gain of 0.05 ± 0.02 PgC year−1 (Fig. 2B). Increasing contributions of tropical dry forests and woodlands to global carbon balance trends and variability are consistent with top-down approaches (39) and ecosystem models (40).
Northern ecosystems show a net gain of carbon. Temperate (total carbon, ~64.4 ± 0.5 PgC) and boreal ecosystems (total carbon, ~41.3 ± 0.6 PgC) are gaining carbon at a net rate of 0.10 ± 0.03 and 0.04 ± 0.02 PgC year−1, respectively. Across all boreal regions (table S1), forests are accumulating carbon at much higher rate of 0.10 MgC ha−1 year−1 than the ecosystem scale (forest and nonforest), which is comparable to the rate reported from inventory data of about 0.12 MgC ha−1 year−1 (14). These rates also compare to estimates from national inventory data reported by Annex 1 countries at 5-year annual cycles to the United Nations Food and Agriculture Organization (fig. S4). Although net carbon stock changes from northern ecosystems are relatively small, our estimates show large interannual variability (IAV) that is missing from inventory estimates reported by Annex 1 countries (fig. S4).
We report estimates of carbon stock change for 169 countries and regions (larger than 1 million ha) (table S3). During the period of the study, China had the largest net carbon gain in live biomass with an average rate of 0.1 PgC year−1, mostly attributed to the newly established forests in southern China and greater sequestration in existing forest areas (41). The Russian Federation (0.05 PgC year−1), United States (0.04 PgC year−1), and Canada (0.03 PgC year−1) all had net gain of biomass. The European Union, on average, had an insignificant increase in live biomass (0.01 PgC year−1), partially due to increasing forest disturbance in this century (42). Among countries in tropical regions, Brazil and Indonesia had net carbon by 0.05 and 0.03 PgC year−1, respectively, but the countries in the Central Africa (Gabon, Democratic Republic of Congo, and Congo) gained carbon at an average rate of 0.02 PgC year−1 (fig. S5).
To put the net carbon stock changes in the context of the global carbon budget and terrestrial carbon emissions and removals, we estimated annual carbon loss of live biomass from disturbances attributed to forest cover change (e.g., forest clearing), tropical forest degradation, and fires in forests and nonforest ecosystems. Carbon loss from disturbance is estimated for the subgrid cells using the gain-loss approach by multiplying the annual percent area of disturbance by the mean biomass carbon density of forest or nonforest of the year before the disturbance. The carbon loss estimates were adjusted by the emission efficiency factor for each type of disturbance and corrected for the bias factor associated with the large grid cell (see Materials and Methods). The loss of carbon from disturbance may not be released to the atmosphere in the year of loss but delayed for years (e.g., timber products) or transferred to other pools (e.g., live to dead) (43). Nevertheless, here, we refer to the total carbon loss as the committed emissions from live biomass at the year of loss. The difference in accounting method and types of lagged effects not reflected in our estimates of emissions from live biomass may still be approximately correct because of the offset from emissions from disturbances initiated in previous years (21).
Emissions from forest clearing (EFC),estimated annually from Landsat time series (23), contributed 1.7 PgC year−1, on average, to the total carbon loss with an interannual variation of 0.7 PgC (Fig. 3A). Tropical regions contributed 39% of the total carbon loss, followed by 20% from boreal and another 19% from temperate regions. Carbon loss from forest clearing remained approximately constant (1.3 ± 0.2 PgC year−1) for the first decade but increased substantially after 2012, reaching a maximum of 3.1 PgC year−1 in 2017 (Fig. 3D). This trend is mostly due to increases in forest clearing detected by Landsat imagery that occurred in temperate and boreal ecosystems of Eurasia (fig. S6), deforestation in the West and Central Africa, the northwest and southeast regions of the United States, and tropical dry forests such as the Chaco in South America (23).
Emissions from fire (EFire) were estimated separately for forest and nonforest areas using the Moderate Resolution Imaging Spectroradiometer (MODIS) burned area product (44) that do not overlap with the forest clearing. Average global emissions from forest fires were 0.38 PgC year−1 with higher rate during the earlier period (2000–2009) at 0.5 PgC year−1 and declined by about 0.1 PgC year−1 from 2011 to 2019. Emissions from nonforest ecoregions were about 2.1 PgC year−1, with a larger decline from 2.2 PgC year−1 in 2001 to 2009 to 1.9 PgC year−1 for the decade after (2011–2019). Forest fire emissions were dominated by tropical and subtropical woodlands and shrublands (69%) followed by tropical and subtropical moist ecosystems globally (24%). Savanna fires in semi-arid and dry tropical ecosystems contributed 56% of the total fire emissions and were concentrated (59% of emissions) in African woodlands and savanna (38). Our analysis of Landsat forest cover change (30-m resolution) shows that the post-2012 increase in forest loss in tropical moist and dry ecosystems was not always associated with burned areas predicted from MODIS (fig. S6). Emissions from fires in temperate and boreal ecosystems in excess of those attributed to emissions from forest clearing were small (fig. S6, A and B). Estimates of the total fire emissions from our approach are comparable to the global fire emissions database of about 2.2 PgC year−1 averaged for the period of 1997 to 2016 (24). Estimated magnitude of emissions, regional patterns, and trends are also consistent with model-based estimates that are constrained by atmospheric carbon monoxide data (38).
Loss of biomass from forest degradation was considered only in moist and dry tropical forests. In the absence of reliable annual information on forest logging, our estimates included only available data on forest degradation near roads (including some logging roads) and edge effects from deforestation and forest fragmentation (45) and removing burned areas to avoid double counting (see Materials and Methods). Loss of biomass from moist tropical degradation added another 0.29 ± 0.06 PgC year−1 to the total emissions. By including tropical dry forests with widespread degradation from timber harvest and fuel wood extraction, the total emissions from degradation (EFD) increased to 0.41 ± 0.08 PgC year−1 (Fig. 3C).
The total committed emissions from summing all emissions from disturbances (
) was 4.63 ± 0.13 PgC year−1 from 2000 to 2019 (Fig. 4). We expect that this is largely attributable to LULUCF, with additional emissions from the environmental disturbances (e.g., droughts, fires, and storms) that may be mixed in satellite observations of forest clearing and fire (23, 24). Our estimate of total committed emission is 84 to 100% of total gross LULUCF emissions (including all carbon pools) from the GCP book-keeping models, ranging from 4.6 PgC year−1 averaged for 2000 to 2019 [updated from (46)] and 5.5 PgC year−1 averaged from 2009 to 2018 (21). Our approach may also underestimate gross emissions from environmental disturbance because of the spatial resolution (100 km2) of annual biomass maps, potentially offsetting gains and losses within single grid cells. Therefore, the total gross emissions
from our approach include emissions from satellite observed disturbances
plus a residual emission (Eδ) term to account for emissions from other carbon pools and all unaccounted disturbances such as drought-driven tree mortality, defaunation, and grazing (47).
Total carbon gain in live biomass of global terrestrial ecosystems (
) estimated from the difference between the net carbon stock change (∆C) and the total carbon loss (
) resulted to −4.86 ± 0.16 PgC year−1 (Fig. 4). At the grid cells,
may result in a positive value (loss) due to underestimation of
from ignoring Eδ and uncertainty in each term, but the aggregated estimate over large areas always leads to a net gain. As in the case of emissions, we allow gross carbon removals to be the gain from live biomass (
) plus a residual term to represent all unaccounted terms (e.g., soil, dead, and litter) and the potential underestimation due to large grid cells (
Regionally, variations of carbon fluxes show the strong contributions of live biomass changes from tropical moist and dry ecosystems globally (Fig. 4). These ecosystems have large emissions (3.18 ± 0.08 PgC year−1) and large removals (−3.22 ± 0.09 PgC year−1) driven by disturbances and recovery processes. African dry ecosystems captured the largest proportion of this contribution (1.58 ± 0.06 PgC year−1 emissions and −1.59 ± 0.06 PgC year−1 removals) (table S2), mostly due to large-scale and seasonal fire events from land-use activities and widespread woody encroachment and relatively rapid recovery (48). The predominance of tropical dry ecosystems to explain the global carbon uptake is consistent with other remote sensing–based estimates (49) and results from ensemble land surface models (39, 40).
Extratropical ecosystems in temperate and boreal regions together had the next largest carbon removal of −0.87 ± 0.04 PgC year−1 (table S2) (50). The small net gain of carbon in these ecosystems from our analysis is largely due to increased fire and forest clearing disturbances since 2010 (fig. S6). Estimates of carbon removal from our analysis (temperate, −0.52 PgC year−1; and boreal, −0.35 PgC year−1) are comparable to inventory estimates (14), DGVM models (16, 51), and estimates from other satellite observations (52) (Table 2).
For high biomass intact forests, where the uncertainty in biomass estimation may not allow detection of biomass gain, we estimated potential adjustments to the average long-term carbon removals of remaining intact forests using inventory data from different ecosystems (see Materials and Methods). By using the spatial uncertainty map (fig. S7), we identified grid cells with live biomass thresholds of 200 and 100 Mg ha−1 and uncertainty levels that does not allow reliable change detection. We adjusted the intact forest gains for these grid cells by using the area of intact forests at subpixels that resulted in increasing net global carbon gain from −0.23 to about −0.42 PgC year−1 and to −0.88 PgC year−1, respectively, for each biomass threshold. These adjustments only influenced the carbon removals in high biomass areas in moist tropical forests by switching it from neutral or small source (+0.05 PgC year−1) to a sink (−0.12 to −0.31 PgC year−1 for each threshold). The largest contribution of this adjustment was in tropical Americas and temperate forests that changed from −0.10 PgC year−1 to a larger net gain of −0.11 to −0.31 PgC year−1 for each biomass threshold (table S4).
IAV of emissions and removals
Carbon stock changes show large IAV of ±1.7 PgC year−1 globally with ±0.93 PgC year−1 in tropics and ± 1.29 PgC year−1 in extratropics. Note that the regional IAV of carbon fluxes do not follow the same year-to-year patterns and therefore is not additive at the IAV level. Overall, emissions from forest cover change and fire had relatively smaller IAV (±0.4 PgC year−1) (Fig. 5B), while carbon removals showed larger IAV (±1.8 PgC year−1) (Fig. 5C). On average, tropical moist, boreal, and temperate ecosystems had approximately equal contributions to the magnitude and IAV of carbon removals (±0.5 to ±0.7 PgC year−1) in this century. Carbon removals showed large IAV in global semi-arid regions (±0.8 PgC year−1) in agreement with prediction from DGVMs (51). The large IAV of carbon removals is dominated by the postdisturbance recovery of vegetation and biomass gain of intact forests, particularly in tropical regions where vegetation productivity is partially controlled by precipitation and radiation (fig. S8) (51). We found that IAV of emissions from disturbance lacked a strong relationship with the variability of climate except during extreme events (e.g., peatland fire), as these emissions are mostly due to human-induced disturbances.
A comparison of the IAV of stock changes with the national inventory of Annex 1 countries over the past two decades reveals that the variability of carbon stocks may be even larger than our reported numbers (fig. S4, D and E). The IAV of global carbon removal
from our analysis was almost by a factor of two larger than the land-atmosphere exchange estimated from DGVMs (51) and the IAV of atmospheric carbon sink (9). Small IAV of emissions from disturbance is independent of emission factors used in this study and is mostly due to the observations of forest cover change and fire (Fig. 5B). Similar IAV is also evident from land-use emissions derived from the book-keeping models (7). Land-use change and natural disturbances may have large year-to-year variability at the local and small scales, but this variability is in general small at regional and global scales because the occurrence of disturbance in one location is spatially decoupled from other locations, hence causing less variability at larger scales (51). In contrast, the large IAV of live biomass carbon removals (Fig. 5C) indicates that the postdisturbance recovery process and biomass changes of intact ecosystems at local scales may be correlated with similar processes at larger scales due to the influence of environmental factors (i.e., climate and edaphic conditions) on the vegetation carbon uptake.
Live biomass in global carbon budget
Average emissions and removals from our study are not directly comparable to the GCP estimates of land sinks and sources (7). Our approach is based on carbon stock change of live biomass and provides gross emissions and removals without separating the LULUCF and environmental fluxes, while the GCP carbon balance focuses only on net fluxes, separates LULUCF emissions, and includes all carbon pools. To improve comparability, we revised the GCP carbon balance equations into gross emissions and removals as
(1)where the E terms represent emissions from fossil fuel and fossil carbonate emissions (EFuel), gross emissions from LULUCF (ELuc), and from the environmental disturbance (EEnv). The S terms represent gross removals including atmospheric carbon sink in terms of CO2 growth (SAir), net ocean sink (SOcean), and gross removals from land-use change (SLuc) and environmental factors (SEnv). By using
, we replace the gross emissions and sinks from the land part with our estimates. We balance the equation by using 8.7 ∈ [6.9,10.0] PgC year−1 (number in brackets show minimum and maximum during 2001–2018) for EFuel, 2.4 ∈ [1.8,2.7] for SOcean, and 4.6 ∈ [3.3,6.3] PgC year−1 for SAir (7). For the land components, we use 4.6 ∈ [3.7, 5.8] PgC year−1 for
and −4.9 ∈ [−9.1, −2.5] PgC year−1 for
Balancing the budget will provide us with a residual carbon budget imbalance term (δim = Sδ − Eδ) of −1.4 ∈ [−3.8, 3.7] as a net sink of carbon. If we include adjustments for intact forest gain, then the live biomass carbon removal of −5.5 PgC year−1 will result in a budget imbalance of −0.8 PgC year−1. Our estimate of the budget imbalance is larger than GCP budget imbalance (−0.1 PgC year−1) (9) and can be attributed to fluxes associated with other carbon pools (soil, dead, and litter), lateral transport, burial of carbon in freshwater, and other legacy fluxes (Fig. 6).
Comparison with other global estimates
A fair comparison with most previous estimates is not straightforward. We compiled data from several studies that included both regional and global estimates of carbon loss (emission/source), gain (removal/sink), net change, and, if possible, fluxes of live biomass separated from other pools (Table 2). Here, we focus on results from two studies that are based on the forest inventory data (14) and a combination of satellite-based biomass and spatially explicit gain-loss approach (29) and refer to other studies for further clarification. Our 20-year (2000–2019) stock change of live biomass (−0.23 to 0.88 PgC year−1) is substantially larger than the 18-year (1990–2007) inventory-based estimate (+0.27 PgC year−1) [table 3 in (31)] and with a different sign. The average gain of live biomass from our approach compares to the inventory-based net carbon sink of global forests (−0.8 PgC year−1 adjusted for live biomass) (Table 2) that are largely due to the increase in biomass density of remaining intact forests in tropical and temperate regions (14). Apart from the difference in the period of the study and the definitions of gain and loss of carbon, unlike the inventory-based estimates showing net sink in the extratropical regions, our estimates suggest that net carbon sink is almost equally divided in tropical and extratropical regions when we include adjustments for intact forest gain.
Comparison of our estimates with the gain-loss approach for the same period (29) shows close agreement on the magnitude of gross carbon removals but differences in carbon emissions. At the global scale, our estimate of gross carbon removals (4.9 to 5.5 PgC year−1) from all woody vegetation (forest and nonforest) agrees with their estimate (4.36 PgC year−1) for forests (Table 2) (29). However, our estimate of global emissions of 4.6 PgC year−1 is almost twice the emissions from their approach of 2.2 PgC year−1. The difference in emissions is largely due to the inclusion of fire in areas of savanna and woodlands that do not fall in the definition of forests but includes woody vegetation. Harris et al. (29) also used additional information on potential drivers of forest cover change to apply different IPCC emission factors geographically to Landsat-derived forest loss, particularly in boreal and temperate regions, while we used the definition of Landsat forest loss as the total clearing of forest (23) systematically across all forest types and regions. There are also some differences in estimates of carbon removals in temperate forests that can be attributed to the large uncertainty of removal factors used in (29).
As estimates of total emissions from our study are similar in magnitude to recent estimates from the book-keeping models (21) and the net carbon flux of live biomass agrees better with the inventory data (14, 31), we consider that our estimates may be more consistent across biomes. The total carbon removal is also comparable to the sum of uptakes in intact forests of 2.6 PgC year−1 (averaged for 2000–2015) and secondary forests (0.9 to 1.7 PgC year−1) for the same period from a combination of inventory and DGVMs (Table 2) (51, 53). However, because our approaches are spatial in nature and use the same data layers for the forest disturbance, our estimates of gross emissions will converge with (29) if we use the same drivers of forest cover change and emission factors.
Comparison with other remote sensing–based studies suggests that our global and regional net carbon sinks, although smaller, but falls within the range of variability of other estimates (26, 52, 54). Unlike some remote sensing–based estimates that show tropical forests as a large net carbon source to the atmosphere (0.23 to 0.78 PgC year−1) (Table 2), our estimates indicate a small net sink from the stock change analysis. We may have underestimated emissions from degradation due to missing reliable data on selective logging across tropics. Nevertheless, the same problem may exist in other remote sensing approaches. Our average annual estimate of 0.41 PgC year−1 from tropical degradation is in close agreement with the inventory-based estimates from emissions from timber and wood fuel (0.46 PgC year−1) (55).
For boreal and temperate regions, we show a large carbon sink, but our estimate is smaller than other estimates (52). For temperate forests, our estimate of gain 0.18 MgC ha−1 year−1 of intact forests is substantially smaller than the inventory estimate of 0.45 MgC ha−1 year−1 for 2000–2007 (14) but comparable to recent estimates from DGVMs (16). We attribute this difference to notably more loss of carbon from forest cover change and fire in our analysis after 2010 (fig. S6) that is not reported in the inventory data (14) but detected from satellite observations (23, 24). This may be partially due to post-2012 disturbances in boreal and temperate regions of Eurasia that are included in our estimates and from the large uncertainty in detecting the dynamics of forest cover change and the drivers (56).
Over tropical forests, our estimates of carbon gain and loss in intact forests, for regions where our uncertainty is low, are comparable to carbon stock changes from undisturbed plot networks. In the Amazon Basin, our rate of loss of 0.18 MgC ha−1 year−1 is comparable to the rate of loss of 0.17 MgC ha−1 year−1 over 10 years (2000–2010) as a result of increasing tree mortality and decreasing productivity in the field plots (15). In contrast, our rate of gain of 0.19 MgC ha−1 year−1 is smaller than ground plots of 0.28 (0.07 to 0.48) MgC ha−1 year−1 over 20 years (2000–2019) (15, 57). Similar pattern is observed in Africa and Asia. The rate of carbon gain of 0.40 MgC ha−1 year−1 for intact forests in Africa and 0.35 MgC ha−1 year−1 rate for Asia are smaller than the net gain of 0.68 MgC ha−1 year−1 in Africa and 0.45 MgC ha−1 year−1 pan-tropically from a small number of research plots (~278 ha) (57). However, after adjustments for intact forest carbon gain, the average rate of carbon across tropics becomes comparable.
Uncertainty assessments and reductions
We have included multiple sources of errors associated with the remote sensing data analyses and modeling that have been already accounted for through a formal error propagation approach (see Materials and Methods). However, there are four additional sources of errors that may influence the estimates of emissions and removals. (i) The lack of sensitivity of remote sensing data to predict large biomass density forests. We addressed this problem by using large grid cells (~100 km2) for mapping live biomass globally and showed that the estimates at this resolution have low variance and low bias. At this resolution, the predictor variables from microwave and optical measurements had substantially better sensitivity to live biomass (fig. S7). (ii) Underestimation of emissions and removals from offsetting gains and losses in large grid cells. We circumvented this problem by estimating emissions using finer spatial resolution (30 m) forest cover change and fire (500 m) data and the mean of live biomass of grid cells after separating subpixel forest and nonforest biomass estimates. Any potential underestimation of emissions because of the difference in the resolution of carbon stocks and disturbance were adjusted using the systematic error estimate from fine-resolution biomass data (fig. S9). (iii) Potential concerns about endogenous variations of remote sensing observations due to calibration noise, sensor degradation, or environmental effects influencing estimation of temporal dynamics of biomass. Our predictor variables are processed multitemporal measurements such that seasonal and calibration effects (geometry) and potential noise are substantially reduced and there is no long-term drift in the signal of remote sensing data. Any short-term changes of remote sensing data due to moisture, optical properties, or other environmental factors may only introduce random errors at the pixel levels and do not influence annual estimations, regional trends, or interannual variability. These random errors are less important for estimating stock changes due to large-scale disturbance (deforestation, degradation, fire, and droughts) and may only increase the uncertainty of detecting year-to-year changes in areas with large biomass intact forests that were addressed earlier (see Materials and Methods). (iv) Drivers of disturbance affecting carbon emissions. In this study, we avoided the use of any classification of drivers of change and universal emission factors that may be subject to large uncertainty and instead relied on the definition of change from remote sensing observations. We first estimated emissions from forest cover change using pixel-based emission factors and Landsat 30-m product that, by definition, refers to complete clearing of forest cover (23). Next, we included emissions from burned area pixels from MODIS that do not overlap with the forest clearing pixels and applied average published combustion factors (see Materials and Methods). Last, we included carbon loss from tropical forest degradation that do not overlap with forest clearing and fire and used published emission factors for estimating annual emissions.
Global Stocktake and policy relevance
Annual and spatial carbon stock changes from our approach provide new insights to how the global carbon balance may be influenced by divergent regional processes. We have shown that estimates of emissions and removals from live biomass explain about more than 80% of global gross terrestrial fluxes, not only supporting the view that the use of live biomass carbon sequestration can be an important climate mitigation solution but also highlighting the vulnerability of ecosystem carbon storage and sinks. Our results also confirm that the largest gross fluxes occur in tropics, supporting international policies for reducing gross emissions from tropical deforestation and degradation and enhancing gross removals from secondary forests as a priority. The emerging patterns from our spatially explicit analyses are as follows: (i) Northern ecosystems are not as large a net sink of carbon as perceived in the past and the uncertainty in emission and removal factors may be larger than tropical forests; (ii) the often neglected nonforest woody vegetation in open woodlands, savanna, and shrublands has relatively larger contribution to terrestrial carbon sinks and sources than previously known. A major component of this contribution is in the interannual variability of stock changes that point to the dominant role of carbon removals in both moist and dry tropical ecosystems in IAV of global fluxes.
Despite the difficulty in distinguishing between land-use and environmental fluxes, our estimates of annual carbon stock changes and gross committed emissions and removals from live biomass can be directly linked to creating scale-dependent opportunities for climate mitigation solutions (regional versus subnational scales) (21). Now, the most common practice for carbon accounting at the national scale is to combine activity data and emissions and removal factors from IPCC guidelines (Tier 1) or from national level emission factors (Tier 2). Our approach will provide an improved framework to track carbon stock changes at local (10,000-ha area) to subnational scales and by renewing emission factors over time (hybrid Tier 2 and Tier 3) as new observations become available. This approach facilitates systematic and transparent carbon accounting, monitoring, and verification to inform nationally determined commitments under the Paris climate pledges and the forthcoming Global Stocktake in 2023. It also provides a consistent observation-based framework to meet the increasing interests for nongovernment and private entities to implement and monitor carbon offset investments. Our global carbon monitoring system also includes nonforest ecosystems not only to better balance the terrestrial carbon budget but also to allow countries dominated with nonforest ecosystems to develop capacity for carbon accounting and NCSs.
MATERIALS AND METHODS
We mapped live above- and below-ground biomass carbon stocks annually over all woody vegetation globally. Our approach was consistent with the IPCC carbon stock change method for GHG inventory and included spatial quantification of forest emissions and removals through a hybrid Tier 2 and Tier 3 approach (1). Estimation of global live biomass carbon stocks followed several processing, modeling, and estimation steps (fig. S1): (i) Synthesis and integration of a large number of ground inventory plots (>100,000) with airborne and satellite data as a consistent and systematic set of measurements of forest structure and vegetation above-ground live biomass (AGB). We applied models relating the lidar-derived metrics and radar backscatter to AGB estimates from ground plots across global forest types and ecoregions; (ii) development of spatially aggregated samples of woody vegetation AGB mean and variance of AGB at 10-km spatial resolution using satellite and airborne lidar as training data; (iii) estimation of AGB using the training data into spatiotemporal ML models by taking a globally consistent set of optical and microwave satellite data at 10-km spatial resolution as independent variables; (iv) conversion of AGB to below-ground live biomass (BGB) using vegetation specific allometric models (table S5) and carbon density from AGB + BGB multiplied by carbon fraction using IPCC guidelines (1); (v) estimation of net carbon fluxes using annual carbon stock changes and separation of gross carbon loss (committed emissions) and gross carbon gain (removals, as the residual of net carbon changes and committed emissions) using existing data on forest clearing from Landsat 30-m forest cover change (23), MODIS 500-m global burned area across all woody vegetation (44), and estimated tropical forest degradation (45); and (vi) formal error propagation and validation for estimating uncertainty at pixel, national, and global scales.
To define the study region, we used the land cover (LC) map from Collection-6 MODIS LC product (MCD12Q1) and delineated the interested LC (forests and woody nonforest) categories globally (58). To preserve disturbances such as forest clearing, degradation, and fire in their original LC status, we selected the annual MODIS LC map in 2001 as the reference map. To further distinguish the continental effects, we combined the original MODIS LC into continental biome types (fig. S2A). Such a reclassification is more consistent with the World Wildlife Fund ecoregion map spatially, which values the structure and diversity of vegetation types (59). We also generated a combined LC map to focus on the global effects (fig. S2B) of major vegetation types. Both maps were used to produce regional estimates of vegetation biomass and total carbon fluxes. A detailed description of each LC class was given in table S1.
Definitions and scales of analysis
The spatial grid cell for our analysis is 0.1 ° × 0.1°or approximately 10 km at the equator. Within the grid cells, we defined forest extent as areas greater than 30% tree cover and minimum height of 5 m at the mapping unit area of 1 ha using tree cover from 30-m Landsat data for the year 2000 (23). For regions or countries with different definitions of forest, the proportion of forest area and carbon stocks can be readily derived from the gridded products. For the nonforest woody vegetation including sparse woodlands, treed savannas, and shrublands, we used tree cover between 5 and 30%. We used 5% tree cover as a threshold to separate woody vegetation from all other LC types including grasslands, bare area, and settlements.
Footprint-level biomass sampling
We constructed vegetation structure (vertical height profile) samples [~8 million from the spaceborne Geoscience Laser Altimeter System (GLAS) product (60) aboard the Ice, Cloud, and land Elevation Satellite (ICESat), and data from L-band radar backscatter (61) from the ALOS (Advance Land Observation Satellite, “DAICHI”) PALSAR (Phased Array L-band Synthetic Aperture Radar sensor) sensitive to low-density vegetation biomass (100 Mg/ha) (62)]. For forested regions, we designed careful strategies to build lidar-biomass allometric models relating the GLAS-derived Lorey’s height (basal area weighted height) (60) to plot-level AGB for all biome types using valid GLAS shots from 2003 to 2008 that were processed by filtering out noisy shots with large errors (effect of cloud, topography, etc.) (60). A circular footprint with an average size of 0.25 ha (56 m in diameter) was used as a mean effective footprint size for all GLAS lasers. The relationship between AGB and Lorey’s height takes on the form of AGB = αHβ + ε, where H is the Lorey’s height, α and β are fitting parameters, and ε is a residual term. We applied 39 allometric models (table S5) based on plant functional types and developed region-specific models based on plot-level AGB data availability (table S6).
Over tropical forests, we performed the GLAS lidar calibration by using ALS data collected over tropical forests of South America, Central Africa, and Southeast Asia across an area of more than 1 million ha with sample size of 400 to 2000 ha each. For GLAS footprints falling on ALS data, we calibrated values of Lorey’s height to estimated AGB from tropical airborne campaigns. The ALS-based allometric models were estimated from numerous research plots spread across South America, Africa, and Southeast Asia and documented in related studies (25, 30, 63–65). These research plots span various forest types including old growth tropical moist and wet forests, woodland savanna, dry forests, peat swamp and wetland forest, and forests recovering from disturbances. In addition to a much larger number of plots and ALS data for calibration of GLAS data, we also included regional variations of average wood density of plots as a weighting factor in the model to make sure that the height-biomass models are all adjusted with regional differences in tree wood density (25, 30, 63, 64).
In temperate and boreal forests, we were able to use field inventory data to develop models based on groups of plant functional types. In North America, allometric models were developed by the U.S. Forest Inventory and Analysis (FIA) program of the U.S. Forest Service, inventory data from Mexico, and the data available across the Canadian forestlands (table S6) (66, 67). In the United States, the FIA program has more than 120,000 permanent plots across temperate and boreal (Alaska) ecosystems in both forests and nonforest landscapes and a wide range of biomass and vegetation types. Detailed information about the FIA plots is available publicly (67). Similarly inventory data distributed globally were used for different forest types (table S6). For dry broadleaf and tropical conifer forests, we used forest inventory data in Mexico. For boreal forests of Eurasia, we divided the models for western (northern Europe) and eastern (Russian Siberia) using plots from Norway and inventory data from the State Forest Account of Russia, respectively. For broadleaf deciduous and mixed forests of China, we used 60 field plots to develop the models. These models were then used to estimate AGB from GLAS-based Lorey’s height. For those regional vegetation types that we could not find ground plots to develop lidar-biomass models, we used models for similar vegetation types in other regions or continents.
For nonforest regions where GLAS footprints had no coverage, we opted to use random AGB samples estimated from a combination of ground plots, airborne lidar, and ALOS PALSAR data. The ALOS PALSAR backscatter products are available with global coverage for the periods of 2007 to 2010 and 2015 to 2018. The L-band radar backscatter is strongly correlated with vegetation biomass when AGB is less than 100 to 150 Mg ha−1 depending on vegetation types (62), and we have developed a suite of models between ALOS backscatter and AGB using ground and airborne lidar data (62).
Grid cell biomass estimates
Assuming that GLAS-derived lidar samples (footprints) contain independent information from shot to shot, the grid cell mean and variance can be computed by including enough GLAS shots in each grid cells (68). All grid cells with estimates of mean biomass with at least 25 GLAS footprints were used as reference data for the next step of mapping biomass. The aggregation of pixel-level AGB at 10-km pixel can be biased if we take an immediate sampling mean from all available footprint-level AGB. This bias is due to unbalanced sampling of GLAS shots (marked as valid only over forest canopies) within the 10-km pixels because a pixel can include both forests and nonforests and GLAS undersamples the nonforest areas. The unbalanced sampling will introduce overestimations of pixel-level AGB. Here, we used the 250-m-resolution MODIS Vegetation Continuous Fields (VCF) product to approximate the percent tree cover at both the footprint level and pixel level (at 10-km resolution) to correct the AGB estimates
(2)where VCFpixel is the spatial mean VCF at 10-km spatial resolution from the original MODIS VCF (250 m),
is the average VCF of all locations of valid GLAS shots, and
is the AGB averaged over all shot-level AGB within each valid 10-km pixel. The ratio of VCF is set to no larger than 1.2 (derived from a set of detailed analysis of samples with high-resolution data) such that the correction of pixel-level AGB will not be overestimated. The processed GLAS data gave us a global training dataset of pixel-level AGB over forests and most woody savannas (fig. S10A).
The use of a minimum of 25 lidar samples (~0.25 ha) within each 10-km grid cell allows estimation of mean biomass at the landscape scale with small uncertainties (22). The minimum area of 10-km grid cells (100 km2) is designed to have a larger sample size to estimate AGB (fig. S10B). Our methodology mimics the approach used in forest inventory ground sampling that allows design-based estimates of mean and variance of biomass at the landscape scale (22). At the native resolution of the remote sensing pixels (e.g., 500 m), the estimate of AGB will be from only one or two GLAS shots, thereby introducing large uncertainty in AGB estimate and create noisy (large random error) and, in most cases, AGB estimates with systematic errors. By using larger pixels, we will increase the number of GLAS shots as inventory samples to estimate the mean of the population and hence reduce the uncertainty of AGB used as training data. All AGB estimates in grid cells with more than 25 samples are used to train the remote sensing data in the biomass prediction process and hence produce carbon maps with reduced uncertainty at the pixel level. Note that at 10-km grid cells, the range of mean AGB is much smaller due to the LC heterogeneity. For nonforest regions, we selected samples of ALOS-derived AGB that matches the sampling density of GLAS data following a spatially random selection scheme (fig. S10). Since the ALOS-derived samples are only sensitive to small biomass vegetation (62), we set the upper threshold of ALOS-derived AGB to be 75 Mg/ha at 10-km resolution. The 10-km training dataset also varies year to year by matching the year of GLAS data acquisition with the remote sensing data layers.
Satellite data processing
We processed wall-to-wall time series of satellite observations from multiple sensors at different spatial resolutions to develop a globally consistent average spectral information at 10-km grid cells from 2000 to 2019. The three major satellite observations that have annual time series are the MODIS Collection-6 Nadir BRDF-Adjusted Reflectance (NBAR) data (product name: MCD43A4 v006), the day and night MODIS Land Surface Temperature (LST) data (product name: MOD11A2 v006), and the radar backscatter data at H-polarization from ascending passes of the SeaWinds Scatterometer on QuikSCAT (QSCAT HA). To further capture the growing season in the northern lands and seasonality in the tropics, we added additional layers of growing-season averages (April to October) and annual SD from all three products. Other time-invariant (fixed) layers, including the long-term means of all multitemporal variables; the bioclimatic variables from the WorldClim climate database (69), including the annual mean temperature, temperature seasonality, mean temperature of coldest quarter, annual precipitation, precipitation seasonality, precipitation of wettest quarter, and precipitation of driest quarter; and the global digital elevation model from the Shuttle Radar Topography Mission (SRTM) product (70) were also processed to capture the long-term climatology, general climate conditions, and topographical features.
Time series data from satellite observations may have missing data points regionally or at pixels due to lack of observations or data quality and cloud contamination. During our study period, all of time series datasets have the least missing data from 2003 to 2009, except a few missing values in MODIS LST and NBAR due to quality issues. We thus used this time period as the base period for gap filling models. Using all the complete cases, we applied ML models using boosted trees (71) and imputed missing data by inferring the information for missing pixels from other layers. We used all data from 2003 to 2008 as training and data in 2009 as independent validations. Predictions of annual mean data have high accuracies [R2 (coefficient of determination) > 0.95 for all cases], while the estimates of annual SDs are slightly less accurate (R2 > 0.8 for most cases).
For QuikSCAT data, the original products have observations extending from 2000 to 2009. Therefore, we filled the gaps between 2010 and 2019 using the same gap-filling method. Besides using inference from existing layers as we processed for other datasets, we also included additional layers from (i) GRACE data, which provides a global half-degree latitude-longitude grid of monthly anomalies in equivalent water thickness relative to the baseline average over January 2004 to December 2009 (72), and (ii) TRMM data, which combines rainfall estimates from TRMM and other satellites and provides gridded rain gauge data at the monthly frequency and 0.25° × 0.25° spatial resolution starting from 1998. Aiming at improving the reliability of predictions for these missing years of QSCAT, the temporal signals gained from additional layers helped to reconstruct QSCAT data with comparable accuracies to other layers that contain missing data.
We used AGB training data at the grid cells within the period of GLAS campaign (2003–2008), and airborne lidar data acquired from 2010 to 2017 in different regions across the tropics to build a model to spatially map AGB at the designed 10-km grid cells globally. The corresponding satellite observations were extracted for each AGB training grid cells and for the year AGB observation. For small biomass density regions that lacked GLAS data, we used available airborne lidar data and ALOS-derived AGB, the satellite data layers were averaged to match the observational period of ALS and ALOS-derived AGB.
Globally, we obtained 87,528 pixels of AGB training data, with 55% of them from GLAS and the rest from ALS and ALOS-derived data (fig. S10). We treated all training data as pooled cross-sectional data for modeling to increase the amount of multitemporal training data. Assuming that the spatial modeling unbiasedly predicts AGB variations spatially and temporally, we trained AGB pixel samples using the random forest (RF) model (73) with correction for estimated bias (30) and predicted AGB over the entire study period from 2000 to 2019 from the preprocessed satellite data layers including MODIS LST, NBAR, QSCAT HA, and other time-invariant layers.
The spatial modeling approach is based on the RF algorithm as an ensemble of decision trees trained from randomly selected subset features and random sampling of the training set using bagging method (73). RF can be a regression method when using regression trees, and for the jth regression tree, the regression model can be built as
(3)where x ∈ X is the bagged samples of the training set and fj( ∙ ) is the nonparametric function determined by the jth regression tree. The final prediction of RF regression is the unweighted average of the collection of trees
This averaging process creates results that are skewed toward the sample mean, and large/small values of y are often underestimated/overestimated. In our study, we modified the bootstrap method for correcting the estimated bias and implemented a second RF run to correct the estimated biases. The additional correction (30) on classical ML is an effective treatment for regression dilution issues, and the correction term can be written as
is the corrected term and the residual term
is the difference between the original y and the prediction
, which can be estimated from out-of-bag (or cross-validated) residuals of training data. The final corrected prediction can then be expressed as
The correction term tries to capture the systematic regression errors of the original RF by estimating the new metric (
) that is further skewed in the opposite direction of the original observation y. Such a correction model is an unbiased estimator of the interested variable when noise exists in the independent variables but has a small amount of sacrifice on prediction accuracy (30). We consider the spatiotemporal training of the RF model and the bias correction as a continuously improving (self-improving) estimation approach suitable for developing long time series AGB maps. When we add spatial and temporal observations such as from the new lidar data from NASA’s Global Ecosystem Dynamic Investigation mission or national inventory data, the improved ML model will run the entire time series and produce biomass maps with reduced uncertainty across the entire spatial domain.
From pixel level AGB estimates from 2000 to 2019, we further estimated BGB using the root-to-shoot ratio developed from measurements in different woody vegetation types globally (table S5). We used 19 root-to-shoot models for all forests and shrublands, and in the absence of data for any vegetation types, we used the IPCC default values (25). With the knowledge of AGB and BGB, we estimated the total carbon stock of live vegetation as follows
(7)where the factor F = 0.49 is the average fraction of carbon contained in the dry biomass (25).
Carbon flux estimation
Given the annual carbon stock estimates, we were able to estimate the net flux of carbon from live biomass as the first difference of annual carbon sequence (ΔCnet, yr2 = Cyr1 − Cyr2) for the years 2001 to 2019. This annual change follows the IPCC guidelines for carbon stock change at the pixel scale rather than at the national scale. The net change of biomass is from a variety of processes, including biomass loss from forest disturbance due to land-use and environmental factors and biomass gain from postdisturbance recovery, intact forest gain, and other exchanges that may occur between different carbon pools. If forest gain offsets forest loss within a pixel, then the net carbon stock change at the pixel level will be negligible. Therefore, this methodology will not provide estimates of gross carbon fluxes.
To estimate gross carbon fluxes, we included subpixel information from finer-resolution remote sensing on forest cover change (forest clearing and degradation) and forest and nonforest fire. Throughout the paper, we avoided using the land-use terminology such as deforestation for forest clearing as large areas of forest clearing in the north are used for timber harvesting and the forest remains forest through this process. The steps to estimate the gross annual carbon loss (committed emissions) and the annual carbon gain (gross removals) are shown below.
We used the Landsat-based global forest change product (23) at its original resolution (~30 m), calculated the number of pixels that showed forest loss in each year using the loss-year data layer, and spatially aggregated the layers at the original resolution to obtain the percentage of cleared area (PCA) at each 10-km pixel from 2001 to 2019. Note that forest clearing pixels may be attributed to areas of timber harvest (a large proportion of the forest clearing in temperate and subtropical regions) and other types of disturbance such as storms, insect, and droughts that may introduce a lagged effect in the emissions to the atmosphere.
Degradation in tropical moist and dry forests
We developed annual estimates of degradation using data from the pantropical degradation project detecting intact forest loss between 2000 and 2013 (45). We set the degradation fraction for each 10-km pixel calculated from (45) as the training data and applied a ML model using boosted trees (71) to extend the available degradation data annually over the entire time series. For the model training and predictions, we used the same predictor layers used for biomass mapping, with the exception of using only the differences of data layers from 2 years to improve detection changes of forest degradation between 2 years. The cross-validated results show that the predicted degradation has a high correlation (R2 = 0.77) with the original input data, and the prediction included the percentage of degraded area (PDA) annually from 2001 to 2019.
Forest and nonforest fire
We aggregated the MODIS burned area (product name: MCD64A1 v006) product to 10-km resolution by keeping the percentage of burned area (PBA) in each pixel from 2001 to 2019. If there was a forest-clearing event occurring in the same pixel that showed positive BA at the MODIS resolution (500 m), we treated it as forest clearing instead of fire to avoid double counting the carbon loss from fire and forest loss. Using Landsat-based forest cover change over MODIS burned area for overlap pixels is based on the finer resolution (30 m) of Landsat pixels and the definition of cover change being complete clearing of the forest. Such a correction may result in a conservative estimation of fire emissions but avoids possible overestimation as the committed emission factor of forest clearing can be much higher. We also performed the test showing that there is only a small fraction of pixels having both forest clearing and fire events (fig. S6D). On average, only 17% of the forest clearing area had coexisting fire events from 2001 to 2019, which translates to 12% of the committed carbon emission from forest clearing. The overlap of forest clearing from Landsat and burned area also suggests that any error in confusing the partial clearing from fire with the total clearing in Landsat forest cover change algorithm may cause overestimation of emissions particularly in northern boreal regions.
Emission efficiency factors
To estimate the committed emissions of carbon from forest clearing, degradation, and fire events, we used the emission efficiency factors from literature (5, 24, 45) for forest clearing (fC = 1), degradation (fD = 0.07), and fire (fB = 0.3). To avoid double counting, annual area of disturbance for each grid cell was calculated by first including forest clearing, then fire area not overlapped by forest clearing, and then degradation area not overlapped by clearing and fire. We define the emission efficiency factor as the fraction of committed emission during the disturbance event versus the total carbon available within the vegetation being burned/cleared. We added 10% variations around the mean emission efficiency factors to introduce uncertainty in forest clearing, degradation severity, and fire combustion factors, as disturbances can be highly variable even for a single event (74). The total emissions from forest clearing (EFC), tropical forest degradation (EFD), and fire (EFire) were then estimated using the bottom-up modeling approximation
(8)where Ci is the total live biomass carbon from the annual carbon stock mapping for pixel i, PCA (or PDA and PBA) represents percent of disturbed area within the 10-km pixels, and EFC (or EFC and EFire) represents the emission from forest clearing (or degradation, fire) for the corresponding year. The estimation of fire emission has a major caveat at 10-km resolution in mixed forest/nonforest regions for uneven distribution of fire events. We therefore separated each 10-km pixel into two fractions—forest and non-forest fractions—and denoted the forest (or nonforest) carbon within each pixel as CF (or CNF). In this case, the fire emission for each pixel can be redistributed as
The fractions of PBAF and PBANF were estimated using the MODIS VCF product for each year, in which the percent tree cover layer stores information about the forest and nonforest fractions. For the allocation of carbon in CF and CNF, we first obtained the ratio R0 = CF/CNF from the existing fine-resolution global carbon stock map (25), and with the knowledge of carbon stock Ci for the 10-km pixel i in each year, we have
The separation of forest and nonforest regions provides more accurate estimates of fire emissions based on the fact that more fires (at a finer resolution) tend to occur in nonforest regions. If such a mixed-pixel effect is not considered, then the fire emissions can be overestimated.
Adjusting carbon changes in intact forests
Our estimates of carbon stocks have no systematic error or underestimation of large biomass forests because of large grid cells. However, the existing pixel-level random error of biomass prediction may not capture the very slow biomass gain and IAV of biomass in these forests. Especially in large biomass forests, the pixel-level uncertainty is relatively higher than small biomass regions (fig. S7). For these intact forests, we directly apply the biome-level growth rates for biomass carbon based on a combination of existing inventory data from temperate and boreal regions, research plots across tropics, and the recently revised IPCC guidelines (1, 29) to adjust long-term estimates of carbon removals. The definition of intact forests includes two scenarios: (i) forested regions with pixel-level AGB density larger than 200 Mg ha−1 and (ii) forested regions with pixel-level AGB density larger than 200 Mg ha−1. For both scenarios, we removed the area fractions that experienced disturbance events, including deforestation, fire, and degradation. The remaining forested areas in each scenario were regarded as intact forests, and we applied the region-specific growth rates to obtain the long-term estimates of carbon changes (table S4).
Our uncertainty assessment of the carbon stocks and gross sources and sinks accounts for errors in the data and models and propagating errors spatially and for large-scale inferences. There are various sources of uncertainty in our mapping process and vegetation carbon estimates (table S7), including the lidar-biomass model estimation that may be developed directly from plot data or airborne lidar and plot data including the uncertainty of estimating below ground from the AGB (model A), spatial mapping estimates using the ML model (model B), C stock change estimates from the time-series analysis (model C), and estimation of C emission and removal fluxes (model D). Models A, B, and C all take the general form of equation for estimation
(11)where for each pixel i, the estimation yi is dependent on the model f(…) with predictor variables Xi and model parameter set β, and εi is the residual term representing the difference between the prediction and the observation. For pixel-level uncertainty, it includes the sources of error related to model residual variance
, parameter estimation error variance
, and data uncertainty (measurement error) from predictor variables
Assuming that predictor variables from remote sensing measurements have unknown but negligible errors (calibration of remote sensing data at the spatial scale of our study can be ignored), we focus on estimating the first two variance terms in Eq. 12. The residual variance
can be modeled using the following approximation that considers heteroscedasticity
(13)where θ is the model parameter to be estimated from the cross-validation process for the comparison of training data to model predictions. Model A usually reports this uncertainty only, and θ (relative uncertainty with respect to the actual estimation) ranges from 5 to 90% (table S5) with an average error of the order of 10% (14). To investigate the carbon changes detected from satellite signals, we do not include this model A uncertainty in our reported numbers and treat the existing residual (difference between the measurement and estimation) as a consistent bias of the target variable (lidar samples) in model B. The residual variance of model B is one of the direct outputs of regression ensemble (73), and we are able to produce pixel-level error maps annually derived from the model residuals.
The model-related uncertainty
is due to the sampling variability from available training data, which would produce different sets of model parameter estimates that lead to the uncertainty of the machine learning model itself. Model A, because of the small sample size of plot-level estimates, has limited information on this uncertainty. Model B, as the nonparametric model, can use the bootstrapping procedure to estimate the sample-induced modeling uncertainty. RF model is essentially a bootstrapping model in which each decision tree generates bootstrapped samples to build decision rules (73). With the minimal modification of the existing package, we can estimate the model uncertainty at both the pixel scale and regional scale (22). In our study, the interested variable in model C is the slope of linear regression over time. Therefore, model parameter uncertainty is the term that we care to solve. Inheriting the bootstrapped samples built in model B, we propagate the uncertainty established in model B to estimate the parameter uncertainty in model C.
The scaling of pixel-level estimates to regional means needs considerations of pixel-level covariance of both the modeling uncertainty and the residual term (22)
(14)where MSE is the mean squared error of the regional mean estimate from a total of N pixels. The two covariance terms are sources of errors related to model residuals and model predictions. The residual variance and covariance (components related to
) are related to pixel-level prediction residuals and associated spatial autocorrelations for regional estimates, and the prediction-related (
) covariance term can be readily estimated using bootstrapping procedure to account for both the model parameter uncertainty and the corresponding covariance between pixel predictions.
In addition to these algorithmic uncertainty estimates, there are several sources of error that may influence the results of our analysis and are addressed in Discussion. These include the potential uncertainty in detecting forest biomass gain in intact high biomass forests (e.g., moist tropical forests), offsetting gain and loss in large grid cells, and potential influence of environmental effects on the interannual variability of carbon stock changes, particularly in areas of high biomass intact forests where biomass changes are small.