Plot size matters: Toward comparable species richness estimates across plot‐based inventories

Abstract To understand the state and trends in biodiversity beyond the scope of monitoring programs, biodiversity indicators must be comparable across inventories. Species richness (SR) is one of the most widely used biodiversity indicators. However, as SR increases with the size of the area sampled, inventories using different plot sizes are hardly comparable. This study aims at producing a methodological framework that enables SR comparisons across plot‐based inventories with differing plot sizes. We used National Forest Inventory (NFI) data from Norway, Slovakia, Spain, and Switzerland to build sample‐based rarefaction curves by randomly incrementally aggregating plots, representing the relationship between SR and sampled area. As aggregated plots can be far apart and subject to different environmental conditions, we estimated the amount of environmental heterogeneity (EH) introduced in the aggregation process. By correcting for this EH, we produced adjusted rarefaction curves mimicking the sampling of environmentally homogeneous forest stands, thus reducing the effect of plot size and enabling reliable SR comparisons between inventories. Models were built using the Conway–Maxell–Poisson distribution to account for the underdispersed SR data. Our method successfully corrected for the EH introduced during the aggregation process in all countries, with better performances in Norway and Switzerland. We further found that SR comparisons across countries based on the country‐specific NFI plot sizes are misleading, and that our approach offers an opportunity to harmonize pan‐European SR monitoring. Our method provides reliable and comparable SR estimates for inventories that use different plot sizes. Our approach can be applied to any plot‐based inventory and count data other than SR, thus allowing a more comprehensive assessment of biodiversity across various scales and ecosystems.


| INTRODUC TI ON
As human activities continue to trigger rapid climate change as well as high biodiversity turnover and extinction rates, monitoring biodiversity has become a high priority for the scientific community (IPBES, 2019). The many monitoring programs that were developed over the last decades at local-to-national scales provide valuable information on the state and trends of biodiversity.
Merging these existing datasets and comparing outputs beyond the local or national scale are essential to study biodiversity in a more comprehensive way in order to reliably inform decision and policy makers (Lengyel et al., 2008;Winter et al., 2008). In fact, producing robust outputs from merged datasets is a key objective of main European policy strategies (European Commission, 2021).
However, it is often hampered by the fact that sampling designs, field protocols, and estimation procedures can differ strongly from one inventory to another. Thus, comparability of the many existing datasets constitutes a major challenge. For example, sampling designs do not always follow a spatially systematic (gridded) design, can have different inclusion probabilities, or different plot sizes. Tackling these differences is challenging but important because sampling designs can strongly affect derived biodiversity indices.
Disregarding sampling design effects can thus lead to deceptive results. With an increasing demand for comparable information on biodiversity, the need for developing methods allowing inference across inventories becomes urgent.
Species richness (SR), here referring to the number of species occurring in a given area, is an intuitive diversity index commonly used as a proxy for biodiversity components. It is a highly important ecological indicator shown to be a key driver of ecosystems' resilience (Oliver et al., 2015). For instance, forests with higher SR suffer less from the impact of disturbances and are thus better able to retain their carbon stocks, which is a crucial mitigation strategy in the face of climate change (Guyot et al., 2016;Silva Pedro et al., 2015). It is, however, surprisingly difficult to provide robust estimates of SR from plot-based monitoring data (Gotelli & Colwell, 2011). The size of monitored plots directly affects how many species can be found (Hill et al., 1994) because SR increases non-linearly with the area of the sampling units (Dengler, 2009;Gotelli & Colwell, 2001). For this reason, SR estimates should always be provided along with the corresponding plot size. Problems arise when, for instance, one wishes to compare the tree SR of two different countries, but one country uses 250 m² forest plots as sampling units while the other relies on 500 m² plots. A lower reported mean SR across all plots in the first country may be an ecological fact, or it may instead be rooted in the smaller plot size.
Avoiding such uncertainties is utterly important because political decisions regarding biodiversity are based on indices such as the mean SR, although they are often not directly comparable across inventories. A simple solution to the problem exists when the location of each sampled individual is known within the plots. In this case, the size of larger plots can artificially be decreased to match the size of the smallest plots. Following our previous example, the size of the second country's plots could be artificially decreased from 500 to 250 m², allowing for a direct comparison of its SR to that of the first country. However, not only is the location of every individual rarely available in monitoring programs, but this process leads to an immense information loss: larger plots would always need to be downscaled to the smallest sampled plot size. In this context, obtaining reliable comparisons of SR between inventories using different plot sizes must involve upscaling SR to larger areas. We suggest estimating species-area relationships from rarefaction curves built on aggregated plots as a robust approach to do so, allowing in our previous example to compare the tree SR of both countries for the same given area.
Species-area relationships are a well-established concept in ecology and biodiversity science (e.g., Dengler, 2009;Gotelli & Colwell, 2001;Stein et al., 2014;Tittensor et al., 2007). They are commonly built from a nested design, where the area of a plot is gradually increased in a continuous manner and SR is calculated accordingly (Dengler et al., 2020;Gotelli & Colwell, 2001). In inventories relying on a network of independent sample plots, increasing area is achieved by aggregating non-contiguous plots. The relationship between the area of aggregated plots and their corresponding SR is referred to as sample-based rarefaction curves (Crist & Veech, 2006;Gotelli & Colwell, 2001;Steinmann et al., 2011). Plot size determines how many plots must be aggregated to reach a given area.
In our previous example, the mean SR of the second country for an area of 500 m² simply corresponds to the mean SR across all individual sampled plots. However, it corresponds in the first country to the mean SR of many combinations of two randomly aggregated plots. It is crucial to note that any two plots of the first country that together make up the 500 m 2 can potentially be located in very different habitats and subject to different environmental conditions. This makes it likely to find different sets of species, generally leading to a higher combined SR in two different plots of smaller size than in one homogeneous plot of larger size. In other words, the smaller single plots are, the more different environments are likely to be sampled for a given aggregated area, and the faster SR accumulates with area in rarefaction curves (Gotelli & Colwell, 2001;Steinmann et al., 2011). As a consequence, this prevents direct comparisons of rarefaction curves across inventories that use different plot sizes.
Here, we hypothesize that we can produce comparable, adjusted rarefaction curves by controling for what drives the difference in species found in aggregated plots: environmental heterogeneity and spatial configuration.

T A X O N O M Y C L A S S I F I C A T I O N
Applied ecology; Biodiversity ecology; Theorectical ecology Environmental heterogeneity (EH) is a key determinant of the shape of sample-based rarefaction curves. Simply because it is likely that a wide range of environmental conditions is encountered when aggregating independent plots, the probability for additional species to occur rises (Drakare et al., 2006;Stein et al., 2014;Steinmann et al., 2011). Plots located in regions that differ in terms of climate, topography, soil conditions, or vegetation structure are likely to also differ in terms of species. While there is a growing consensus that EH across aggregated plots increases SR (Stein et al., 2014), disentangling this effect from the effect of plot size remains challenging (Steinmann et al., 2011). The second important aspect that affects the shape of rarefaction curves is the spatial configuration of aggregated plots, i.e., the geographic distance between plots that have been aggregated and the resulting spatial extent of the total area that has been sampled (Güler et al., 2016). This can represent legacy effects such as demographic processes, dispersal limitation, colonization probabilities or speciation, and extinction processes that all affect the capacity of a given species to occur at a given location (Drakare et al., 2006).
In order to compare SR across inventories that use different plot sizes, the key challenge is therefore to develop a method accounting for EH and the spatial configuration of aggregated plots when building rarefaction curves.
In this study, we propose a methodological framework for modeling SR as a function of area, EH, and spatial configuration, thus accounting for a variety of ecological factors known to affect SR ( Figure 1). Our approach artificially removes the plot size-dependent effects of EH and spatial configuration that are introduced when aggregating plots. The resulting adjusted rarefaction curves are then independent of plot size and represent the relationship between SR and area as if the aggregated plots were subject to similar environmental conditions. In other words, the proposed methodological framework mimics a situation where species were recorded from a single, large, environmentally homogeneous plot. Although investigating the effect of EH on rarefaction curves would be another interesting avenue itself, here we focus on developing a method to obtain reliable comparisons of SR estimates between inventories with different plot sizes. The wealth of data collected within National Forest Inventories (NFIs) and the growing relevance of the NFI community for large-scale biodiversity reporting provide a valuable opportunity to develop and test this approach (Corona et al., 2011;Vidal, Alberdi, Redmond, et al., 2016). NFIs are conducted in most European countries (Tomppo et al., 2010;Vidal, Alberdi, Redmond, et al., 2016), but each differs with respect to sampling design settings such as plot size, field protocols, or estimation procedures Winter et al., 2008). We gathered NFI tree species occurrence data from Norway, Slovakia, Spain, and Switzerland, covering a wide ecological and bioclimatic gradient extending over seven of the ten European biogeographical regions (European Environment Agency, 2016). NFI data are also a rare case where the location of individuals within plots is known, bringing a unique opportunity to validate our approach, which is essential to demonstrate its general applicability for other monitoring programs and their data.

| National Forest Inventory data
Most NFIs consist of country-specific networks of independent, systematically distributed sample plots and are representative of the forested area of each country (Tomppo et al., 2010;Vidal, Alberdi, Hernández Mateo, 2016). As our approach, based on NFI data from Norway, Slovakia, Spain, and Switzerland, relies on sample plot size, we excluded all NFI plots for which size was reduced by a road, a river, or any other natural or anthropogenic barrier. In Spain, only plots located in the Peninsula were used. The total number of selected plots and species in each country is presented in Table 1. We defined SR at the plot level as the number of tree species recorded in a plot.
We performed further data harmonization steps to control for the main sampling design differences between countries. Some NFI plot configurations are based on concentric circles associated with different diameter at breast height (DBH) thresholds (e.g., Spain, Slovakia, and Switzerland), on fixed-area plots (e.g., Norway) or on angle count sampling. To minimize the effect of DBH thresholding and to base our analyses on fixed areas, we selected only one circle in the NFIs that use concentric circles. In Switzerland and Slovakia, we retained the 200 and 500 m² circles, respectively, in both cases associated with a DBH threshold of 12 cm. In Spain, we used the 10-m-radius circle (314 m², DBH threshold = 12.5 cm) that we cropped at 300 m² to facilitate subsequent analyses requiring 50 m² increments. In Norway (plot size = 250 m²), we removed trees with a DBH <12 cm to harmonize the DBH threshold throughout the countries. Note that NFI data used in this study are not necessarily the ones used for international reporting, such as in the State of Europe's Forests report (FOREST EUROPE, 2020). All analyses were performed in R.3.6.3 (R Core Team, 2020).

| Building sample-based rarefaction curves (Figure 1a,b)
We used these NFI data to build sample-based rarefaction curves (Gotelli & Colwell, 2001, representing the relationship between SR of aggregated plots and area. To this end, we randomly sampled and aggregated independent NFI plots within each country ( Figure 1a) to incrementally increase area from that of a single plot to a maximum of 10,000 m² (1 ha). We did not limit the spatial extent of the aggregated plots to smaller regions within countries as the goal was to compare the SR of whole countries, as per the State of Europe's Forest report (FOREST EUROPE, 2020). As plot size differed between countries, the number of plots required to reach 1 ha varied between countries (from 20 in Slovakia to 50 in Switzerland).
Aggregated plots are hereafter referred to as "mega-plots." Contrary to Slovakia and Switzerland where NFI plots are evenly distributed on an equidistant grid, the grid resolution of Norway (Breidenbach et al., 2020) and Spain (Alberdi et al., 2017) differs between regions (i.e., sampling strata). In these countries, the random

Methodological framework
sampling for generating mega-plots was adjusted to account for region-specific sampling intensity, such that the chances of a plot to be drawn in the aggregation process were reciprocally proportional to the grid density. That means, plots in a stratum with a wider sampling grid had a larger chance to be drawn than plots in a stratum with a denser sampling grid. The random sampling was repeated 500 times, resulting for each country in 500 mega-plots per aggregation step ( Figure 1a). We calculated the SR of mega-plots at each step.
Mega-plots were used to build country-specific sample-based rarefaction curves (Figure 1b), representing the relation between the average SR over the 500 equal-sized mega-plots, and the size of these mega-plots (Crist & Veech, 2006;Gotelli & Colwell, 2001).

| Quantifying the EH contained in mega-plots (Figure 1c)
We used a broad set of environmental variables representing climate, topography, soil, and stand structure to cover the main factors shaping spatial patterns of SR (Field et al., 2009). Climate data were downloaded from Karger et al. (2017). We included mean annual precipitation and mean growing season temperature based on monthly data for the years 1979-2013. Topography variables were derived from a pan-European digital elevation model (DEM) with a spatial resolution of 25 m (EU-DEM, 2018). We considered slope inclination and Beers aspect, the latter being an ecologically meaningful index ranging from 0 (southwestern slopes, warm) to 2 (northeastern slopes, cold). Beers aspect was calculated from aspect shifted by a 45 degrees angle (Beers et al., 1966): Soil properties were represented by topsoil pH. For Spain and Slovakia, we downloaded topsoil pH data from the European Soil Database, containing maps of topsoil properties based on the Land Use and Cover Area frame Survey (LUCAS) (Ballabio et al., 2019). This database does not provide data for Norway and Switzerland, for which we downloaded topsoil pH from SoilGrids, a global grid of soil information (Hengl et al., 2017). We further used basal area per hectare (BA) as an indicator of forest stand structure, representative of attributes such as stand density, light conditions, and competition at the local scale.
EH measures were derived from these variables for each megaplot to depict the differences in environmental conditions representative of climate (H clim , described by annual precipitation and mean growing season temperature), topography (H topo , described by slope and Beers aspect), soil (H soil , described by pH), and BA (H BA ) across the aggregated plots. EH measures were calculated in three steps.
First, the scaled value of each environmental variable was extracted for each plot. Second, within each mega-plot and independently for each EH measure (i.e., H clim , H topo , H soil , and H BA ), we calculated the pairwise Euclidean distances between the values of the environmental variables of each pair of aggregated plots (Figure 1c). The number of pairwise distances for a mega-plot made of n aggregated plots is (n * (n − 1))/2. Pairwise distances were calculated as: where dist i,j represents the Euclidean distance between the scaled values of environmental variables 1 and 2 (e.g., precipitation and temperature for H clim ) of plot i and plot j. For H soil and H BA that both only contain one variable (soil pH and BA, respectively), the same formula using only Variable1 was applied. Third, the final EH measures of each mega-plot k were calculated as the mean of all pairwise Euclidean distances between the n plots aggregated: The larger an EH measure of a given mega-plot is, the more environmentally different the aggregated plots composing this mega-plot are. An additional EH measure representing the spatial configuration of aggregated plots was calculated to reflect that species composition does not depend solely on climate, topography, soil, and BA conditions. First, we calculated H geo using latitude and longitude as input variables in Equation (2), representing the geographic distance between aggregated plots. This geographic distance, in addition to controling for spatial autocorrelation in species composition, is considered a proxy for legacy effects and other environmental factors that can affect species (1) Beers aspect = 1 + cos(45 − aspect) F I G U R E 1 Conceptual figure describing the methodological steps taken to remove the effect of EH introduced in the aggregation of plots. These steps were applied independently to each country, but are shown in this figure for Switzerland as an example. Each step is further described in the Materials and Methods section TA B L E 1 Characteristics of the NFI data used in the analyses for Norway, Slovakia, Spain, and Switzerland. These characteristics do not correspond to the full original NFI dataset of each country, but to the NFI plots used in our analyses after we performed data harmonization steps

| Modeling SR as a function of area and EH (Figure 1d)
The SR distribution of each country was underdispered (i.e., less variation in the data than expected under a Poisson distribution - Figure   A1). We therefore assumed that SR followed a Conway-Maxwell-Poisson (CMP) distribution, which is a two-parameter extension of the Poisson distribution that is able to model underdispersed count data (Sellers & Shmueli, 2010;Shmueli et al., 2005). The CMP distribution relies on a parameter CMP > 0 and on a dispersion parameter ≥ 0. When = 1, the CMP distribution is equivalent to a Poisson distribution. Values of < 1 correspond to overdispersion, and values of > 1 to underdispersion (Shmueli et al., 2005). The SR of mega-plots k was thus specified as: Independently for each country, we performed a CMP regression on the whole set of mega-plots based on maximum-likelihood estimation using the glm.cmp function of the "COMPoissonReg" package (Sellers & Raim, 2016). This implementation allows a simultaneous estimation of ̂ k and ̂ , which later on can be used to estimate the mean predicted SR (see below). Rather than directly on SR k , the glm.cmp function performs the regression on ̂ k . We assumed to be constant, i.e., was not made dependent on the variables used in the model.
Explanatory variables included area of mega-plots, H clim , H topo , H soil , H BA , and H res . The power law function has been shown to be well suited to describe the relationship between SR and area (Dengler, 2009;Dengler et al., 2020). Therefore, we formulated the CMP model as: where A k represents the area of mega-plot k, and β 0 to β 6 are the parameters to be estimated. Ŝ R k was then approximated using the following equation (Sellers & Shmueli, 2010;Shmueli et al., 2005): We used the country-specific models to predict SR along the reproducing the aggregation of plots located in an environmentally homogenous forest stand. We refer to these predicted curves as EH-adjusted rarefaction curves. In each country, observed and EHadjusted rarefaction curves were then compared (Figure 1d).

| Validation with downscaled datasets (Figure 1e-g)
As the investigated NFIs record the distance of each sampled tree to the plot center, we could artificially reduce the size of plots by removing outermost trees to fit any desired new radius. Downscaled datasets with reduced radii were created independently in each country using plot sizes ranging from 100m² to the original plot size (Figure 1e).
Different datasets sampling the exact same locations but with different plot sizes can show how plot size affects the shape of rarefaction curves. Additionally, they provide an opportunity to validate our method. As the goal is to obtain reliable SR comparisons between inventories using different plot sizes, applying our method to the downscaled and original datasets within a country should deliver the same EH-adjusted rarefaction curves for the method to be successful. The same previously described methodological steps -generating megaplots and modeling SR (Figure 1a-d) -were therefore applied independently to each downscaled dataset (Figure 1e). For each dataset of each country, 95% bootstrap prediction intervals were calculated on the predicted SR by re-fitting models over 5000 iterations where datasets were randomly re-sampled with replacement. In each country, EH-adjusted rarefaction curves extracted from the downscaled and full datasets were compared (Figure 1f,g). The removal of the effect of EH was considered successful when, contrary to the observed rarefaction curves, the EH-adjusted rarefaction curves from the different downscaled and full datasets did not significantly differ.

| From observed to EH-adjusted rarefaction curves
In all countries, we observed steeper sample-based rarefaction curves (SR increased faster along the area gradient) when plot size was smaller (dotted lines Figure 2). Estimates of the CMP models are presented in Table S1 in the Appendix. In all countries, the EH-adjusted rarefaction curves were less steep than the observed rarefaction curves (solid lines Figure 2). The difference between the curves of the downscaled and full datasets was markedly reduced in the EH-adjusted rarefaction curves compared to the original ones. In all countries, the EH-adjusted rarefaction curves from the various downscaled and full datasets did not significantly differ ( Figure 3). However, the EH-adjusted rarefaction curves of the different datasets of Switzerland and Norway were closer than the ones of Spain and Slovakia.

| Comparison of SR between countries
The observed sample-based rarefaction curve of Switzerland was the steepest of all countries, followed by Slovakia. That would indicate, without any consideration for differences in plot sizes, that Switzerland has the highest SR of all four countries for any given area ( Figure 4a). However, the EH-adjusted rarefaction curves of Switzerland and Slovakia were closely matching (Figure 4b). The observed rarefaction curve of Norway was steeper than that of Spain for small areas, but leveled off earlier so that both countries reached a similar SR around 2000 m² (Figure 4a). This pattern persisted in the EH-adjusted rarefaction curves (Figure 4b). At 500 m², observed rarefaction curves indicated similar SR in Slovakia and Norway, while the EH-adjusted rarefaction curves showed a higher SR in Slovakia than in Norway.

| DISCUSS ION
Getting a comprehensive picture of biodiversity -more particu-

| Accounting for EH to enable SR comparisons between inventories using different plot sizes
As expected, we found that EH-adjusted rarefaction curves were less steep than their corresponding observed rarefaction curves.
The EH-adjusted curves represent the relationship between SR and area assuming the sampled area is essentially environmentally homogeneous. The validation process showed that our method was successful in all countries, but performed best in Norway and Switzerland. This could indicate that the EH of Spain and Slovakia was not as well captured by the environmental variables we used as in the other countries. Additionally, species composition and F I G U R E 3 Validation: for each country, EH-adjusted rarefaction curves, along with the corresponding 95% bootstrap prediction intervals. This figure represents a zoomed-in version of the EH-adjusted rarefaction curves represented in Figure  2b. The SR axis varies between countries. Different colors represent different datasets (downscaled and full plot size). Colors are not representative of the same plot sizes across countries as their original plot sizes differ. Note that the step-like appearance of the prediction intervals relates to the fact that SR by definition is an integer as it represents count data F I G U R E 4 (a) Observed and (b) EHadjusted rarefaction curves of Norway, Slovakia, Spain, and Switzerland. The spacing between the points of each curve differs from one country to another as their plot sizes differ. EH-adjusted curves represent predictions from the models with no EH Mediterranean areas, there are also numerous companion tree species that cannot reach large diameters. This causes an underestimation of SR when using small area plots or concentric plot designs that apply a caliper threshold. Therefore, the Spanish NFI also measures total tree richness in 25-m-radius plots that in addition to sample trees also includes regeneration and trees below the caliper thresh-

| Limitations
Our method relies on several assumptions. First, we built rarefaction curves such that each aggregation step was independent of the previous ones. Plots were randomly selected from the entire plot pool, thus not building on previously aggregated plots. Traditionally, rarefaction curves are based on trajectories, where at each step a plot is randomly selected and added to the already aggregated ones from the previous step (Gotelli & Colwell, 2001). The intention for building such trajectories has often been to approximate the species-area relationship in one homogeneous ecosystem or area such as an agricultural field or a lake. Since plots in such well-defined and spatially constrained ecosystems are not independent, a trajectory-wise aggregation is appropriate. In our case, where even the closest NFI plots are located in different forest stands, we deemed the random aggregation more appropriate to reflect the independence of the aggregated plots.
Second, our approach relies on the assumption that EH is a key factor driving differences in species composition between plots (also termed species turnover or beta diversity). However, different environmental conditions between plots might not always translate into different species compositions. This needs to be considered when interpreting the country-specific degree to which our method adjusts rarefaction curves, and may explain why the proposed method works better in some countries than in others, for instance, as a result of forest management. Furthermore, our method assumes that the environmental variables used are indeed related to differences in SR between aggregated plots. The quality and resolution of the environmental data might not be sufficient to describe ecologically meaningful differences across plots. Additionally, other variables such as maximum temperature or summer precipitation could be better suited for certain regions or species groups. This aspect should be further investigated to better understand which variables best capture EH in each country, and how much each EH measure contributes to SR.
Third, in regions where the tree species composition has strongly been modified by forest management, e.g., by creating monospecific plantations, environmental variables might be meaningless in explaining differences in SR between aggregated plots. In other words, our method is expected to perform best in forests with native species mixes. Unfortunately, the forest naturalness level could hardly be accounted for by using variables related to forest management.
Indeed, management practices often target certain species and can both increase and decrease SR depending on the treatments applied.

| Additional confounding factors resulting from sampling design differences
Sampling design characteristics other than plot size might affect how many species are recorded in a plot. In NFIs, the DBH threshold from which trees are measured varies across countries. We artificially harmonized this threshold to match the highest one (12 cm).
This might have affected SR differently in each country, as growing conditions can greatly vary between regions and along environmental gradients. Consequently, our rather high DBH threshold may hardly be reached even by mature trees where environmental conditions are limiting tree growth. This certainly occurs in Spain, and possibly in Northern Norway or in the Swiss and Slovak mountainous areas where the climate is cold and growing seasons are short. Additionally, slowly growing species -even when able to grow beyond that threshold -are less likely to be detected. Instead of using the same DBH threshold across inventories, our approach could be tested with country-or region-specific DBH thresholds adapted to the prevalent growing conditions and species composition. Furthermore, our approach is applicable to inventories based on both fixed-area and concentric circles plots, but not to inventories using angle count sampling such as in Germany, as this design is not area-based.

| Broader implications
By allowing reliable SR comparisons between inventories using different plot sizes, our method could be used to compile data from several monitoring programs to report on large-scale biodiversity patterns. Such comparisons are relevant both within countries when several inventories using different plot sizes are in place, e.g., for different regions or ecosystems, as well as internationally, where differences in plot sizes are the rule rather than the exception. Our approach could enable data-driven, large-scale comparisons of SR, for example, across Europe. Such attempts would involve the following steps: (1) build rarefaction curves by aggregating plots for each inventory of interest, (2) perform inventory-specific CMP regressions to quantify EH effects, (3) predict SR with no EH along the area gradient, and (4) use these predictions to translate to a common plot size. Furthermore, our method could be used to provide private owners or land managers with a tool informing them on the number of species they can expect to find in their land, given its area and location. This tool could help planning and making decisions on the potential necessity to take actions aiming at increasing biodiversity.
Our approach focuses on SR, but could be applicable to any other plot-based count data having a non-linear relationship to area, such as the number of veteran trees, structural elements, or microhabitats. Furthermore, our method is not bound to NFIs, but could be applied to any other plot-based inventory or monitoring program.
Given the large amount of biodiversity datasets that have been put together over the last decades, our method is in line with the growing interest of the scientific community in big data.
Further steps building on our approach could involve investigating beta diversity, or other related biodiversity indicators such as functional or phylogenetical diversity. Our models based on the quantification of EH and its effect on rarefaction curves could also be further developed to investigate the effect of other environmental factors, explore further to which level these EH variables are affecting SR depending on the region or country, and compare their effects.
Consequently, large compiled and plot-size corrected datasets could enable unseen research, e.g., on the environmental determinants of SR across large areas, maybe even entire continents, including their potential trends over time. Our approach has therefore a high applicability and transferability that could be further pursued.