Abstract

The availability of high-resolution satellite imagery has boosted the modelling of tropical forest attributes based on texture metrics derived from grey-level co-occurrence matrices (GLCMs). This procedure has shown that GLCM metrics are good predictors of vegetation attributes. Nonetheless, the procedure is also sensitive to the scale of analysis (image resolution and plot size). This study aimed to analyse the effect of spatial scale on the modelling of forest attributes, and to provide some ecological insight into such effect. Nineteen 32 × 32 m sampling plots were used to quantify forest structure (basal area, BA; mean height, H; standard deviation of height, HSD; density, D; and aboveground biomass, AGB). Each of the 19 plots was subdivided into four 16 × 16 m plots, one of which was further subdivided into four 8 × 8 m plots. To match this design, 12 image metrics (four first-order statistics and eight GLCM texture metrics) were calculated from a GeoEye-1 image (pixel size ≤ 2 m) using 5-, 9-, and 21-pixel windows on the R and NIR bands and on the NDVI and EVI layers. For each of the windows, we modelled the five structural variables as linear combinations of the 12 metrics through linear models. The modelling potential ranged from high (R² = 0.70) to low (R² = 0.11). H was the best-predicted attribute; this occurred at the smallest scale, with increasing scales producing lower R² values. The second best-predicted attribute was HSD, which peaked at the intermediate scale. D and AGB displayed a similar pattern. BA was the only attribute best predicted at the largest scale. Thus, in predicting tropical forest attributes from GLCM-derived texture metrics, the spatial scale to be used should reflect the spatial scale at which ecological processes occur. Therefore, understanding how ecological processes express themselves in a remotely sensed image becomes a critical task.

1. Introduction

Tropical forests are among the world’s most important ecosystems [1]; they house over half of Earth’s biota [2] and supply large flows of ecosystem services on which billions of people rely [3–6]. Accurate estimates of tropical forest attributes at broad geographic scales are crucial to support conservation strategies, evaluate deforestation and degradation, and particularly, monitor the loss and gain of forest biomass [4, 5, 7, 8]. Achieving such monitoring capacity in an economically viable manner requires integrating remote sensing technologies with field data into robust and accurate modelling procedures [9–11].

Over the last two decades, the availability of high- and very high-resolution satellite imagery, along with the growing capacity to process and analyse large volumes of data, has boosted the modelling of forest attributes based on texture metrics [12, 13], in particular, those derived from grey-level co-occurrence matrices [14]. GLCMs contain the probabilities of the co-occurrence of pixel values for pairs of pixels in a given direction and distance. Thus, GLCM-derived metrics provide textural information about the spatial variation and arrangement of the pixels in an image.
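As a worked definition, the co-occurrence probabilities described above can be written as follows. The notation is ours and is only a sketch: it assumes an image I quantised to N_g grey levels and a pixel offset Δ = (Δx, Δy), and it shows the non-symmetric form (many implementations also count each pair in the reverse direction).

```latex
% C_Delta(i,j): number of pixel pairs separated by the offset Delta
% whose grey levels are i (first pixel) and j (second pixel).
C_{\Delta}(i,j) = \#\bigl\{(x,y) : I(x,y) = i,\; I(x+\Delta x,\, y+\Delta y) = j \bigr\}

% Dividing by the total number of pairs turns counts into probabilities.
P_{\Delta}(i,j) = \frac{C_{\Delta}(i,j)}
                       {\sum_{k=0}^{N_g-1} \sum_{l=0}^{N_g-1} C_{\Delta}(k,l)}
```

Texture metrics are then weighted sums over P; for instance, contrast weights each entry by (i − j)², so it grows when neighbouring pixels differ strongly in tone.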

To date, many studies from different regions around the world have shown that such spatial variability is directly related to the heterogeneity of ecosystems, indicating that GLCM-derived metrics are promising predictors of different vegetation attributes [15–26].

These studies have also led to several important conclusions. For one, the relationship between image texture and forest attributes varies depending on the measured variable. For example, the best-predicted forest attributes using image texture include tree density [15, 19, 24], biomass-related measures (bole volume; [18]), basal area [21], DBH variation [27], and even diversity measures [25]. In addition, these results vary significantly in the goodness-of-fit obtained by the predictive models (0.2 < R² < 0.98). Some studies suggest that the variation present in a particular attribute can help explain the modelling potential using texture metrics [24, 28], while others suggest that these variations respond to the type of vegetation and attributes being studied [29, 30].

In spite of these promising results, there is still a third major conclusion derived from these studies that calls for a reflection: this procedure appears to be highly sensitive to the scale of analysis, both in image resolution and field-plot size [21, 24, 30–32]. This situation hinders the possibility of building general models based solely on single-scale studies.

From an ecological perspective, although the incorporation of scale in the understanding of biological processes has been recognized as a critical element of ecology’s conceptual framework [33, 34], it is not until recently that its effects have been analysed [35]. Scale is likely to have a significant impact because no single process occurs at all scales; hence, the forest attributes derived from such processes may be expected to be best described/predicted at particular scales [36, 37]. Identifying the proper scale of analysis should be one of the major goals for spatial ecology.

Comprehending scale properly has also been acknowledged as a major challenge since the early years of remote sensing [38]. For this discipline, the scale represents the window of perception and establishes the limits within which a phenomenon must be perceived and quantified to understand it [39]. A major goal for any remote sensing scientist would be to develop and validate models that identify the scale at which a phenomenon occurs [40].

Attempts to develop such models for GLCM metrics based on a random sampling procedure have yielded satisfactory results [29, 30, 41]. Other approaches have used multiscale GLCM metrics to increase the performance of these textures for particular classification tasks [42, 43]. Yet, how scale affects the relationship between texture and different forest attributes remains an open question for tropical ecosystems.

In this context, our main objective was to analyse the effect of using three different spatial scales on the modelling of forest attributes. We hypothesized that particular attributes would be better predicted at particular window sizes because the underlying processes driving them occur at different spatial scales. Due to the current importance of generating robust models to predict biomass losses or gains in tropical forests, the study focused on tree basal area, mean height, standard deviation of height, density, and aboveground biomass, as all of them are key variables widely used in biomass estimation. Finally, we discuss potential ecological explanations on the relations between image metrics and forest structure established by our results.

2. Materials and Methods

2.1. Study Site

The study was conducted on the Pacific coastal plain of the Isthmus of Tehuantepec, Oaxaca State, southern Mexico (16° 38′ 37.19″–16° 41′ 14.94″ N; 94° 59′ 31.94″–95° 2′ 59.59″ W; Figure 1). This region has been extensively studied for more than 20 years to understand successional changes in the forest community [44–46]. The regional dominant vegetation type is tropical dry forest, which has been reported as being highly diverse and heterogeneous in both structural and diversity attributes [47, 48]. The site has an annual mean temperature of 27.6 °C and a very pronounced seasonality between the dry (November–May) and rainy (June–October) seasons, with an average annual precipitation of 903 mm for the 1950–2014 period, according to CLICOM data [49].

2.2. Forest Attributes

Forest structure was quantified between July 2012 and November 2013 in nineteen 32 × 32 m (1024 m²) sampling plots (Figure 1), selected to cover the slope variation (between 0° and 30°) where tropical dry forest (TDF) establishes on a phyllite substrate [50, 51]; these plots added up to 1.95 ha in total. In order to test the effect of spatial scale on the modelling capabilities of image texture, a nested vegetation sampling design was used. Therefore, the 32 × 32 m sampling plots were subdivided into four 16 × 16 m plots (256 m²), one of which was further subdivided into four 8 × 8 m plots (64 m²). For each 32 × 32 m sampling plot, two subplots (both 16 × 16 and 8 × 8 m) that shared a single vertex were selected at random. Within each of these plots, every individual having a diameter at breast height (DBH; 1.30 m) ≥ 5 cm was measured for its height and DBH. All cacti, lianas, and trees that met these criteria were included in the sampling. The centre of each plot and its corresponding nested plots were recorded with a GPS (4–6 m accuracy).

For each plot, five community structural attributes were calculated: basal area (BA, m²/ha), mean height of all plants included in the sample (H, m), standard deviation of height (HSD, m), number of individual plants per unit area, i.e., density (D, ind./ha), and aboveground biomass (AGB, Mg/ha). BA (m²/ha) was calculated as the sum of the areas of all stems (approximated as circles using the measured DBHs) of all trees. In seven of the 19 plots, species identity was also obtained. We consulted the global wood density database [52, 53] to obtain each species’ wood density; however, only 64 of the 79 identified species were found in this database. For the species where there was no available wood density (i.e., due to a lack of species determination or lack of information in the database), an average wood density from the available information was used to calculate its AGB (following recommendations of the authors in [54]). For the rest of the species for which a wood density value was available, it was used to calculate each individual’s AGB. BA, D, and AGB were extrapolated to 1 ha to make them comparable with other studies. AGB was obtained using the BIOMASS package [55] in R 3.6.3 [56].
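The basal-area and density calculations described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names and the DBH values are hypothetical, and stems are treated as circles as stated in the text.

```python
import math


def basal_area_m2_per_ha(dbh_cm, plot_area_m2):
    """Sum circular stem cross-sections (m²) and scale the total to 1 ha."""
    # Convert each DBH from cm to m, halve it to get the radius, then pi * r².
    stem_areas_m2 = [math.pi * (d / 100.0 / 2.0) ** 2 for d in dbh_cm]
    return sum(stem_areas_m2) * 10_000.0 / plot_area_m2


def density_per_ha(n_individuals, plot_area_m2):
    """Number of individuals per hectare."""
    return n_individuals * 10_000.0 / plot_area_m2


# Hypothetical DBH values (cm) for one 8 x 8 m plot (64 m²); all are >= 5 cm.
dbh_example_cm = [5.0, 12.5, 20.0, 8.3]
ba = basal_area_m2_per_ha(dbh_example_cm, plot_area_m2=64.0)
d = density_per_ha(len(dbh_example_cm), plot_area_m2=64.0)
print(f"BA = {ba:.2f} m2/ha, D = {d:.0f} ind./ha")
```

Because the scaling factor is 10,000 divided by the plot area, a single large stem changes the per-hectare value of a 64 m² plot sixteen times more than that of a 1024 m² plot.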

2.3. Remote Sensing Analysis

2.3.1. Image

A GeoEye-1 multispectral image (MS), with a pixel resolution of 1.7 m (planimetric accuracy of 3 m), acquired on November 11, 2012, was used for extracting image statistical and texture metrics. The image was orthorectified using a digital elevation model generated by the Mexican National Institute of Statistics and Geography (INEGI) with a 15 m spatial resolution, and was atmospherically corrected using the quick atmospheric correction (QUAC) algorithm [57]. The near-infrared (NIR) and red (R) bands, as well as the normalised difference vegetation index (NDVI) and the enhanced vegetation index (EVI), were used as predictors due to their proven capabilities of modelling different forest attributes [20, 21, 24, 26, 27, 58, 59]. The two vegetation indices were calculated as follows [14]:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{R}}{\mathrm{NIR} + \mathrm{R}}$$

$$\mathrm{EVI} = 2.5 \times \frac{\mathrm{NIR} - \mathrm{R}}{\mathrm{NIR} + 6\,\mathrm{R} - 7.5\,\mathrm{B} + 1}$$

where B stands for the blue reflectance band [14, 60]. All the steps of the image processing were conducted in ENVI 4.7 + IDL.
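For readers who want to reproduce the two indices outside ENVI, a minimal NumPy sketch is given below. It assumes surface-reflectance inputs on a 0–1 scale and the standard EVI coefficients (gain 2.5, aerosol terms 6 and 7.5, canopy background term 1); the function and argument names are illustrative and are not part of the original processing chain.

```python
import numpy as np


def ndvi(nir, red):
    """Normalised difference vegetation index: (NIR - R) / (NIR + R)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    # Pixels where NIR + R equals zero would need masking before this step.
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced vegetation index with the standard coefficients (assumed)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + soil)


# Hypothetical reflectances for two pixels: dense canopy and sparse cover.
nir = np.array([0.50, 0.30])
red = np.array([0.10, 0.20])
blue = np.array([0.05, 0.10])
print(ndvi(nir, red))
print(evi(nir, red, blue))
```

Both functions operate element-wise, so they accept whole band arrays as well as single pixel values.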

2.3.2. Image Texture Metrics

Two sets of metrics were calculated from the image: statistical (first-order) and texture (second-order) metrics. The first set consisted of a statistical summary of the pixel values found inside a particular window: mean, variance, range, and skewness (Table S1). In turn, texture metrics characterise the spatial variation and arrangement of the pixels in a particular window. We followed the grey-level co-occurrence matrices (GLCMs) approach [61]; to this end, we calculated the texture metrics using a spatial distance of one pixel (offset) in four directions (0°, 45°, 90°, and 135°), transforming the pixel values into 64 levels of grey. Then, the texture measurements in all directions were averaged to obtain a single direction-independent texture value. The texture metrics can be grouped into three types of variables: (1) those describing the degree of contrast between pixels (homogeneity, contrast, and dissimilarity), (2) those that characterise the regularity in the pixels within a window (entropy and angular second moment), and (3) the statistical variables derived from the GLCM (mean, variance, and correlation). The twelve metrics (four statistical and eight texture variables) were calculated for the R and the NIR bands, as well as for the two vegetation indices (NDVI and EVI).
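The texture calculation described above can be illustrated with a small NumPy-only sketch. It is not the ENVI + IDL implementation used in the study: the function names are ours, the window is rescaled to 64 grey levels using its own minimum and maximum (an assumption; rescaling over the whole image is equally plausible), the matrix is made symmetric, and only five of the eight texture metrics are shown.

```python
import numpy as np

# (row, column) offsets for 0°, 45°, 90°, and 135° at a distance of one pixel.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def quantise(window, levels=64):
    """Linearly rescale a window to integer grey levels 0..levels-1."""
    w = np.asarray(window, dtype=float)
    lo, hi = w.min(), w.max()
    if hi == lo:
        return np.zeros(w.shape, dtype=int)
    q = np.floor((w - lo) / (hi - lo) * levels).astype(int)
    return np.minimum(q, levels - 1)


def glcm(q, offset, levels=64):
    """Symmetric, normalised co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = q.shape
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    first = q[r0:r1, c0:c1]                       # reference pixels
    second = q[r0 + dr:r1 + dr, c0 + dc:c1 + dc]  # their neighbours
    m = np.zeros((levels, levels), dtype=float)
    np.add.at(m, (first.ravel(), second.ravel()), 1.0)
    m = m + m.T                                   # count both directions
    return m / m.sum()


def texture_metrics(window, levels=64):
    """Direction-averaged GLCM metrics for a single window."""
    q = quantise(window, levels)
    i, j = np.indices((levels, levels))
    per_direction = []
    for offset in OFFSETS:
        p = glcm(q, offset, levels)
        nonzero = p[p > 0]                        # avoid log(0) in entropy
        per_direction.append({
            "contrast": np.sum(p * (i - j) ** 2),
            "dissimilarity": np.sum(p * np.abs(i - j)),
            "homogeneity": np.sum(p / (1.0 + (i - j) ** 2)),
            "asm": np.sum(p ** 2),
            "entropy": -np.sum(nonzero * np.log(nonzero)),
        })
    keys = per_direction[0].keys()
    return {k: float(np.mean([d[k] for d in per_direction])) for k in keys}


# Hypothetical 5 x 5 window of reflectance values.
rng = np.random.default_rng(0)
print(texture_metrics(rng.random((5, 5))))
```

In a 5-pixel window there are only 20 horizontal pixel pairs, so a 64 × 64 matrix is very sparse; this is one practical reason why texture values from small windows tend to be noisier than those from large ones.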

We used a moving-window approach to calculate image texture, with square windows of 5, 9, and 21 pixels. Window sizes were selected as the nearest odd window size equivalent to each sampling-unit extent. Thus, the 5-pixel window was the closest to the 8 × 8 m plots, the 9-pixel window to the 16 × 16 m plots, and the 21-pixel window to the 32 × 32 m plots. Finally, the central pixel value of each window was extracted from each of the 48 texture layers (four image layers × 12 metrics). The entire procedure was programmed in the ENVI + IDL environment.
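The window step can be sketched as below: the ground footprint of each window follows from the 1.7 m pixel size, and the first-order statistics are computed on the block of pixels centred on a plot location. The function names are hypothetical, and the variance and skewness estimators (sample variance, moment-based skewness) are assumptions, since the exact formulas are given only in Table S1.

```python
import numpy as np

PIXEL_SIZE_M = 1.7  # GeoEye-1 multispectral pixel size reported for the image


def window_footprint_m(window_px, pixel_size_m=PIXEL_SIZE_M):
    """Ground length (m) of one side of a square window."""
    return window_px * pixel_size_m


def extract_window(band, row, col, window_px):
    """Return the window_px x window_px block centred on (row, col)."""
    if window_px % 2 == 0:
        raise ValueError("window size must be odd to have a central pixel")
    half = window_px // 2
    rows, cols = band.shape
    if (row - half < 0 or col - half < 0
            or row + half >= rows or col + half >= cols):
        raise ValueError("window extends beyond the image")
    return band[row - half:row + half + 1, col - half:col + half + 1]


def first_order_stats(window):
    """Mean, variance, range, and skewness of the pixel values in a window."""
    v = np.asarray(window, dtype=float).ravel()
    mean = v.mean()
    std = v.std()  # population SD, used only to scale the skewness
    skew = 0.0 if std == 0 else float(np.mean((v - mean) ** 3) / std ** 3)
    return {
        "mean": float(mean),
        "variance": float(v.var(ddof=1)),  # sample variance (an assumption)
        "range": float(v.max() - v.min()),
        "skewness": skew,
    }


for w in (5, 9, 21):
    print(w, "pixels ->", round(window_footprint_m(w), 1), "m per side")
```

With a 1.7 m pixel, the three windows cover 8.5, 15.3, and 35.7 m per side, respectively.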

2.4. Modelling Procedure

For each of the three window sizes, we modelled the five structural variables as linear combinations of the 12 image metrics through linear (mixed) models [21, 62]. Structural variables were log-transformed to achieve normality; image metric variables were standardised to ease model fitting [63]. We considered all possible linear models involving up to five parameters, including quadratic relationships and interactions; given the limited sample size, we restricted the number of parameters to five to prevent model overparameterization. This procedure required the fitting of ∼2 × 10⁶ models. From these models, and to avoid collinearity, we excluded those models whose explanatory variables (image metrics) displayed an absolute correlation value of > 0.7 [64]. Given the nested structure of the sampling design, several 8 × 8 and 16 × 16 m plots associated with the 5- and 9-pixel windows, respectively, were nested within a particular 32 × 32 m plot; thus, models for these windows required the inclusion of a random effect associated with the 32 × 32 m plot to which the plots pertained.
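The candidate-model screening can be illustrated with a short sketch that standardises the predictors, enumerates predictor subsets, and discards any subset containing a pair with |r| > 0.7. It is a simplified stand-in for the R workflow: the function names are ours, quadratic terms, interactions, and the random effect are left out, and how the five-parameter limit maps onto a number of predictors (`max_terms`) is an assumption.

```python
from itertools import combinations

import numpy as np


def standardise(x):
    """Centre each column on zero and scale it to unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)


def admissible_subsets(x, names, max_terms=4, r_max=0.7):
    """Yield predictor-name tuples whose pairwise |r| never exceeds r_max."""
    r = np.abs(np.corrcoef(x, rowvar=False))
    for k in range(1, max_terms + 1):
        for idx in combinations(range(len(names)), k):
            if all(r[a, b] <= r_max for a, b in combinations(idx, 2)):
                yield tuple(names[i] for i in idx)


# Hypothetical metrics for six plots; "b" is almost a multiple of "a".
x = np.array([
    [1.0, 2.0, 1.0],
    [2.0, 4.0, -1.0],
    [3.0, 6.0, 1.0],
    [4.0, 8.0, -1.0],
    [5.0, 10.0, 1.0],
    [6.0, 12.5, -1.0],
])
z = standardise(x)
for subset in admissible_subsets(z, ["a", "b", "c"], max_terms=3):
    print(subset)
```

Filtering on the correlation matrix before fitting avoids spending time on models that would be rejected anyway, which matters when the candidate set runs into millions.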

Model selection was performed through the sample-corrected Akaike information criterion (AICc) [65]. We considered the best-supported model the one with the smallest AICc. Due to the large number of fitted models and the limited sample size, we needed to exclude the possibility that the best-supported model described a random relationship between image metrics and the modelled forest attribute. We did this by calculating the difference between the value at the 5% quantile of the AICc distribution of all models considered and the AICc of the best model, i.e., we calculated $\Delta q = \mathrm{AICc}_{5\%} - \mathrm{AICc}_{\mathrm{best}}$; a Δq > 2 would exclude this possibility.
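The selection criterion and the Δq check can be sketched as follows. The AICc expression is the usual small-sample correction applied to a Gaussian least-squares fit (up to an additive constant); this covers only models without the random effect and is an illustration rather than the lme4/MuMIn computation used in the study. Function names and the example values are hypothetical.

```python
import numpy as np


def aicc_from_rss(rss, n, k):
    """AICc of a Gaussian least-squares fit with k estimated parameters."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = n * np.log(rss / n) + 2 * k
    return float(aic + (2 * k * (k + 1)) / (n - k - 1))


def delta_q(aicc_values, q=0.05):
    """Gap between the q-quantile of all AICc values and the smallest one."""
    a = np.asarray(aicc_values, dtype=float)
    return float(np.quantile(a, q) - a.min())


# Hypothetical AICc values for a set of candidate models.
rng = np.random.default_rng(1)
candidate_aicc = rng.normal(loc=40.0, scale=3.0, size=10_000)
candidate_aicc[0] = 20.0  # one model that fits clearly better than the rest
dq = delta_q(candidate_aicc)
print(f"delta q = {dq:.2f}; exceeds 2: {dq > 2}")
```

The correction term 2k(k + 1)/(n − k − 1) grows quickly as k approaches n, which is why limiting the number of parameters matters with only 19 large plots.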

To make results comparable between forest attributes and window sizes, we reported Pearson’s coefficient of determination (R²) as a measure of goodness-of-fit for those models that did not include the plot random effect, and Nakagawa’s R² [66] for those that did include it. Since both R² values measure the variance explained by the model [66], we will not distinguish between them throughout the remainder of the text.

Modelling was performed in R [56] using the lme4 [67], MuMIn [68], and performance [69] packages.

3. Results

A statistical summary of the five forest structure attributes quantified at three scales is presented in Table 1. Notably, forest attributes varied among plots, and the magnitude of this variation depended on the sampling-unit extent; for instance, while forest height showed minor variation according to its coefficient of variation, AGB had a 42% change between the 64 and 1024 m² plots. In addition, the interplot variation of all forest attributes was highest in the smallest sampling units and lowest in the largest ones, but fell within the region’s variation [50, 70]. A similar pattern was predominantly observed in the variation of texture metrics calculated at different scales (Table S2).

According to their goodness-of-fit (R²; Figure 2), the modelling potential of image metrics to describe the five structural attributes ranged from high (0.70) to low (0.11). Mean plant height (H) was the best-modelled attribute in terms of the R² of its best-supported model. For this attribute, the best fits were obtained at small window sizes, with increasing sizes producing lower R² values. All other structural variables displayed R² values below 0.5, making them harder to describe with image metrics.

Variation in plant height, as measured by HSD, was best predicted at the 9-pixel window, with a lower R² value compared with H; tree density (D) displayed a similar pattern. AGB was the attribute most poorly modelled among the ones considered; again, the 9-pixel window was the best spatial scale to model it. Finally, BA was best modelled through a 21-pixel window, followed by the 9-pixel one.

The analysis of the recurrence of the image metrics in the models that best describe the structural attributes is presented in Figure 3, and the full description of these models can be found in Table 2. For all attributes, the best models established relationships that would not have been expected to occur by chance (Table S3). Concerning the best predictors of the structural variables, EVI dissimilarity was the most frequent predictor, occurring in six out of 15 forest attribute/scale combinations. This predictor was followed by NIR homogeneity (5 combinations) and the statistical mean of the red band (R mean_s; 5 combinations) (Figure 3; Table 2). Among the metrics, the most frequent in the best-supported models was entropy and the most frequent band was the EVI; notably, NDVI was only present in two models.

4. Discussion

Scale determines our perception of nature [71]. Thus, it is important to consider it when we aim at modelling the relation between a remotely-sensed variable and a field-acquired one. Based on our results, in the following sections, we discuss the effect of changing spatial scale on the modelling of tropical forest attributes from texture metrics, a recurrent question in spatial ecology, and provide some ecological insight into this issue.

4.1. The Effect of Changing Scale

The first major conclusion that can be drawn from our results is that tropical forest attributes cannot be best predicted at a single spatial scale; rather, different scales should be used for different attributes. This is relevant because it entails that, when setting a study that aims at predicting different forest attributes from remote sensing, it is necessary to identify the proper scale both for field measurements and texture window that best predicts these attributes [72, 73]. However, there is no consensus on the proper plot-window size for each variable, since this variation seems to respond to differences in the type of forest being studied, the resolution of the images used, and the forest attributes being quantified [29, 30, 74]. For instance, our results showed that mean tree height prediction peaks at small plot-window sizes, whereas basal area is best predicted at large plot-window sizes. This is in agreement with Kayitakire et al. [15], who found that a small window size better predicted tree height in a low-diversity spruce forest. In contrast, Ozkan and Demirel [30] detected an inverse pattern between these two variables in monospecific temperate forest stands. Ultimately, information collected at one scale for a given ecosystem may be totally inappropriate for addressing questions at another scale or a different ecosystem [75, 76], thus reinforcing the need for multiscalar sampling designs capable of capturing forest community variation on its different attributes [77]. This conclusion is particularly important considering the urgent need for accurate estimates of tropical forest attributes at broad geographic scales to support conservation strategies, evaluate deforestation and degradation, and monitor the variations of forest biomass [7, 8].

Forest attributes whose calculation depends on other measured variables deserve particular attention; for such attributes, the panorama is even more complex, because each of its constituent variables could peak in its prediction at a certain scale that differs from the peaking scale of other attributes. In our case, this situation is illustrated by AGB: this attribute, which peaks at the medium window size, is computed from tree height (peaking at the small scale) and basal area (peaking at the large scale). This result is in agreement with the findings of Castillo-Santiago et al. [18] in a tropical rainforest, but contrasts with other studies performed in a temperate forest where AGB was better described by texture metrics derived from the smallest window size [26], and a tropical dry forest where AGB was best predicted at larger scales using LiDAR data [78]. A plausible explanation for this apparent contradiction may be related to the fact that image texture is less prone to registration-related error compared with metrics that are not spatially averaged within a window. Also, this explanation is strengthened by the fact that the autocorrelation of AGB sampled in small plots (10 × 10 m) is not significant [79, 80]. In addition, the difference in the appropriate scale for modelling AGB using multispectral and LiDAR data can also be explained by the very different remote signals obtained by each sensor. However, a comparison made between the modelling potential of LiDAR and multispectral texture revealed that the modelling potential of stand-volume from texture and LiDAR metrics did not vary dramatically among different window sizes [30].

Finally, it is worth noting that forest attributes showed the highest interplot variation in the smallest subplots and the lowest variation in the largest plots (Table 1). This pattern was exactly the same when calculating the mean standard deviation for each plot size using Monte Carlo simulations (data not shown, Rejou-Mechain et al. [55]). This was expected, since the smallest plots contain fewer individuals and are therefore more sensitive to the data from a single tree; thus, it can be interpreted that the data in the smallest plots had more noise. However, the scale at which tree height was best predicted was the smallest one, and certainly, the relationships present in this model were not obtained by chance (Table S3). Future studies should be directed toward clarifying the effect of noise on the predictive potential of similar models, particularly for the smallest scales.

4.2. Ecological Insights into the Relationship between Scale, Image Texture, and Forest Attributes

The reason why particular forest metrics are best modelled at some window sizes but not others, and by particular texture metrics, could depend on the specific aspect of the vegetation that these metrics are meant to reflect and how they vary across scales. The relation between GLCM-derived image texture and forest attributes has been repeatedly examined for several ecosystems [15–25]; however, to our knowledge, the ecological interpretation of how this relation comes to exist has been largely overlooked. This may be due to the original use of texture in a nonecological context [61], and its transfer to ecology as a tool for prediction rather than as a means of explanation. Thus, in this section, we explore questions that may shed light on this profound ecological matter: Why is a particular scale appropriate for a certain forest attribute? What is the ecological interpretation of each texture metric? How does it translate into a forest attribute?

Forest attributes reflect the ecological processes of individual growth (H, HSD, and BA), biomass accumulation (BA, AGB), and spatial structuring of individuals (D). Thus, the spatial scales at which these processes occur and are calculated should determine the scale at which forest attributes can be best described/predicted [35, 40]. For instance, mean tree height (H) reflects the height of a regular tree in the plot; thus, it is not surprising that the best scale that models it is the smallest window. In contrast, basal area is the sum of the cross-sectional areas of all tree trunks in a plot and thus, it was best modelled at the largest scale.

Among the five measured forest attributes, mean tree height (H) was the best predicted, and this occurred at the smallest scale (5-pixel window) by the entropy of the near-infrared band (NIR entropy_t) and the statistical mean of the red band (R mean_s; Table 2), the former being the most important according to its coefficient value. Tree height varies largely within a plot, as indicated by its standard deviation (HSD; Table 1). This variation is known to produce harsh shadows within an image that translate into a high contrast. This high contrast reduces the probability of finding neighbouring pixels with the same tone, which reduces entropy_t ([61, 81, 82]; see the formula in Table S1). In turn, R mean_s negatively relates to H, probably because a plot with large trees will display a high photosynthetic activity, thus absorbing a significant fraction of the red spectrum [83].

The second best-predicted forest attribute was HSD, which describes the variation in height among trees in a plot. This variable peaked at the intermediate scale (9-pixel window) and was predicted by NIR correlation_t and EVI homogeneity_t, the latter being the most important variable. In sites where trees exhibit similar heights (i.e., HSD is low), it is common to observe gaps in the canopy, which typically have low species richness [84]. In such sites, NIR correlation_t will be low because areas with high photosynthetic activity will be intermingled with bare-soil areas. Likewise, such an open canopy decreases EVI homogeneity_t.

Density also peaked at the intermediate scale and was best predicted by NIR entropy_t and NDVI data.range_s, the former being the most important. According to its formula (Table S1), a low NIR entropy_t means that the probability of finding neighbouring pixels with the same tone is low. As stated above, this is associated with sites with gaps in the canopy. This type of forest is characterised by a large number of small individuals and a high NDVI data.range_s.

Basal area (BA) was the only attribute best predicted at the largest scale (21-pixel window) by, in order of higher to lower importance, NIR ASM_t, contrast_t, and skewness_s. Higher BA values correspond to plots with mostly large trees. Such plots also display high and homogeneous photosynthetic activity, which corresponds to a low probability of finding two neighbouring pixels with the same value (i.e., low NIR ASM_t), a low contrast in NIR values (i.e., low NIR contrast_t), and a (negative) skewness towards high NIR values.

Finally, aboveground biomass (AGB), a key variable in the monitoring of tropical forest carbon stocks [85], peaked at the intermediate scale (9-pixel window) and was best predicted by EVI contrast. In the study site, a high EVI contrast in texture is associated with a well-conserved community in which large trees dominate [21, 44]. Nonetheless, this being a dry forest, these large trees form a discontinuous canopy, which translates into a large EVI spatial variation.

5. Conclusions

In predicting tropical forest attributes from GLCM-derived texture metrics, no single scale is most useful; thus, the proper scale should be identified for each attribute if the aim is to predict tropical forest attributes at broad geographic scales. To achieve this goal, a recommendation for future studies would be to either perform a pilot study aimed at detecting which scale might be the best to describe the forest attributes or to follow a nested-plot design such as the one in this study, to test the modelling potential at different scales. This would be particularly required given the enormous differences existing in tree size and forest development among regions exposed to different environmental conditions. In this regard, the different scales should be clearly identified as plot size, pixel resolution, and the window size used to calculate image textures.

A further relevant conclusion is that image texture can be interpreted from an ecological viewpoint. Although ours is a small piece in a big puzzle, this is a necessary task if we are to contribute to the understanding of how ecological processes express themselves in a remotely sensed image.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

We are grateful to Eunice Romero Pérez, Rodrigo Muñoz Avilés, Jaqueline Noguez Lugo, Iván Pomonarev Potchidjanov, Eduardo García Molina, Abraham Bernal Martínez, Yanus Dechnik Vázquez, Axel Maldonado Romo, and Eduardo García Olivos for their support with fieldwork. We warmly thank the people of the Nizanda, Carrasquedo, and El Zapote communities for their hospitality and permission to work in their lands, and particularly, Ulises Herrera Martínez and Julio Antonio Carrasquedo Hernández for their assistance during fieldwork. This work was supported by the National Council of Science and Technology (CONACYT), Mexico (Grant no. CB-2009-01-128136). Open Access funding was enabled and organized by UNAM 2023.

Supplementary Materials

Table S1: image metrics extracted for each one of the bands used. Table S2: mean and coefficient of variation (CV) values of the image texture variables according to window size. Table S3: difference (Δq) between the 5% quantile of the AICc distribution of all models fitted and the AICc value of the best-supported model for each forest attribute variable and window size.