Authors: Iyán Teijido-Murias and Jukka Miettinen
Last autumn, we had a visiting scientist from the University of Oviedo staying here at VTT. This gave us a great opportunity to test some of the FCM tools in his study areas in northern Spain and provided valuable information on the applicability of the methods in new areas.
A recently published journal paper entitled “Forest Height and Volume Mapping in Northern Spain with Multi-Source Earth Observation Data: Method and Data Comparison” reports the findings of his analysis related to traditional machine learning approaches. The study area covered four administrative regions in northern Spain (Galicia, Asturias, Cantabria and the Basque Country). The region is characterized by a temperate climate with abundant rainfall well distributed throughout the year. As a result of these conditions, the region is the most productive forest area in Spain, allowing the use of fast-growing species, such as Eucalyptus globulus, with harvesting cycles of less than 15 years. This is why frequent updating of forest information is essential in the region and remote sensing approaches are in high demand.
The study aimed to evaluate and improve models for predicting forest variables in forest plantations by integrating optical (Sentinel-2) and radar (Sentinel-1, ALOS-2 PALSAR-2 and TanDEM-X) datasets, supported by climatic and terrain variables. Five popular machine learning algorithms were compared, namely k-Nearest Neighbours (kNN), LightGBM, Random Forest, Multiple Linear Regression (MLR) and XGBoost. The findings show an improvement in r² from 0.24, when only Sentinel-2 data were used with Multiple Linear Regression, to 0.49, when XGBoost was used with multi-source EO data.
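For readers less familiar with the metric, the r-squared values quoted above can be read as the share of variance in the field measurements that a model explains. The sketch below assumes the usual coefficient of determination (1 minus the ratio of residual to total sum of squares); the paper may define or compute it differently. The function name and the plot values are made up for illustration and are not taken from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: how far predictions are from observations.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: how far observations are from their own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; r-squared is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical plot-level stem volumes (m3/ha), illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
pred_a = [150.0, 250.0, 250.0, 350.0]  # hypothetical weaker model
pred_b = [110.0, 190.0, 310.0, 390.0]  # hypothetical stronger model

print(round(r_squared(observed, pred_a), 2))  # 0.8
print(round(r_squared(observed, pred_b), 2))  # 0.99
```

On this reading, a value of 0.49 means that roughly half of the variance in the reference data is explained by the model, compared with about a quarter at 0.24.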
Perhaps the most significant result of the study, from the point of view of the FCM project, was that the combination of multi-source datasets, regardless of the model used, significantly enhanced model performance. This supports our earlier findings from the use case demonstration areas in different parts of Europe. In particular, the inclusion of TanDEM-X-based forest height information was found to be valuable for height and volume prediction. However, the paper also highlighted that mountainous terrain severely limits the usability of radar datasets. The coverage of radar datasets in mountainous areas is often reduced, leading to a compromise between the best achievable accuracy and wall-to-wall coverage of the outputs. Depending on the use case, users may opt for somewhat less accurate results in favour of full geographic coverage.
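One way to picture the accuracy versus coverage compromise is a simple per-pixel fallback: use the multi-source estimate wherever radar data are available and an optical-only estimate elsewhere, while recording which source was used. This is only an illustrative sketch, not the procedure used in the paper or in the FCM tools. The function name, the use of None to mark radar gaps, and the height values are all assumptions.

```python
def fill_coverage_gaps(multi_source, optical_only):
    """Prefer the multi-source estimate; fall back to optical-only where it is missing.

    Both inputs are equal-length lists of per-pixel predictions.
    None in multi_source marks a pixel without usable radar coverage.
    """
    if len(multi_source) != len(optical_only):
        raise ValueError("both prediction lists must cover the same pixels")
    merged, sources = [], []
    for multi, optical in zip(multi_source, optical_only):
        if multi is not None:
            merged.append(multi)
            sources.append("multi-source")
        else:
            merged.append(optical)
            sources.append("optical-only")
    return merged, sources


# Hypothetical forest height predictions (m) for five pixels.
multi_source = [18.2, None, 22.5, None, 15.0]
optical_only = [17.0, 12.4, 20.1, 9.8, 16.3]

heights, sources = fill_coverage_gaps(multi_source, optical_only)
print(heights)  # [18.2, 12.4, 22.5, 9.8, 15.0]
print(sources)
```

Keeping the source label for each pixel lets a user decide afterwards whether to work with the full-coverage map or to mask out the less accurate optical-only pixels.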
The newly published paper also serves as a baseline study for subsequent UNet deep learning model tests, which are currently ongoing. The FCM UNet method will be tested in the same study region to evaluate its practical usability and the improvements it can bring to the results.
Full citation of the article: Teijido-Murias, I., Antropov, O., López-Sánchez, C.A., Barrio-Anta, M. and Miettinen, J. (2025) Forest Height and Volume Mapping in Northern Spain with Multi-Source Earth Observation Data: Method and Data Comparison. Forests 16, 563. DOI: 10.3390/f16040563