Student information

MSc thesis topic: Feature Space Coverage and Uncertainty in Spatial Biomass Prediction

Accurate estimates of forest biomass at subnational to global scales are essential for monitoring the carbon balance and realising climate goals. The greatest accuracy is obtained by combining biomass maps with reference data from sources such as forest inventories. If the reference data have been acquired using a standardized methodology and probability sampling, this can be achieved by well-established statistical inference. However, the prevailing mix of biomass reference data has incomplete coverage, has been measured by different methods and lacks a consistent sampling design. Therefore, model-based approaches (e.g., geostatistics and machine learning) are needed for estimating forest biomass. The aim of this thesis is to assess the extent to which model-based forest biomass predictions are supported by the reference data, using so-called within-domain determination and uncertainty assessment.

The above figure shows examples of within-domain determination: (a) sample points (red) taken from Fig. 2(h) in de Bruin et al. (2022); (b) isolation score computed by the Isolation Forest approach (Liu et al., 2012; Cortes, 2022); (c) spatial, cell-based domain delineation, where hexagon cells lacking sample points are greyed out; (d) outlier mapping by thresholding the isolation score shown in (b). Note that the within-domain area in (d) is much larger than that in (c).
To allow different samples to be compared, an existing map will be used as the source of reference data, and predictions will be based on several environmental features.
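
The idea behind panels (b) and (d), scoring prediction locations by how isolated they are in feature space and then thresholding that score, can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used for the figure: it uses scikit-learn's IsolationForest (which may differ from the implementation cited above), synthetic random features in place of real environmental covariates, and a hypothetical threshold set at the 95th percentile of the sample points' own scores.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def within_domain_mask(sample_features, prediction_features, quantile=0.95, seed=0):
    """Flag prediction locations whose isolation score does not exceed a
    threshold derived from the sample points (threshold rule is an assumption)."""
    forest = IsolationForest(n_estimators=200, random_state=seed)
    forest.fit(sample_features)

    # score_samples returns the negated anomaly score, so the sign is flipped
    # here: a higher value means a more isolated point.
    sample_scores = -forest.score_samples(sample_features)
    prediction_scores = -forest.score_samples(prediction_features)

    # Hypothetical rule: a location is within the domain if it is no more
    # isolated than the chosen quantile of the sample points themselves.
    threshold = np.quantile(sample_scores, quantile)
    return prediction_scores <= threshold, prediction_scores, threshold


# Synthetic stand-in data (assumption): 200 sample points and 1000 prediction
# locations, each described by 3 environmental features.
rng = np.random.default_rng(0)
sample_features = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
prediction_features = rng.uniform(low=-4.0, high=4.0, size=(1000, 3))

mask, scores, threshold = within_domain_mask(sample_features, prediction_features)
print(f"threshold on isolation score: {threshold:.3f}")
print(f"fraction of locations within domain: {mask.mean():.2f}")
```

In a real application, each row of prediction_features would correspond to a map cell, so the boolean mask could be reshaped to the raster grid to produce a map comparable to panel (d). The choice of quantile directly controls how large the within-domain area becomes.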

Objectives

  • Explore and compare methods for assessing to what extent sample data cover the feature space on which predictions are made.
  • Assess whether model-based prediction uncertainty estimates apply to within-domain regions.

Literature

Requirements

  • Completion of the course Spatial Modelling and Statistics (GRS30306)
  • Interest in machine learning

Theme(s): Modelling & visualisation