
Student information
MSc thesis topic: Feature Space Coverage and Uncertainty in Spatial Biomass Prediction
Accurate estimates of environmental properties at subnational to global scales are essential for realizing climate goals, for example by monitoring the carbon balance. The greatest accuracy is obtained by combining maps with reference data from environmental surveys such as forest inventories. If the reference data has been acquired using a standardized methodology and probability sampling, this can be achieved by well-established statistical inference. However, the prevailing mix of surveyed reference data has incomplete coverage, has been measured by different methods and lacks a consistent sampling design. Therefore, model-based approaches (e.g., geostatistics and machine learning) are needed for estimating the target properties. The aim of this thesis is to assess the extent to which model-based predictions are supported by the reference data, using so-called within-domain determination and uncertainty assessment.
The above figure shows examples of within-domain determination: (a) sample points (red) taken from Fig. 2(h) in de Bruin et al. (2022); (b) isolation score computed by the Isolation Forest approach (Liu et al., 2012; Cortes, 2022); (c) spatial, cell-based domain delineation, where hexagon cells lacking sample points are greyed out; (d) outlier mapping by thresholding the isolation score shown in (b). Note that the within-domain area in (d) is much larger than that in (c).
To allow comparison of different samples, an existing map will be used to provide reference data.
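The isolation score and thresholding steps of panels (b) and (d) can be sketched roughly as follows. This is only an illustration, not the workflow behind the figure: it uses scikit-learn's IsolationForest in Python rather than the isotree R package cited above, the feature values are synthetic, and the threshold (the 95th percentile of the scores of the sample itself) is an assumption chosen for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Hypothetical feature values (two covariates) at the sampled locations.
sample_features = rng.normal(loc=0.0, scale=1.0, size=(500, 2))

# Hypothetical feature values at two prediction locations: one in the centre
# of the sampled feature space, one far outside it.
prediction_features = np.array([[0.0, 0.0], [8.0, 8.0]])

# Fit the isolation forest on the sample only.
forest = IsolationForest(n_estimators=200, random_state=42)
forest.fit(sample_features)

# scikit-learn's score_samples is lower for more abnormal points, so it is
# negated here to obtain a score that is higher for more isolated points.
isolation_score = -forest.score_samples(prediction_features)

# Assumed threshold: 95th percentile of the scores of the sample itself.
threshold = np.quantile(-forest.score_samples(sample_features), 0.95)

# Prediction locations at or below the threshold are treated as within-domain.
within_domain = isolation_score <= threshold
print(isolation_score, threshold, within_domain)
```

Applied to the feature values of all map cells instead of two points, the boolean result would give a within-domain mask comparable in spirit to panel (d). The extent of that mask depends directly on the chosen threshold.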
Objectives
- Explore and compare methods for assessing to what extent sample data cover the feature space on which predictions are made.
- Assess whether model-based prediction uncertainty estimates apply to claimed within-domain regions.
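One possible check for the second objective, given here only as a sketch, is to compare how often prediction intervals contain the reference value inside and outside the claimed within-domain region. The metric (empirical interval coverage), the function name interval_coverage and all numbers below are assumptions made for illustration; they are not prescribed by the topic description.

```python
import numpy as np

def interval_coverage(y_true, lower, upper, mask):
    """Fraction of reference values inside [lower, upper] at locations selected by mask."""
    inside = (y_true >= lower) & (y_true <= upper)
    return float(inside[mask].mean())

# Hypothetical reference values (e.g. taken from an existing map).
y_true = np.array([10.0, 12.0, 9.0, 15.0, 30.0, 2.0])

# Hypothetical prediction interval bounds produced by a model.
lower = np.array([8.0, 10.0, 8.5, 13.0, 14.0, 6.0])
upper = np.array([12.0, 14.0, 11.0, 17.0, 20.0, 11.0])

# Hypothetical within-domain mask from a domain determination method.
within_domain = np.array([True, True, True, True, False, False])

coverage_in = interval_coverage(y_true, lower, upper, within_domain)
coverage_out = interval_coverage(y_true, lower, upper, ~within_domain)
print(coverage_in, coverage_out)
```

If the uncertainty estimates hold within the domain, the coverage there should be close to the nominal level of the intervals (e.g. 0.9 for 90% intervals), while markedly lower coverage outside would support the domain delineation.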
Literature
- Cortes, D. 2022. isotree: Isolation-Based Outlier Detection, Comprehensive R Archive Network (CRAN).
- de Bruin, S., Brus, D.J., Heuvelink, G.B.M., van Ebbenhorst Tengbergen, T. and Wadoux, A.M.J.C. 2022. Dealing with clustered samples for assessing map accuracy by cross-validation. Ecological Informatics, 69, and several papers citing this work.
- Liu, F.T., Ting, K.M. and Zhou, Z.H. 2012. Isolation-based anomaly detection. ACM Transactions on Knowledge Discovery from Data 6(1).
- Wadoux, A.M.J.C., Heuvelink, G.B.M., de Bruin, S. and Brus, D.J. 2021. Spatial cross-validation is not the right way to evaluate map accuracy. Ecological Modelling, 457.
Requirements
- Spatial Modelling and Statistics (GRS30306)
- Interest in machine learning
Theme(s): Modelling & visualisation