Colloquium
Impact of training data characteristics on land cover mapping accuracy: An empirical study
by Linda Vester
Abstract
Supervised classification of remote sensing images is often used for predicting land cover. Training data play an important role in this process. Understanding the impact of training data on land cover prediction is essential. In this thesis research, land cover maps were created by training Random Forest models on different training datasets. The objective was to empirically study the impact of training data sample size, design, class balance and labeling errors on land cover classification accuracy through a case study in Andalusia. All four training data characteristics influenced the prediction accuracy. The sample size and labeling errors accounting for the similarity between classes had the highest impact on land cover classification accuracy and uncertainty. This thesis highlights the importance of considering training data aspects when classifying land cover maps through the use of supervised classification.
Supervised classification of remote sensing images is often used for predicting land cover. Training data play an important role in this process. Understanding the impact of training data on land cover prediction is essential. In this thesis research, land cover maps were created by training Random Forest models on different training datasets. The objective was to empirically study the impact of training data sample size, design, class balance and labeling errors on land cover classification accuracy through a case study in Andalusia. All four training data characteristics influenced the prediction accuracy. The sample size and labeling errors accounting for the similarity between classes had the highest impact on land cover classification accuracy and uncertainty. This thesis highlights the importance of considering training data aspects when classifying land cover maps through the use of supervised classification.