Student information

MSc thesis topic: Data cubes in agriculture and biodiversity context

Data cubes are multidimensional data representations that let researchers integrate and analyse data across dimensions such as space, time, and thematic variables, providing a comprehensive framework for environmental monitoring and decision-making. They play a significant role in spatial big data infrastructures by enabling efficient analysis of large datasets. Their structured format supports complex queries and analytical tasks, which are essential for gaining actionable insights.
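To make the idea of dimensions concrete, the following is a minimal sketch of a data cube as a NumPy array. The axis order (time, y, x, variable), the variable names and the synthetic values are assumptions made for illustration only; they are not taken from FAIRiCUBE.

```python
import numpy as np

# Hypothetical cube: 4 time steps, a 3 x 5 spatial grid, 2 thematic variables.
# The axis order (time, y, x, variable) is an assumption made for this sketch.
VARIABLES = ["ndvi", "soil_moisture"]
rng = np.random.default_rng(seed=0)
cube = rng.random((4, 3, 5, len(VARIABLES)))


def temporal_mean(cube, variable):
    """Average one thematic variable over time, giving a (y, x) map."""
    v = VARIABLES.index(variable)
    return cube[:, :, :, v].mean(axis=0)


def time_series_at(cube, variable, row, col):
    """Extract the time series of one variable at a single grid cell."""
    v = VARIABLES.index(variable)
    return cube[:, row, col, v]


ndvi_map = temporal_mean(cube, "ndvi")
series = time_series_at(cube, "soil_moisture", row=1, col=2)
print(ndvi_map.shape)  # (3, 5)
print(series.shape)    # (4,)
```

Both functions are slices along different axes of the same array, which is what makes queries across space, time and theme straightforward once the data are organised as a cube.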

This thesis topic is proposed in the context of the FAIRiCUBE project, whose core mission is to enable players from beyond classic Earth Observation (EO) domains to provide, access, process and share gridded data and algorithms in a FAIR and TRUSTable manner. FAIRiCUBE is an EU-funded project with a consortium of 8 partners from Norway, Germany, Austria, Luxembourg, Spain, Italy and the Netherlands.

The project’s goal is to leverage the power of Machine Learning (ML) operating on multi-thematic datacubes for a broader range of governance and research institutions from diverse fields, which at present cannot easily access and utilize these potent resources.

The objective is the creation of a FAIRiCUBE Hub, a crosscutting platform and framework for data ingestion, provision, analyses, processing and dissemination, to unleash the potential of environmental, biodiversity and climate data through dedicated European data spaces.

WUR is one of the 8 project partners and is involved primarily in the Dutch use case: Biodiversity – agriculture nexus (one of 5 use cases).

The use case investigates the impact of agricultural activities on biodiversity, with the agricultural landscape as the main environment. The main objective is to improve knowledge about the correlation between biodiversity and different agricultural practices using a machine learning approach that is consistent across regions. This would be a step towards more precise estimates of, for example, biodiversity in a spatial context, by linking biodiversity with human activities in agricultural areas and related changes in physical conditions (e.g., soil, groundwater, emissions). Finally, it aims to increase awareness of data cubes and AI among domain stakeholders in the smart agriculture and biodiversity fields.

In the use case, three main types of data are used: biodiversity data, environmental data, and agricultural data. Each of these data categories is handled primarily within its own processing flow, in which a distinct data cube is generated. The flows are then ultimately merged using causal machine learning.
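As a rough illustration of how separately produced cubes might be brought together before a modelling step, the sketch below stacks three co-registered cubes and flattens them into a table with one row per grid cell and time step. This is an assumption-laden example, not the FAIRiCUBE pipeline: the cube names, the shared (time, y, x) grid, the synthetic values and the helper `cubes_to_table` are all hypothetical, and the causal ML step itself is not shown.

```python
import numpy as np

# Hypothetical per-flow cubes on the same (time, y, x) grid; values are synthetic.
shape = (2, 3, 4)
rng = np.random.default_rng(seed=1)
cubes = {
    "biodiversity": rng.random(shape),
    "environment": rng.random(shape),
    "agriculture": rng.random(shape),
}


def cubes_to_table(cubes):
    """Stack co-registered cubes and flatten them to (n_cells, n_variables)."""
    names = list(cubes)
    shapes = {c.shape for c in cubes.values()}
    if len(shapes) != 1:
        raise ValueError("cubes must share the same grid")
    # New last axis holds one column per data category.
    stacked = np.stack([cubes[n] for n in names], axis=-1)
    return names, stacked.reshape(-1, len(names))


names, table = cubes_to_table(cubes)
print(names)
print(table.shape)  # (24, 3)
```

In practice the cubes would first need to be resampled to a common grid and time axis; the shape check above only guards against skipping that step.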

Relevance to research/projects at GRS or other groups

  • FAIRiCUBE is a research project that explores solutions for working with (geospatial) datacubes and machine learning.

Objectives and Research questions

This project is ongoing, and there are several research questions that could form the basis of the thesis. In cooperation with the student, a research question can be selected that fits within the timeframe of the project and the interests of the student.

  • Comparison between data cube platforms EOXHub, Rasdaman, and other similar environments.
  • What is the business case for FAIRiCUBE Hub in the context of ML applications? Is it viable?
  • Usage / comparison of EO and ML metadata standards (towards GeoAI)
  • Does FAIRiCUBE Hub help with Causal AI in agri-environmental use cases?

Requirements

  • Some geo, statistics, and (causal) ML prior knowledge
  • Python and PyTorch, causal ML packages (depending on research objective)

Literature and information

  • https://fairicube.nilu.no/
  • https://fairicube.readthedocs.io
  • https://eox.at/
  • https://www.rasdaman.com/

Expected reading list before starting the thesis research

  • Kaddour, Jean, Lynch, Aengus, Liu, Qi, Kusner, Matt & Silva, Ricardo (2022). Causal Machine Learning: A Survey and Open Problems. https://doi.org/10.48550/arXiv.2206.15475
  • Gonzalez, Andrew, Chase, Jonathan M. & O'Connor, Mary I. (2023). A framework for the detection and attribution of biodiversity change. Phil. Trans. R. Soc. B 378: 20220182. http://doi.org/10.1098/rstb.2022.0182

Theme(s): Modelling & visualisation