AI & machine learning

Our AI Focus delivers science-driven, practical solutions through three pillars:

ERP-GPT: Data + Knowledge for Users

A domain-grounded AI assistant that connects our soil data and expertise directly to farmers, land managers, and soil scientists, turning complex insights into clear, actionable guidance.

Physics-Informed, Multi-Modal ML

Models that combine wave physics with sensor, spatial, and field data for accuracy, robustness, and interpretability.

ML-Ready Global Soil Data

Clean, harmonised global soil datasets built for reliable, scalable machine learning and retrieval-augmented generation.

Together, these pillars create a unified, user-centred approach to next-generation soil intelligence.

ERP TEAM

Prof Tarje Nissen-Meyer

Dr Kuangdai Leng

Dr Matteo Bagagli

Dr Joe Collins

Deliverables

LUCAS-MEGA

The repository provides integrated sample–feature representations in both tabular and dictionary formats, along with metadata and associated asset files. It also includes intermediate outputs from the data standardization process, as well as the scripts and schema definitions used to construct LUCAS-MEGA.
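To illustrate what "tabular and dictionary formats" can mean for the same samples, here is a minimal sketch. The column names and values are hypothetical and are not taken from the LUCAS-MEGA schema; the helper functions are illustrative, not part of the repository.

```python
# Hypothetical sample records. The field names and values are illustrative
# only and are NOT taken from the LUCAS-MEGA schema.
table_columns = ["sample_id", "ph", "organic_carbon_g_per_kg", "texture_class"]
table_rows = [
    ["S001", 6.4, 18.2, "loam"],
    ["S002", 7.1, None, "clay"],  # None marks a missing measurement
]


def rows_to_dicts(columns, rows):
    """Convert tabular rows into one dictionary per sample, omitting missing values."""
    return [
        {col: val for col, val in zip(columns, row) if val is not None}
        for row in rows
    ]


def dicts_to_rows(columns, samples):
    """Convert per-sample dictionaries back into rows, filling gaps with None."""
    return [[sample.get(col) for col in columns] for sample in samples]


samples = rows_to_dicts(table_columns, table_rows)
round_trip = dicts_to_rows(table_columns, samples)
```

The tabular form keeps every sample on a fixed set of columns, while the dictionary form stores only the features that were actually measured for each sample.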

SoilFormer

This repository contains the architecture, pretrained weights, and training scripts for SoilFormer. It enables reproduction of the representation learning experiments and supports further development on LUCAS-MEGA.

ERP-GPT-EU

This repository provides APIs, prompt templates, and related resources for ERP-GPT-EU, supporting tool-augmented interaction with the dataset through natural language.

research@earthroverprogram.org

Global Soil Data

We develop a multi-agent system that automatically processes and fuses the vast range of soil datasets available worldwide, covering Europe, Africa, North America, South America, and beyond. This system brings together multidisciplinary information—including physical, chemical, and biological properties; nutrients; soil functions; and contamination or threat indicators—to create a comprehensive global resource.

The result is an ML-ready corpus with a unified, sample-based structure, accessible through a consistent API. All data undergoes strict human sanity checks and is annotated with detailed natural-language descriptions, supporting both technical reliability and clarity for downstream applications.

[Figure: Global soil data visualization showing soil health and composition worldwide]

Physics-informed, Multi-modal ML

We unify soil science and seismology through a generalised Masked Language Modelling (MLM) framework applied to multi-modal tokens, enabling models to learn jointly from soil measurements, waveforms, spatial data, and physics-based constraints. This approach integrates data-driven learning with soil and wave physics while capturing causal relationships—such as how soil texture influences organic carbon content—allowing the model to reason about how key parameters affect one another across diverse soil and geophysical settings.
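The masking step behind this kind of training can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the programme's implementation: the token layout `(modality, name, value)`, the function `mask_tokens`, and the example values are all hypothetical, and no model or physics constraint is included.

```python
import random

MASK = "[MASK]"  # placeholder written in place of a hidden value


def mask_tokens(tokens, mask_ratio=0.3, seed=0):
    """Hide a random subset of token values.

    tokens: list of (modality, name, value) tuples (hypothetical layout).
    Returns (masked_tokens, targets), where targets maps each masked
    position to the hidden value a model would be trained to reconstruct.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(mask_ratio * len(tokens)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        modality, name, value = tokens[pos]
        masked[pos] = (modality, name, MASK)
        targets[pos] = value
    return masked, targets


# Illustrative tokens from three modalities; the values are made up.
tokens = [
    ("soil", "clay_fraction", 0.32),
    ("soil", "organic_carbon", 18.2),
    ("waveform", "p_wave_velocity", 410.0),
    ("spatial", "latitude", 51.75),
    ("spatial", "longitude", -1.26),
]
masked, targets = mask_tokens(tokens, mask_ratio=0.4, seed=42)
```

Because any token can be hidden, a model trained this way has to predict, for example, a soil property from waveform and spatial tokens, which is one way to learn relationships across modalities.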

[Figure: Earth Rover Program GPT diagram showing workflow for soil data collection]

ERP-GPT

ERP-GPT blends expert soil knowledge with global data to deliver grounded, context-aware guidance. Vector databases and a knowledge graph provide trusted domain insights, while global dataset APIs supply real-world soil and land information in-context. Through retrieval-augmented generation and expert-in-the-loop refinement, ERP-GPT becomes a reliable assistant for farmers, land managers, and soil scientists.
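The retrieval-augmented pattern described here can be sketched as "retrieve relevant passages, then place them in the prompt". The example below is a toy under stated assumptions: it swaps the vector database and knowledge graph for a simple word-count cosine similarity, the passages are illustrative, and every function name is hypothetical rather than part of ERP-GPT.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase word counts; a stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble a grounded prompt that a language model could then answer."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context above."


# Illustrative passages, not taken from any ERP knowledge base.
documents = [
    "Soil texture describes the proportions of sand, silt and clay.",
    "Seismic waves travel at speeds that depend on the material they pass through.",
    "Organic carbon in soil is stored in organic matter.",
]
query = "What does soil texture describe?"
top = retrieve(query, documents, k=1)
prompt = build_prompt(query, top)
```

Keeping retrieval separate from prompt assembly means the similarity function can later be replaced by a vector database or knowledge-graph lookup without changing how the prompt is built.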

SoilGPT-EU

[Figure: Diagram illustrating the ERP-GPT system for soil monitoring and analysis]

Field surveys

Seismic analysis & modelling

Soil science

Resistivity/GPR analysis

Sensor development

AI & machine learning

Applications
