AI & machine learning
Our AI Focus delivers science-driven, practical solutions through three pillars:
ERP-GPT: Data + Knowledge for Users
A domain-grounded AI assistant that connects our soil data and expertise directly to farmers, land managers, and soil scientists, turning complex insights into clear, actionable guidance.
Physics-Informed, Multi-Modal ML
Models that combine wave physics with sensor, spatial, and field data for accuracy, robustness, and interpretability.
ML-Ready Global Soil Data
Clean, harmonised global soil datasets built for reliable, scalable machine learning and retrieval-augmented generation.
Together, these pillars create a unified, user-centred approach to next-generation soil intelligence.
Global Soil Data
We develop a multi-agent system that automatically processes and fuses the vast range of soil datasets available worldwide, covering Europe, Africa, North America, South America, and beyond. This system brings together multidisciplinary information—including physical, chemical, and biological properties; nutrients; soil functions; and contamination or threat indicators—to create a comprehensive global resource.
The result is an ML-ready corpus with a unified, sample-based structure, accessible through a consistent API. All data undergo strict human sanity checks and are enriched with detailed natural-language annotations, supporting both technical reliability and clarity for downstream applications.
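To make the idea of a unified, sample-based structure more concrete, here is a minimal Python sketch of how records from differently formatted sources might be mapped onto one shared record type. Everything in it is an assumption for illustration: the `SoilSample` fields, the `source_a` and `source_b` column names, and the `harmonise` and `describe` helpers are hypothetical and do not describe our actual schema or API. The only unit fact relied on is that 1 % by mass equals 10 g/kg.

```python
from dataclasses import dataclass, field


@dataclass
class SoilSample:
    """One unified, sample-based record (all field names are hypothetical)."""
    sample_id: str
    source: str
    latitude: float
    longitude: float
    depth_top_cm: float
    depth_bottom_cm: float
    properties: dict = field(default_factory=dict)  # shared name -> value
    annotation: str = ""


# Hypothetical per-source column maps: source column -> (shared name, factor).
# The factor converts the source unit to the shared unit (1 % = 10 g/kg).
COLUMN_MAPS = {
    "source_a": {"oc_g_kg": ("organic_carbon_g_per_kg", 1.0),
                 "ph_h2o": ("ph", 1.0)},
    "source_b": {"soc_pct": ("organic_carbon_g_per_kg", 10.0),
                 "pH": ("ph", 1.0)},
}


def describe(sample):
    """Build a short natural-language annotation from the record itself."""
    parts = [f"{name}={value:g}" for name, value in sorted(sample.properties.items())]
    return (f"Sample at ({sample.latitude:.2f}, {sample.longitude:.2f}), "
            f"{sample.depth_top_cm:g}-{sample.depth_bottom_cm:g} cm: "
            + ", ".join(parts))


def harmonise(source, row):
    """Map one raw row from a known source onto the shared SoilSample shape."""
    properties = {}
    for column, (name, factor) in COLUMN_MAPS[source].items():
        if row.get(column) is not None:
            properties[name] = float(row[column]) * factor
    sample = SoilSample(
        sample_id=f"{source}:{row['id']}",
        source=source,
        latitude=float(row["lat"]),
        longitude=float(row["lon"]),
        depth_top_cm=float(row["top"]),
        depth_bottom_cm=float(row["bottom"]),
        properties=properties,
    )
    sample.annotation = describe(sample)
    return sample
```

In this sketch a row from either hypothetical source ends up with the same property names and units, so downstream code only ever sees one shape. Human sanity checks would sit after `harmonise` and are not modelled here.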
Physics-Informed, Multi-Modal ML
We unify soil science and seismology through a generalised Masked Language Modelling (MLM) framework applied to multi-modal tokens, enabling models to learn jointly from soil measurements, waveforms, spatial data, and physics-based constraints. This approach integrates data-driven learning with soil and wave physics while capturing causal relationships—such as how soil texture influences organic carbon content—allowing the model to reason about how key parameters affect one another across diverse soil and geophysical settings.
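The masking step behind this kind of framework can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our model: the modality names, the token strings, the `tokenise` and `mask_tokens` helpers, and the default masking rate of 0.15 (a common choice in masked language modelling) are all hypothetical. The sketch covers only how training pairs are built; the network that predicts the hidden tokens and any physics-based loss terms are left out.

```python
import random

MASK = "[MASK]"


def tokenise(sample):
    """Flatten a multi-modal sample into (modality, token) pairs.

    `sample` maps a modality name to a list of already-discretised tokens.
    """
    tokens = []
    for modality, values in sample.items():
        for value in values:
            tokens.append((modality, value))
    return tokens


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Hide a random subset of tokens and return (inputs, targets).

    The modality tag stays visible, so a model would know what kind of
    token it is asked to reconstruct. Targets are None where nothing
    was masked.
    """
    rng = rng or random.Random()
    inputs, targets = [], []
    for modality, value in tokens:
        if rng.random() < mask_prob:
            inputs.append((modality, MASK))
            targets.append(value)
        else:
            inputs.append((modality, value))
            targets.append(None)
    return inputs, targets


# Hypothetical example: one sample with soil, waveform, and spatial tokens.
example = {
    "soil": ["clay_high", "soc_low"],
    "waveform": ["w17", "w03", "w88"],
    "spatial": ["cell_412"],
}
inputs, targets = mask_tokens(tokenise(example), rng=random.Random(0))
```

Because any token from any modality can be hidden, a model trained on such pairs has to predict, for example, a masked soil token from the surrounding waveform and spatial tokens, which is one way cross-modal relationships could be learned.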
ERP-GPT
ERP-GPT blends expert soil knowledge with global data to deliver grounded, context-aware guidance. Vector databases and a knowledge graph provide trusted domain insights, while global dataset APIs supply real-world soil and land information in context. Through retrieval-augmented generation and expert-in-the-loop refinement, ERP-GPT becomes a reliable assistant for farmers, land managers, and soil scientists.
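As a rough illustration of the retrieval-augmented pattern described for ERP-GPT, the Python sketch below assembles a grounded prompt from three sources: retrieved passages, knowledge-graph facts, and a local soil record. It is a toy under stated assumptions. A bag-of-words cosine similarity stands in for a real vector database, a list of triples stands in for the knowledge graph, and a plain dictionary stands in for a dataset API response. All function names are hypothetical, and no language model is called.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens with punctuation stripped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy bag-of-words vector; a real system would use a learned encoder."""
    return Counter(tokens(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]


def graph_facts(query, triples):
    """Return triples whose subject or object is mentioned in the query."""
    words = set(tokens(query))
    return [f"{s} {r} {o}" for s, r, o in triples
            if s.lower() in words or o.lower() in words]


def build_prompt(query, passages, triples, soil_record, k=2):
    """Assemble retrieved context, graph facts, and local data into a prompt."""
    context = retrieve(query, passages, k)
    facts = graph_facts(query, triples)
    data = ", ".join(f"{name}={value}" for name, value in sorted(soil_record.items()))
    return "\n".join([
        "Answer using only the context below.",
        "Passages: " + " | ".join(context),
        "Graph facts: " + ("; ".join(facts) or "none"),
        "Local soil data: " + (data or "none"),
        "Question: " + query,
    ])


# Placeholder inputs, for illustration only.
passages = [
    "Liming raises soil pH.",
    "Clay soils hold more water than sandy soils.",
    "Cover crops can reduce erosion.",
]
triples = [("lime", "raises", "pH"), ("clay", "retains", "water")]
prompt = build_prompt("how do I raise soil pH", passages, triples, {"ph": 5.4}, k=1)
```

Keeping the three context sources separate in the prompt makes it easier for an expert reviewer to see which piece of evidence an answer relied on, which is where an expert-in-the-loop step could plug in.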










