Data-Driven Site Prioritisation for Ecological Intervention
This project identifies which brownfield sites in Greater Manchester pose the highest environmental risk and are best suited to nature-based restoration interventions such as mycoforestry (using fungi to remediate contaminated soil).
Using satellite imagery, soil data, and terrain analysis, the model scores 1,585 registered brownfield sites based on their likelihood of spreading contamination to watercourses and groundwater.
Key Question: If you had limited resources to restore brownfield sites, which ones should you tackle first?
Answer: Moderately sized (0.1–10 hectares), flat sites close to rivers, particularly in the Salford M5 area.
The highest-risk sites cluster in Salford M5, near the River Irwell. Three of the top 10 priority sites are located here, reflecting:
The analysis combines three environmental factors:
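One way to combine several factors into a single site score is a weighted sum of normalised sub-scores. The sketch below is illustrative only. The field names (dist_to_water_m, soil_permeability, slope_deg), the weights, and the normalisation limits are all assumptions made for the example, not the project's actual inputs or weighting. It assumes only that proximity to water and flat terrain raise the score, as described elsewhere in this document.

```python
from dataclasses import dataclass


@dataclass
class Site:
    site_id: str
    dist_to_water_m: float    # distance to nearest watercourse (hypothetical field)
    soil_permeability: float  # 0 = impermeable, 1 = highly permeable (hypothetical field)
    slope_deg: float          # mean terrain slope in degrees (hypothetical field)


# Illustrative weights only; the project's real weighting is not stated here.
WEIGHTS = {"water": 0.5, "soil": 0.3, "terrain": 0.2}


def clamp01(x: float) -> float:
    """Clip a value into the range 0..1."""
    return max(0.0, min(1.0, x))


def risk_score(site: Site, max_dist_m: float = 1000.0, max_slope_deg: float = 15.0) -> float:
    """Weighted sum of three 0-1 sub-scores; higher means higher assumed risk."""
    water = 1.0 - clamp01(site.dist_to_water_m / max_dist_m)  # closer to water -> higher
    soil = clamp01(site.soil_permeability)                     # more permeable -> higher
    terrain = 1.0 - clamp01(site.slope_deg / max_slope_deg)    # flatter -> higher
    return (WEIGHTS["water"] * water
            + WEIGHTS["soil"] * soil
            + WEIGHTS["terrain"] * terrain)


sites = [
    Site("A", dist_to_water_m=50.0, soil_permeability=0.8, slope_deg=1.0),
    Site("B", dist_to_water_m=900.0, soil_permeability=0.2, slope_deg=12.0),
]
ranked = sorted(sites, key=risk_score, reverse=True)
print([s.site_id for s in ranked])
```

Because the weights sum to 1 and every sub-score is clipped to 0..1, the combined score also stays within 0..1, which makes sites directly comparable.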
A predictive model was trained to identify which site characteristics best predict restoration suitability. The model found:
Site size is the strongest predictor (75% of model importance)
Very small sites (< 0.1 hectares) are impractical to restore. Very large sites (> 10 hectares) require phased interventions beyond the scope of typical mycoforestry projects. The "sweet spot" is 0.1–10 hectares.
Terrain flatness contributes 13%. Flat sites are more likely to be former industrial land, which makes them both riskier and more suitable for intervention.
Water and soil factors have minimal additional predictive power once size and terrain are accounted for.
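The size thresholds above can be expressed as a small helper. This is a sketch, not the project's code: the function name size_band and the band labels are hypothetical, and only the 0.1 and 10 hectare cut-offs come from the text.

```python
MIN_AREA_HA = 0.1   # below this, sites are described as impractical to restore
MAX_AREA_HA = 10.0  # above this, sites are described as needing phased interventions


def size_band(area_ha: float) -> str:
    """Classify a site by area using the 0.1-10 hectare band from the text."""
    if area_ha < MIN_AREA_HA:
        return "too_small"
    if area_ha > MAX_AREA_HA:
        return "too_large"
    return "sweet_spot"


# Example areas in hectares (made-up values for illustration).
for area in (0.05, 0.1, 2.5, 10.0, 40.0):
    print(area, size_band(area))
```

The boundaries themselves (exactly 0.1 and exactly 10 hectares) are treated as inside the band here, since the text only rules out sites strictly below or above them.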
This analysis identifies contamination risk, not confirmed contamination.
Sites scored as "high-risk" should undergo soil testing before restoration work begins. The model flags sites for further investigation — it does not replace on-the-ground assessment.
The machine learning model is trained on synthetic labels (rule-based assumptions about restoration suitability) rather than real restoration outcomes, so its feature importances reflect those assumptions rather than observed results. With access to historical records tracking which sites were successfully restored, the model could be retrained to predict actual success rates.
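To illustrate why this matters, the sketch below builds synthetic labels from hand-written rules and then measures permutation importance (the accuracy lost when one feature is shuffled). Everything here is an assumption for illustration: the site values are random stand-ins, the 5 degree flatness cut-off and the field names are hypothetical, and permutation importance may not be the measure behind the percentages reported above. The "model" is simply the labelling rule itself, standing in for a model that has learned the rule perfectly.

```python
import random


def rule_label(site: dict) -> int:
    """Synthetic 'suitable' label from hand-written rules (assumed thresholds)."""
    in_size_band = 0.1 <= site["area_ha"] <= 10.0  # band taken from the text
    is_flat = site["slope_deg"] < 5.0              # hypothetical flatness cut-off
    return int(in_size_band and is_flat)


def make_sites(n: int, rng: random.Random) -> list:
    """Random stand-in sites; not real Greater Manchester data."""
    return [
        {
            "area_ha": 10 ** rng.uniform(-2, 2),   # 0.01 to 100 ha, log-uniform
            "slope_deg": rng.uniform(0, 15),
            "dist_to_water_m": rng.uniform(0, 2000),
        }
        for _ in range(n)
    ]


def permutation_importance(sites, labels, predict, feature, rng) -> float:
    """Accuracy lost when one feature column is shuffled across sites."""
    def accuracy(rows):
        return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

    shuffled_values = [s[feature] for s in sites]
    rng.shuffle(shuffled_values)
    shuffled = [{**s, feature: v} for s, v in zip(sites, shuffled_values)]
    return accuracy(sites) - accuracy(shuffled)


rng = random.Random(0)
sites = make_sites(500, rng)
labels = [rule_label(s) for s in sites]

# Stand-in for a fitted model: here it simply reproduces the labelling rule.
importance = {
    f: permutation_importance(sites, labels, rule_label, f, rng)
    for f in ("area_ha", "slope_deg", "dist_to_water_m")
}
print(importance)
```

In this setup, distance to water gets an importance of exactly zero because the labelling rule never looks at it. A feature can therefore appear unimportant simply because the rules that generated the labels ignored it, which is why retraining on real restoration outcomes could change the ranking.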
Economic feasibility is not modelled. A high-risk site might be impractical to restore due to land ownership issues, access constraints, or prohibitive remediation costs.
To extend this work into operational decision-making tools, the following would be valuable:
The full dataset, interactive map, and analysis code are available:
Open to collaboration on environmental data science projects, and actively seeking opportunities in geospatial analysis and ecological restoration.