[ 08 ] · 2026 · Solo project · 3 min read

MoveIn

Explainable where-to-live decision tool for England & Wales — nine official open-data signals rolled to MSOA grain (7,264 areas), served by a tested dbt + DuckDB engine through a public API and a website with a page for every area.

  • dbt
  • DuckDB
  • FastAPI
  • Fly.io
  • Next.js
  • Vercel
  • GitHub Actions
  • Python
9 open-data signals
7,264 MSOAs scored
189 data tests

What I built

MoveIn is an explainable decision-support tool for renters and movers. Given a household's income, budget, commute target, and risk tolerance, it ranks neighbourhoods at MSOA grain (Middle layer Super Output Areas, the ONS statistical geography) and presents each ranking as a set of transparent trade-offs. It is not a price predictor or a glossy listings site; it is an honest decision layer over fragmented official UK data. It evolved from a Land Registry analytics-engineering project into a shipped product with a public API and a website.

The nine signals

Nine real national open-data signals per MSOA, across all 7,264 England & Wales areas:

  • Sale-market context — 4.99M Land Registry transactions (2021–2025)
  • Geography — ONSPD postcode→MSOA bridge (2.73M postcodes, 99.999% coverage)
  • Rent & affordability — ONS PIPR rent vs an income scenario
  • EPC energy — 23.5M England & Wales certificates → per-area median band
  • Crime — 17.1M police street-crimes as a rate (an indicator, never a "safe/unsafe" label)
  • Flood & planning — point-in-polygon over 2.7M postcode centroids
  • Amenity access — 437k OpenStreetMap amenities (supermarket, school, GP, park, station)
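
As an illustration of the flood and planning signal, the point-in-polygon step can be sketched in plain Python with a ray-casting test, followed by a roll-up to the share of postcodes per MSOA that fall inside any zone. This is a minimal sketch, not the project's pipeline: the function names, the tuple layout, and the use of hand-rolled geometry instead of a spatial library are all assumptions made for illustration.

```python
from collections import defaultdict


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside polygon, a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def flood_share_by_msoa(centroids, flood_polygons):
    """Share of postcode centroids per MSOA that fall inside any polygon.

    centroids: iterable of (postcode, msoa_code, x, y) tuples (hypothetical layout).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for _postcode, msoa, x, y in centroids:
        totals[msoa] += 1
        if any(point_in_polygon(x, y, poly) for poly in flood_polygons):
            hits[msoa] += 1
    return {msoa: hits[msoa] / totals[msoa] for msoa in totals}
```

At 2.7M centroids a real build would more plausibly lean on a spatial index or a geospatial extension than on a nested Python loop; the sketch only shows the shape of the calculation.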

Each external source is fixture-default for fast, reproducible CI, with a real-data toggle for production builds.
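
One way such a fixture-default toggle might look is sketched below. The environment variable MOVEIN_USE_REAL_DATA, the directory names, and the resolve_source helper are hypothetical and do not come from the project; the point is only that CI gets small fixtures unless a build explicitly opts in to real data.

```python
import os
from pathlib import Path

# Hypothetical locations; the project's real layout is not stated in the write-up.
FIXTURE_ROOT = Path("tests/fixtures")
REAL_ROOT = Path("data/real")


def resolve_source(name, env=None):
    """Return the file path for a source, defaulting to the fixture copy.

    Real data is used only when the (hypothetical) MOVEIN_USE_REAL_DATA
    variable is set to "1".
    """
    env = os.environ if env is None else env
    use_real = env.get("MOVEIN_USE_REAL_DATA", "0") == "1"
    root = REAL_ROOT if use_real else FIXTURE_ROOT
    return root / f"{name}.csv"
```

Defaulting to fixtures means a forgotten setting produces a fast, small build instead of an accidental multi-gigabyte download.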

Explainable scoring

A per-MSOA engine (rpt_neighbourhood_score) computes five component scores (affordability, safety, energy, flood, and convenience), each a 0–100 percentile, plus a weighted overall score. The overall score re-ranks live when the user shifts the weights. Each area also carries a confidence level derived from data coverage and a plain-language "why this area" summary.
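
The scoring idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the rpt_neighbourhood_score model itself: the function names are hypothetical, the overall score is taken to be a weighted mean of whichever components are present, and confidence is taken to be the share of total weight backed by data. The real model may define all three differently.

```python
from bisect import bisect_left


def percentile_scores(raw):
    """Map {area: raw value} to 0-100 percentile ranks (higher raw value, higher score)."""
    ordered = sorted(raw.values())
    denom = max(len(ordered) - 1, 1)
    # bisect_left counts how many areas have a strictly lower raw value.
    return {area: 100.0 * bisect_left(ordered, v) / denom for area, v in raw.items()}


def overall_score(components, weights):
    """Return (weighted mean of available components, confidence in 0..1).

    components: {name: 0-100 score, or None when the data is missing}.
    Confidence here is the share of total weight covered by non-missing data.
    """
    total_w = sum(weights.values())
    present = {
        k: v for k, v in components.items()
        if v is not None and weights.get(k, 0) > 0
    }
    covered_w = sum(weights[k] for k in present)
    if covered_w == 0:
        return None, 0.0
    score = sum(weights[k] * v for k, v in present.items()) / covered_w
    return score, covered_w / total_w


def rank_areas(areas, weights):
    """Order area codes by overall score, best first; unscored areas are dropped."""
    scored = {area: overall_score(comp, weights)[0] for area, comp in areas.items()}
    ranked = [a for a, s in scored.items() if s is not None]
    return sorted(ranked, key=lambda a: scored[a], reverse=True)
```

Because the component scores are precomputed, shifting the weights only repeats the cheap weighted sum and sort, which is what makes live re-ranking feasible under these assumptions.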

The product

  • A Next.js website on Vercel with a page for every scored area (~7k programmatic pages): set income / budget / priorities → a live-reranked shortlist, a per-area trade-off receipt, side-by-side compare, and sources & caveats on every recommendation.
  • A public FastAPI service on Fly.io (resolve / search / listing-check / meta over the decision marts) with OpenAPI docs at /docs — the same engine the website reads, open for anyone to query.

The spine

A dbt-core + DuckDB warehouse (the whole five-year warehouse fits in ~200 MB), 189 data tests plus a source-freshness check, and branch-protected GitHub Actions CI that runs unit tests, dbt build, and sqlfluff lint on every PR before anything reaches the API or website.

Honest limits

Indicators, never verdicts — crime and similar signals are surfaced as rates with caveats, not a "safe/unsafe" judgement. Door-to-door commute time is the one remaining signal on the roadmap (station proximity is already covered). The data refreshes on a cadence, not in real time, so a brand-new listing or a very recent planning decision may lag the official source.