MoveIn
Explainable where-to-live decision tool for England & Wales — nine official open-data signals rolled to MSOA grain (7,264 areas), served by a tested dbt + DuckDB engine through a public API and a website with a page for every area.
- dbt
- DuckDB
- FastAPI
- Fly.io
- Next.js
- Vercel
- GitHub Actions
- Python
What I built
MoveIn is an explainable decision-support tool for renters and movers. Given a household's income, budget, commute target, and risk tolerance, it ranks neighbourhoods at MSOA (Middle layer Super Output Area) grain and presents them as transparent trade-offs. It is not a price predictor and not a glossy listings site, but an honest decision layer over fragmented official UK data. It evolved from a Land Registry analytics-engineering project into a shipped product with a public API and a website.
The nine signals
Nine real national open-data signals per MSOA, across all 7,264 England & Wales areas:
- Sale-market context — 4.99M Land Registry transactions (2021–2025)
- Geography — ONSPD postcode→MSOA bridge (2.73M postcodes, 99.999% coverage)
- Rent & affordability — ONS PIPR rent vs an income scenario
- EPC energy — 23.5M England & Wales certificates → per-area median band
- Crime — 17.1M police street-crimes as a rate (an indicator, never a "safe/unsafe" label)
- Flood & planning — point-in-polygon over 2.7M postcode centroids
- Amenity access — 437k OpenStreetMap amenities (supermarket, school, GP, park, station)
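The point-in-polygon step behind the flood and planning signal can be illustrated with a plain-Python ray-casting test. This is a minimal sketch, not the project's code: the source does not say which library or SQL function performs the join, and the names `point_in_polygon` and `flag_centroids`, the postcode labels, and the square "zone" are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon ring.

    `polygon` is a list of (x, y) vertices; the ring is closed implicitly.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count crossings strictly to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside


def flag_centroids(centroids, zone):
    """Return {postcode: bool} marking centroids that fall inside `zone`."""
    return {pc: point_in_polygon(x, y, zone) for pc, (x, y) in centroids.items()}


# Hypothetical data: a square zone and two made-up postcode centroids.
zone = [(0, 0), (4, 0), (4, 4), (0, 4)]
centroids = {"PC_INSIDE": (2, 2), "PC_OUTSIDE": (5, 2)}
flags = flag_centroids(centroids, zone)
```

At national scale (2.7M centroids) a spatial index or a database spatial join would be the more realistic choice; the loop above only shows the geometric test itself.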
Each external source is fixture-default for fast, reproducible CI, with a real-data toggle for production builds.
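One way a fixture-default source with a real-data toggle could be wired is an environment variable that switches the directory a loader reads from. The sketch below is an assumption about the pattern, not the project's actual mechanism: the variable name `MOVEIN_USE_REAL_DATA`, the directory layout, and the `source_path` helper are hypothetical.

```python
import os
from pathlib import Path

# Hypothetical locations; the real project layout is not stated in the source.
FIXTURE_ROOT = Path("tests/fixtures")
REAL_ROOT = Path("data/real")


def source_path(name, env=None):
    """Resolve the file for a named source.

    Defaults to the small checked-in fixture so CI stays fast and
    reproducible; switches to real data only when the toggle is "1".
    """
    env = os.environ if env is None else env
    use_real = env.get("MOVEIN_USE_REAL_DATA", "0") == "1"
    root = REAL_ROOT if use_real else FIXTURE_ROOT
    return root / f"{name}.csv"
```

Passing `env` explicitly keeps the helper easy to test without mutating the process environment.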
Explainable scoring
A per-MSOA engine (rpt_neighbourhood_score) computes five component scores (affordability, safety, energy, flood, and convenience), each expressed as a 0–100 percentile, plus a weighted overall score that re-ranks live when the user shifts the weights. Each area also carries a confidence level derived from data coverage and a plain-language "why this area" summary.
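The percentile-and-weights idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the logic of rpt_neighbourhood_score: the mid-rank percentile formula, the weighted-mean combination, the function names, and the two example areas are all hypothetical, and "lower is better" signals such as a crime rate would need inverting before use.

```python
from bisect import bisect_left, bisect_right


def percentile_ranks(values):
    """Map each value to a 0-100 percentile rank (mid-rank for ties)."""
    ordered = sorted(values)
    n = len(ordered)
    ranks = []
    for v in values:
        below = bisect_left(ordered, v)          # values strictly smaller
        equal = bisect_right(ordered, v) - below  # values tied with v
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks


def overall_score(scores, weights):
    """Weighted mean of component scores; weights need not sum to 1."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total


def rank_areas(areas, weights):
    """Sort areas best-first by weighted overall score."""
    return sorted(
        areas,
        key=lambda area: overall_score(area["scores"], weights),
        reverse=True,
    )


# Two made-up areas with component scores already on the 0-100 scale.
areas = [
    {"area": "A", "scores": {"affordability": 80, "safety": 40, "energy": 60,
                             "flood": 90, "convenience": 50}},
    {"area": "B", "scores": {"affordability": 50, "safety": 85, "energy": 55,
                             "flood": 70, "convenience": 75}},
]
equal_weights = dict.fromkeys(areas[0]["scores"], 1)
budget_first = {**equal_weights, "affordability": 5}

balanced = [a["area"] for a in rank_areas(areas, equal_weights)]
cost_led = [a["area"] for a in rank_areas(areas, budget_first)]
```

With equal weights area B leads (67 against 64); raising the affordability weight to 5 puts area A first, which is the kind of re-ranking a weight change produces.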
The product
- A Next.js website on Vercel with a page for every scored area (~7k programmatic pages): set income / budget / priorities → a live-reranked shortlist, a per-area trade-off receipt, side-by-side compare, and sources & caveats on every recommendation.
- A public FastAPI service on Fly.io (resolve / search / listing-check / meta over the decision marts) with OpenAPI docs at /docs: the same engine the website reads, open for anyone to query.
The spine
A dbt-core + DuckDB warehouse (the whole five-year warehouse fits in ~200 MB), 189 data tests plus a source-freshness check, and branch-protected GitHub Actions CI that runs unit tests, dbt build, and sqlfluff lint on every PR before anything reaches the API or website.
Honest limits
Indicators, never verdicts — crime and similar signals are surfaced as rates with caveats, not a "safe/unsafe" judgement. Door-to-door commute time is the one remaining signal on the roadmap (station proximity is already covered). The data refreshes on a cadence, not in real time, so a brand-new listing or a very recent planning decision may lag the official source.