2026 · Solo project · 3 min read

Customer Growth & Pricing Intelligence Platform

End-to-end synthetic fintech data product — dbt metrics, CUPED A/B experimentation, activation model, geo-lift referral analysis, pricing intelligence, FastAPI service, and a full GCP deployment path with BigQuery, Cloud Run, and Cloud Monitoring.

  • Python
  • dbt
  • DuckDB
  • BigQuery
  • Cloud Run
  • FastAPI
  • Streamlit
  • Marimo
  • scikit-learn
  • synthetic control
  • GitHub Actions
  • GCP

What I built

A synthetic fintech data product that connects customer analytics, experimentation, activation modelling, pricing intelligence, and cloud deployment into one end-to-end workflow. It is built to answer the questions a product or growth team would actually ask:

  • Which customers activate, retain, and become higher-value users?
  • Which onboarding treatment should be shipped, monitored, or iterated?
  • Are referral incentives incrementally acquiring customers, or just subsidising signups that would have happened anyway?
  • Which pricing or incentive offers are commercially attractive after guardrails?
  • Can scoring, monitoring, and API surfaces be operated as a small data product rather than a one-off notebook?

All data is synthetic. No customer data, internal bank data, or proprietary business metrics are used.

Analytics and modelling

dbt metrics layer

A trusted metrics layer in dbt covers D7 activation, W4 retention, feature adoption (Savings Pots, Salary Sorter, referrals), WAU, transaction frequency, CLV proxy, and experiment user metrics. Guardrail metrics track support contact rate, complaint load, fraud flags, crash rate, vulnerable-customer impact, and fair-value signals. 107 dbt tests pass on both local DuckDB and BigQuery.
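
To make the headline metric concrete, here is a minimal Python sketch of a D7 activation rate. It assumes "activated" means a first qualifying action within 7 days of signup, inclusive; that definition, the function name, and the sample users are illustrative assumptions, not the repo's dbt model.

```python
from datetime import date, timedelta


def d7_activation_rate(signups, first_actions, window_days=7):
    """Share of signed-up users whose first qualifying action falls within
    `window_days` of signup (inclusive). Both arguments map user_id -> date.
    The activation definition here is an assumption for illustration."""
    if not signups:
        return 0.0
    window = timedelta(days=window_days)
    activated = sum(
        1
        for user_id, signup_date in signups.items()
        if user_id in first_actions
        and signup_date <= first_actions[user_id] <= signup_date + window
    )
    return activated / len(signups)


# Hypothetical users: u1 and u4 act inside the window, u2 acts too late,
# u3 never acts.
signups = {
    "u1": date(2026, 1, 1),
    "u2": date(2026, 1, 1),
    "u3": date(2026, 1, 2),
    "u4": date(2026, 1, 3),
}
first_actions = {
    "u1": date(2026, 1, 3),
    "u2": date(2026, 1, 20),
    "u4": date(2026, 1, 10),
}
rate = d7_activation_rate(signups, first_actions)
```

Keeping the denominator as all signups, including users who never act, is what stops the metric from flattering itself.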

A/B experimentation (personalised onboarding)

Full analysis: sample-ratio mismatch (SRM) check, CUPED variance reduction implemented from first principles, guardrail metrics, power analysis, heterogeneous treatment effects, and a PM-ready launch recommendation memo.
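
The SRM check and the CUPED adjustment can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's code: a two-arm test with a planned 50/50 split, a single pre-period covariate, and simulated data. All function names are hypothetical.

```python
import math
import random


def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) on arm counts."""
    total = n_control + n_treatment
    exp_t = total * expected_treatment_share
    exp_c = total - exp_t
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # With 1 degree of freedom the chi-square tail is erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def cuped_adjust(y, x):
    """Return y - theta * (x - mean(x)) with theta = cov(y, x) / var(x)."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov / variance(x)
    return [yi - theta * (xi - mx) for xi, yi in zip(x, y)]


# Simulated users: the in-experiment metric is strongly correlated with
# the pre-period covariate, which is the case where CUPED helps.
rng = random.Random(0)
pre = [rng.gauss(10, 2) for _ in range(2000)]
post = [p + rng.gauss(0, 1) for p in pre]
adjusted = cuped_adjust(post, pre)
```

The adjustment leaves the sample mean unchanged and shrinks the variance by roughly the squared correlation between the metric and the covariate. In a real two-arm analysis, theta would normally be estimated on the pooled sample and applied to both arms before comparing means.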

Regional referral incrementality (geo-lift)

Causal estimation of whether a referral incentive drove incremental signups, rather than signups that merely coincided with it. Methods: difference-in-differences with parallel-trends tests, synthetic control, spillover checks, placebo regressions, and embedded ground-truth recovery to validate the estimator against known lift.
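
Ground-truth recovery is easiest to see on the simplest estimator in that list. The sketch below simulates regional signups with a planted lift and checks that a 2x2 difference-in-differences recovers it. The region counts, noise level, and function names are assumptions for illustration; synthetic control and the parallel-trends tests are not shown.

```python
import random


def did_estimate(rows):
    """2x2 difference-in-differences on rows of (treated, post, outcome)."""
    cells = {}
    for treated, post, y in rows:
        cells.setdefault((treated, post), []).append(y)
    avg = {key: sum(vals) / len(vals) for key, vals in cells.items()}
    return (avg[(1, 1)] - avg[(1, 0)]) - (avg[(0, 1)] - avg[(0, 0)])


def simulate(true_lift, seed=0, regions_per_arm=3, weeks_per_period=8):
    """Weekly signups per region with a shared time trend and a planted lift."""
    rng = random.Random(seed)
    rows = []
    for treated in (0, 1):
        for _ in range(regions_per_arm):
            base = 100 + 20 * treated  # treated regions start at a different level
            for post in (0, 1):
                for _ in range(weeks_per_period):
                    y = (
                        base
                        + 5 * post  # common trend shared by both arms
                        + true_lift * treated * post
                        + rng.gauss(0, 1)
                    )
                    rows.append((treated, post, y))
    return rows


TRUE_LIFT = 10.0
estimate = did_estimate(simulate(TRUE_LIFT))
placebo = did_estimate(simulate(0.0, seed=1))  # no planted effect
```

The level difference between arms and the shared trend both cancel in the double difference, so the estimate should land near the planted lift and the placebo run near zero, up to noise.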

Activation decisioning model

A calibrated binary classifier for D7 activation, with a train/test split, calibration curve, SHAP explainability, segment-level fairness checks, and a model card documenting customer-outcome guardrails.
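
What "calibrated" means can be shown with a small reliability table: group predictions into probability bins and compare the mean predicted probability with the observed activation rate in each bin. The sketch below is a standard-library illustration with made-up scores. scikit-learn provides `sklearn.calibration.calibration_curve` and `sklearn.metrics.brier_score_loss` for the same purpose; whether the repo uses those exact helpers is an assumption.

```python
def calibration_curve(y_true, y_prob, n_bins=5):
    """Equal-width reliability bins.

    Returns (mean predicted probability, observed positive rate, count)
    for every non-empty bin, ordered from lowest to highest bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    curve = []
    for members in bins:
        if members:
            n = len(members)
            mean_pred = sum(p for _, p in members) / n
            observed = sum(y for y, _ in members) / n
            curve.append((mean_pred, observed, n))
    return curve


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Made-up, perfectly calibrated scores: 10% of the 0.1 group activates,
# 50% of the 0.5 group, and 90% of the 0.9 group.
y_prob = [0.1] * 10 + [0.5] * 10 + [0.9] * 10
y_true = [1] + [0] * 9 + [1] * 5 + [0] * 5 + [1] * 9 + [0]
curve = calibration_curve(y_true, y_prob)
brier = brier_score(y_true, y_prob)
```

A well-calibrated model sits on the diagonal, with predicted and observed values matching in every bin. Running the same table per customer segment is one simple way to approach the segment-level checks.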

Pricing intelligence

Offer economics workflow covering recommendation guardrails, scenario runs, and sensitivity analysis. Compares margin, expected conversion, support risk, and fairness guardrails before any rollout decision.
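
A guardrailed recommendation can be sketched as "filter, then rank": drop offers that breach a guardrail, then pick the highest expected value among the rest. The offers, thresholds, and field names below are hypothetical and only illustrate the shape of the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    name: str
    conversion: float      # expected conversion rate, between 0 and 1
    margin: float          # margin per converted customer
    incentive_cost: float  # incentive paid per converted customer
    support_risk: float    # expected support contacts per customer
    fairness_gap: float    # conversion gap between customer segments


def expected_value(offer):
    """Expected net margin per targeted customer."""
    return offer.conversion * (offer.margin - offer.incentive_cost)


def recommend(offers, max_support_risk=0.10, max_fairness_gap=0.05):
    """Best positive-value offer that passes both guardrails, else None."""
    eligible = [
        o
        for o in offers
        if o.support_risk <= max_support_risk
        and o.fairness_gap <= max_fairness_gap
        and expected_value(o) > 0
    ]
    return max(eligible, key=expected_value, default=None)


# Hypothetical offers: the cashback offer has the highest expected value
# but breaches the default support-risk guardrail.
offers = [
    Offer("cashback_10", 0.20, 40.0, 10.0, 0.15, 0.02),
    Offer("fee_waiver", 0.12, 40.0, 5.0, 0.06, 0.03),
    Offer("no_offer", 0.05, 40.0, 0.0, 0.04, 0.01),
]
best = recommend(offers)
relaxed = recommend(offers, max_support_risk=0.20)
```

Re-running `recommend` across a grid of thresholds is a simple form of sensitivity analysis: it shows which guardrail is actually binding and how much expected value it costs.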

Production-style deployment

The project goes beyond local development. The GCP path has been exercised and documented:

  • Cloud Storage raw landing path loaded into BigQuery
  • BigQuery raw tables verified against export manifest; dbt graph run on BigQuery with 107 passing tests
  • Cloud Run Jobs deployed for activation scoring and score monitoring
  • Cloud Run API (private) deployed and smoke-tested via authenticated /health
  • Cloud Scheduler configured for daily scoring and monitoring runs
  • Cloud Monitoring alert policies created for job failures and API errors
  • Budget alert and Cloud Storage lifecycle policy configured for cost control
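
An authenticated smoke test against a private Cloud Run service amounts to a GET on /health with an identity token in the `Authorization` header. The sketch below uses only the standard library. The service URL and token are placeholders; in practice a token can be obtained with `gcloud auth print-identity-token`.

```python
import urllib.request


def build_health_request(base_url, id_token):
    """Build a GET /health request carrying the identity token as a bearer token."""
    url = base_url.rstrip("/") + "/health"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {id_token}"},
        method="GET",
    )


def smoke_test(base_url, id_token, timeout=10):
    """Return True if /health answers 200.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    missing or invalid token surfaces as an exception, not as False."""
    request = build_health_request(base_url, id_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200


# Placeholder values; nothing is sent over the network here.
request = build_health_request("https://example-service.a.run.app/", "TOKEN")
```

Separating request construction from the network call keeps the header logic testable without touching a live service.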

Delivery surfaces

  • Public Streamlit dashboard (Streamlit Community Cloud) for product health, experiments, pricing, and monitoring
  • FastAPI service with activation, churn, upsell, offer recommendation, and pricing scenario contracts
  • Operations runbook covering release gates, drift checks, realised-label calibration, rollback triggers, and GCP triage
  • Docker and Docker Compose for local full-stack reproduction
  • CI running lint, tests, deterministic report regeneration, and monitoring snapshot workflow
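
One common way to implement a score drift check is the population stability index (PSI), which compares the binned distribution of current scores with a baseline. The source does not say which drift statistic the runbook uses, so PSI, the bin count, and the sample scores below are all assumptions for illustration.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population stability index between two lists of scores in [0, 1]."""

    def shares(scores):
        counts = [0] * n_bins
        for s in scores:
            counts[min(int(s * n_bins), n_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(scores), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Illustrative scores: a uniform baseline and a copy shifted upwards.
baseline = [i / 100 for i in range(100)]
shifted = [min(s + 0.3, 0.99) for s in baseline]

psi_same = psi(baseline, baseline)
psi_shifted = psi(baseline, shifted)
```

A frequently quoted rule of thumb treats PSI above roughly 0.25 as a significant shift worth investigating. Any such threshold is a convention to tune against realised labels, not a guarantee.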

Honest limits

This is a synthetic portfolio case study. A real regulated-financial-services deployment would require stronger API protection (API Gateway, IAP, JWT), formal data governance with row/column-level controls, keyless CI/CD via GitHub OIDC, a model registry for reproducible training and shadow deployments, and formal Consumer Duty, model-risk, and audit controls before any live customer decisioning. These gaps are documented in the repo, not left implicit.