
Selected work

Projects

A few representative builds.

  • 2026

    Distributed data mining and ML pipelines on datasets of 1.9M to 20M records, run on the University of Sheffield Stanage HPC cluster. Workloads include web log mining, traffic prediction, HIGGS classification, and MovieLens recommendations.

    • PySpark
    • Python
    • Slurm
    • HPC
    • Spark MLlib
    View source