Selected work
Projects
A few representative builds. Filter by stack to narrow down.
- All
- A100
- FBANK
- HPC
- Hugging Face
- NumPy
- PySpark
- PyTorch
- Python
- Qwen2.5-7B
- RAG
- Slurm
- Spark MLlib
- Transformers
- Vector search
- scikit-learn
2026
Distributed data mining and ML pipelines on datasets of 1.9M to 20M records, run on the University of Sheffield's Stanage HPC cluster. Covers web log mining, traffic prediction, HIGGS classification, and MovieLens recommendations.
- PySpark
- Python
- Slurm
- HPC
- Spark MLlib