Selected work
Projects
A few representative builds.
2026
Zero-shot event extraction with Qwen2.5-7B-Instruct on MAVEN and WikiEvents. Compares unconstrained vs constrained-label prompting across trigger detection, type prediction, and argument extraction. A100 GPU inference via Hugging Face.
- Python
- Qwen2.5-7B
- Hugging Face
- PyTorch
- A100
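To make the unconstrained vs constrained-label comparison concrete, here is a minimal sketch of how the two prompt variants might be built. The prompt wording, the `build_prompt` helper, the example sentence, and the label set are all assumptions for illustration; they are not the prompts or labels used in the project, and no model call is shown.

```python
from typing import Optional, Sequence


def build_prompt(sentence: str, labels: Optional[Sequence[str]] = None) -> str:
    """Build a zero-shot event-typing prompt (hypothetical wording).

    labels=None  -> unconstrained: the model may name any event type.
    labels given -> constrained: the prompt restricts answers to that list.
    """
    lines = [
        "Identify the event trigger in the sentence and give its event type.",
        f"Sentence: {sentence}",
    ]
    if labels:
        # Constrained variant: enumerate the allowed label set in the prompt.
        lines.append("Choose the event type from this list only: " + ", ".join(labels))
    else:
        # Unconstrained variant: no label inventory is shown to the model.
        lines.append("Name the event type in your own words.")
    lines.append("Answer as: trigger | type")
    return "\n".join(lines)


# Hypothetical example sentence and label set, for illustration only.
sentence = "The army attacked the village at dawn."
unconstrained = build_prompt(sentence)
constrained = build_prompt(sentence, labels=["Attack", "Arrest", "Transport"])
```

The two strings differ only in the label instruction, so any difference in model output can be attributed to the constraint rather than to other prompt changes.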
2026
Distributed data mining and ML pipelines on datasets of 1.9M to 20M records, run on the University of Sheffield's Stanage HPC cluster. Workloads: web log mining, traffic prediction, HIGGS classification, and MovieLens recommendations.
- PySpark
- Python
- Slurm
- HPC
- Spark MLlib
2026
Retrieval-augmented question answering over a document corpus. Embedding store, retriever, generator, and an evaluation harness. Foundation for the security-flavoured RAG project.
- Python
- Transformers
- Vector search
- RAG
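The embedding store and retriever stages can be sketched in a few lines. This is a toy illustration under stated assumptions: it swaps in bag-of-words counts for learned embeddings, leaves out the generator and evaluation harness, and the `VectorStore` class and sample documents are hypothetical rather than taken from the project.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase bag-of-words counts (stand-in for a learned encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """Hypothetical in-memory store: keeps each document next to its vector."""

    def __init__(self):
        self.items = []

    def add(self, doc: str) -> None:
        self.items.append((doc, embed(doc)))

    def search(self, query: str, k: int = 1) -> list:
        # Rank every stored document by similarity to the query; keep the top k.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


store = VectorStore()
for doc in [
    "Slurm schedules batch jobs on a cluster.",
    "kNN classifies a point by its nearest neighbours.",
    "Spark MLlib provides distributed machine learning.",
]:
    store.add(doc)

top = store.search("how are batch jobs scheduled on a cluster", k=1)
```

In a full pipeline the retrieved passages in `top` would be placed into the generator's prompt as context for answering the question.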
2025
Classical ML benchmark on FBANK speech features. Improved speed-classification accuracy from 79.2% to 86.6% via feature standardisation and kNN tuning. Compared kNN, Logistic Regression, Linear SVM, and Random Forest, with detailed error analysis.
- Python
- scikit-learn
- NumPy
- FBANK
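Why standardisation can matter for kNN: Euclidean distance is dominated by whichever feature has the largest numeric range. The sketch below shows this on made-up two-feature data. It is written in plain Python rather than scikit-learn so it runs without dependencies; the data, labels, and helper names are assumptions for illustration and are unrelated to the FBANK features or the reported accuracy figures.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def fit_scaler(rows):
    """Per-feature mean and population standard deviation from the training rows."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [pstdev(c) or 1.0 for c in cols]


def transform(row, means, stds):
    """Standardise one row: subtract the mean and divide by the standard deviation."""
    return tuple((x - m) / s for x, m, s in zip(row, means, stds))


def knn_predict(train_X, train_y, query, k=1):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Made-up data: feature 0 separates the classes but has a small range;
# feature 1 is uninformative but has a large range.
X = [(0.0, 100.0), (0.1, 300.0), (1.0, 120.0), (0.9, 280.0)]
y = ["a", "a", "b", "b"]
query = (0.05, 125.0)

# Without scaling, distance is driven almost entirely by feature 1.
raw_pred = knn_predict(X, y, query, k=1)

# With scaling, both features contribute on a comparable footing.
means, stds = fit_scaler(X)
X_scaled = [transform(row, means, stds) for row in X]
scaled_pred = knn_predict(X_scaled, y, transform(query, means, stds), k=1)
```

On this toy data the unscaled prediction follows the large-range feature and returns "b", while the standardised prediction returns "a". Tuning then amounts to repeating the scaled prediction over a range of `k` values on held-out data and keeping the best one.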