Selected work

Projects

A few representative builds.

  • 2026

    Zero-shot event extraction with Qwen2.5-7B-Instruct on MAVEN and WikiEvents. Compares unconstrained vs constrained-label prompting across trigger detection, type prediction, and argument extraction. A100 GPU inference via Hugging Face.

    • Python
    • Qwen2.5-7B
    • Hugging Face
    • PyTorch
    • A100
    View source
  • 2026

    Distributed data mining and ML pipelines on datasets of 1.9M to 20M records, run on the University of Sheffield Stanage HPC cluster. Covers web log mining, traffic prediction, HIGGS classification, and MovieLens recommendations.

    • PySpark
    • Python
    • Slurm
    • HPC
    • Spark MLlib
    View source
  • 2026

    Retrieval-augmented question answering over a document corpus. Embedding store, retriever, generator, and an evaluation harness. Foundation for the security-flavoured RAG project.

    • Python
    • Transformers
    • Vector search
    • RAG
    View source
  • 2025

    Classical ML benchmark on FBANK speech features. Improved speed-classification accuracy from 79.2% to 86.6% via feature standardisation and kNN tuning. Compared kNN, Logistic Regression, Linear SVM, and Random Forest, with detailed error analysis.

    • Python
    • scikit-learn
    • NumPy
    • FBANK
    View source
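
The standardise-then-tune step in the FBANK benchmark can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is a synthetic stand-in generated with `make_classification`, and the `tune_knn` helper, the parameter grid, and the 5-fold split are all assumptions chosen for the example. It shows why scaling matters for kNN (distances are otherwise dominated by large-magnitude features) and how a grid search over `n_neighbors` and `weights` might be set up in scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def tune_knn(X, y, seed=0):
    """Hypothetical helper: standardise features, then grid-search kNN."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Scaling lives inside the pipeline so each CV fold is standardised
    # using statistics from its own training split only (no leakage).
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("knn", KNeighborsClassifier()),
    ])
    # Assumed grid; the values searched in the real project are not stated.
    param_grid = {
        "knn__n_neighbors": [1, 3, 5, 9, 15],
        "knn__weights": ["uniform", "distance"],
    }
    search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    # score() evaluates the refitted best pipeline on the held-out split.
    return search.best_params_, search.score(X_test, y_test)


# Synthetic stand-in for FBANK feature vectors (assumption, not real data).
X, y = make_classification(
    n_samples=600, n_features=40, n_informative=10, n_classes=3, random_state=0
)
# Give the columns very different scales so standardisation has an effect.
X = X * np.logspace(0, 3, X.shape[1])

best_params, test_accuracy = tune_knn(X, y)
print(best_params, round(test_accuracy, 3))
```

Keeping the scaler inside the `Pipeline` rather than fitting it once on the full dataset is a deliberate choice: it keeps the cross-validated accuracy honest, which matters when the headline result is a before/after accuracy comparison.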