Cheng-Yuan (Ross) King
Machine Learning Engineer · Data Scientist
MSc Artificial Intelligence at the University of Sheffield. Applied ML, NLP, and scalable AI systems. Currently focused on production-grade RAG systems, LLM evaluation, and distributed pipelines on HPC.
Selected work
Things I've built
A few projects that show how I approach distributed ML, LLM evaluation, and rigorous benchmarking. Full case studies on the projects page.
2026
Zero-shot event extraction with Qwen2.5-7B-Instruct on MAVEN and WikiEvents. Compares unconstrained vs constrained-label prompting across trigger detection, type prediction, and argument extraction. A100 GPU inference via Hugging Face.
- Python
- Qwen2.5-7B
- Hugging Face
- PyTorch
- A100
2026
Distributed data mining and ML pipelines on datasets of 1.9M to 20M records, run on the University of Sheffield Stanage HPC cluster. Covers web log mining, traffic prediction, HIGGS classification, and MovieLens recommendations.
- PySpark
- Python
- Slurm
- HPC
- Spark MLlib
2026
Retrieval-augmented question answering over a document corpus. Embedding store, retriever, generator, and an evaluation harness. Foundation for the security-flavoured RAG project.
- Python
- Transformers
- Vector search
- RAG
Toolbox
What I work with
Pragmatic stack: pick the right tool, ship, measure, iterate.
- Machine learning
  - PyTorch
  - scikit-learn
  - Model evaluation
  - Cross-validation
  - Hyperparameter tuning
- NLP & LLMs
  - Hugging Face Transformers
  - Qwen / Llama
  - RAG
  - Prompt engineering
  - Event extraction
  - Document QA
- Data & analysis
  - Python
  - pandas
  - NumPy
  - SQL
  - Jupyter
  - Statistical analysis
- Scalable & systems
  - PySpark
  - Slurm / HPC
  - GPU computing
  - CUDA
  - Docker
  - Git
- Languages
  - English
  - Mandarin (native)
  - Japanese (JLPT N1)