Systems. Research. Engineering.
A collection of projects I have built across systems, research, and engineering.
2025
- Adaptive LoRA caching for LLM serving · report (PEFT · LoRA · S-LoRA · Chameleon · LLM inference · Caching)
Adding a GPU-side cache to maximise efficiency in multi-tenant LLM serving scenarios.
- MatchOpt (Kafka · Flink · Distributed Systems · Java)
The first open-source real-time matchmaking middleware for EOMM.
- LLMSecConfig Audit (LLMs · Security · Kubernetes)
Security audit of LLM-based Kubernetes misconfiguration remediation.
- DiTTo (PyTorch · Fault-tolerance · LLM inference)
Improving the latency of the Hugging Face Search-and-Learn framework for test-time compute scaling.
- Reimplementing VAEs from scratch · report
Analysis of the original VAE paper by Kingma and Welling, including an examination of its latent space representations.
Previous (non-exhaustive)
- Image Splicing Detection (Image Processing)
Exposing image splicing by detecting inconsistencies in local noise variances.
- RedPlag (Bag of words · TF-IDF · Django · Latent Semantic Analysis)
A web-based plagiarism checker with user authentication.