Systems. Research. Engineering.

A collection of projects I have built across systems, research, and engineering.

Adaptive LoRA caching for LLM serving
· report
PEFT · LoRA · S-LoRA · Chameleon · LLM inference · Caching
Adding a GPU-side cache to maximise efficiency in multi-tenant LLM serving scenarios.
MatchOpt
code · report
Kafka · Flink · Distributed Systems · Java
The first open-source real-time matchmaking middleware for EOMM.
LLMSecConfig Audit
code · report
LLMs · Security · Kubernetes
Security audit of LLM-based Kubernetes misconfiguration remediation.
DiTTo
code · report
PyTorch · Fault-tolerance · LLM inference
Improving the latency of the Hugging Face Search-and-Learn framework for test-time compute scaling.
Reimplementing VAEs from scratch
· report
Analysis of the original VAE paper by Kingma and Welling, including its latent space representation.

Image Splicing Detection
code · report
Image Processing
Exposing image splicing by detecting inconsistencies in local noise variances.
RedPlag
code
Bag of words · TF-IDF · Django · Latent Semantic Analysis
A web-based plagiarism checker with user authentication.