Systems. Research. Engineering.

A collection of projects I have built across systems, research, and engineering.

2025

  • Adaptive LoRA caching for LLM serving
    · report
    PEFT · LoRA · S-LoRA · Chameleon · LLM inference · Caching

    Adding a GPU-side cache to maximise efficiency in multi-tenant LLM serving scenarios.

  • MatchOpt
    Kafka · Flink · Distributed Systems · Java

    The first open-source real-time matchmaking middleware for EOMM.

  • LLMSecConfig Audit
    LLMs · Security · Kubernetes

    Security audit of LLM-based Kubernetes misconfiguration remediation.

  • DiTTo
    PyTorch · Fault-tolerance · LLM inference

    Improving the latency of the Hugging Face Search-and-Learn framework for test-time compute scaling.

  • Reimplementing VAEs from scratch
    · report

    Analysis of the original VAE paper by Kingma and Welling, with a study of its latent space representation.

Previous (non-exhaustive)

  • Image Splicing Detection
    Image Processing

    Exposing image splicing by detecting inconsistencies in local noise variances.

  • RedPlag
    Bag of words · TF-IDF · Django · Latent Semantic Analysis

    A web-based plagiarism checker with user authentication.