cv

Basics

Name: Anushri Suresh
Expertise: LLM inference optimization, scaling test-time compute

Work

  • 2025.05 - Present
    Summer Research Assistant
    Johns Hopkins University - Center for Language and Speech Processing
    Supervisors: Prof. Daniel Khashabi, Prof. Eric Nalisnick
    • Building a PyTorch- and vLLM-based inference engine for budget-aware decoding on mathematical reasoning benchmarks, keeping performance degradation within 5%
  • 2025.02 - Present
    Research Assistant
    Carnegie Mellon University - InfiniAI Lab
    Supervisor: Prof. Beidi Chen
    • Optimizing speculative decoding, Key-Value (KV) cache compression, and continuous batching within MagicDec, achieving up to 2.5× throughput for long-context LLMs (32K tokens, batch size 256) on A100 GPUs
  • 2024.09 - 2025.01
    Graduate Research Assistant
    Johns Hopkins University - ARCADE Laboratory
    Supervisor: Prof. Mathias Unberath
    • Built a language-promptable digital twin using a multi-modal foundation model (FluoroSAM) for real-time segmentation, 3D reconstruction, and automatic collimation, reducing radiation exposure by 60%
    • Engineered an LLM-driven interface to interpret complex verbal instructions for the Brainlab Loop-X robotic C-arm, allowing autonomous, hands-free surgical imaging with 84% end-to-end success in cadaveric trials
  • 2022.08 - 2024.07
    Senior Engineer - Machine Learning
    Bosch Global Software Technologies
    • Led a team of 6 to develop an AI-powered test case optimizer for the Automotive Electronics Division, reducing test runtime by 66% and increasing code coverage to 94%
    • Delivered scalable chatbot automation for the Bosch Automation Platform, integrating Named Entity Recognition and Elasticsearch to improve search relevance and boosting customer satisfaction by 31% across 5,000+ enterprise users
  • 2021.06 - 2021.10
    Summer Research Fellow
    University of Zürich - Artificial Intelligence and Machine Learning Group
    Supervisor: Prof. Manuel Günther
    • Improved face recognition on low-quality surveillance footage by designing a Super-Resolution GAN, increasing accuracy by 37% on downsampled datasets and by 5% on full datasets for reliable identification in real-world security scenarios

Education

Skills

Programming Languages: Python, C/C++, SQL, Bash
Developer Tools: Git, Docker, Kubernetes, Linux, SLURM, Airflow
ML & AI Frameworks: PyTorch, TensorFlow, Keras, vLLM, HuggingFace Transformers, CUDA, OpenCV

Projects

  • Selective KV Quantization
    Don't Drop It, Compress It - Reducing LLM KV cache memory usage
    • Led a team of 4 to design and implement Selective KV Quantization for LLM inference, preserving sink/window tokens in full precision while quantizing older cache entries to int8, achieving 2× memory savings with minimal impact on perplexity (5.56) and ROUGE-L (0.2073 → 0.1709)
  • Rich Teacher Features for Efficient Single-Image Haze Removal
    Lightweight haze removal pipeline via knowledge distillation
    • Devised a lightweight haze removal pipeline via heterogeneous knowledge distillation with a novel feature affinity module, yielding a 15% PSNR gain and 20× model compression

Awards

Languages

English: Fluent
Hindi: Native
Tamil: Native
Kannada: Native