cv
Basics
Name | Anushri Suresh |
Expertise | LLM inference optimization, scaling test-time compute |
Work
-
2025.05 - Present Summer Research Assistant
Johns Hopkins University - Center for Language and Speech Processing
Supervisors: Prof. Daniel Khashabi, Prof. Eric Nalisnick
- Building a PyTorch- and vLLM-based inference engine for budget-aware decoding on mathematical reasoning benchmarks, keeping performance degradation within 5%
-
2025.02 - Present Research Assistant
Carnegie Mellon University - InfiniAI Lab
Supervisor: Prof. Beidi Chen
- Optimizing speculative decoding, Key-Value (KV) cache compression, and continuous batching within MagicDec, achieving up to 2.5× throughput for long-context LLMs (32K tokens, batch size 256) on A100 GPUs
-
2024.09 - 2025.01 Graduate Research Assistant
Johns Hopkins University - ARCADE Laboratory
Supervisor: Prof. Mathias Unberath
- Built a language-promptable digital twin using a multi-modal foundation model (FluoroSAM) for real-time segmentation, 3D reconstruction, and automatic collimation, reducing radiation exposure by 60%
- Engineered an LLM-driven interface to interpret complex verbal instructions for the Brainlab Loop-X robotic C-arm, enabling autonomous, hands-free surgical imaging with 84% end-to-end success in cadaveric trials
-
2022.08 - 2024.07 Senior Engineer - Machine Learning
Bosch Global Software Technologies
- Led a team of 6 to develop an AI-powered test case optimizer for the Automotive Electronics Division, reducing test runtime by 66% and increasing code coverage to 94%
- Delivered scalable chatbot automation for the Bosch Automation Platform, integrating Named Entity Recognition and Elasticsearch to improve search relevance, boosting customer satisfaction by 31% across 5,000+ enterprise users
-
2021.06 - 2021.10 Summer Research Fellow
University of Zürich - Artificial Intelligence and Machine Learning Group
Supervisor: Prof. Manuel Günther
- Designed a Super-Resolution GAN to improve face recognition on low-quality surveillance footage, increasing accuracy by 37% on downsampled and 5% on full datasets for reliable identification in real-world security scenarios
Education
-
2024 - 2026 -
2018 - 2022 Bachelor of Technology
National Institute of Technology, Tiruchirappalli
Electronics and Communication Engineering
Skills
Programming Languages | Python, C/C++, SQL, Bash |
Developer Tools | Git, Docker, Kubernetes, Linux, SLURM, Airflow |
ML & AI Frameworks | PyTorch, TensorFlow, Keras, vLLM, HuggingFace Transformers, CUDA, OpenCV |
Projects
-
Selective KV Quantization
Don't Drop It, Compress It - Reducing LLM KV cache memory usage
- Led a team of 4 to design and implement Selective KV Quantization for LLM inference, preserving sink/window tokens in full precision while quantizing older cache entries to int8, achieving 2× memory savings with minimal impact on perplexity (5.56) and ROUGE-L (0.2073 → 0.1709)
-
Rich Teacher Features for Efficient Single-Image Haze Removal
Lightweight haze removal pipeline via knowledge distillation
- Devised a lightweight haze removal pipeline via heterogeneous knowledge distillation with a novel feature affinity module, yielding a 15% PSNR gain and 20× model compression
Publications
Awards
- 2025
The Siemens Healthineers Best Paper Award
IPCAI 2025
- 2025
Best Project Award
NLP: Self-Supervised (Spring 2025), Johns Hopkins University
Languages
English | Fluent |
Hindi | Native |
Tamil | Native |
Kannada | Native |