Applied ML @ UMD · LLM Eval + Retrieval · FastAPI / Cloud

Building ML + LLM systems that ship.

I’m an Applied ML engineer focused on retrieval, evaluation, and agentic workflows. I care about measurable quality (offline + human eval), safety, and reliability.

Euler AI: Founding Engineer · ML Systems
Hackathons: 2nd @ DeepMind ($5K) · Ironsite · Pear VC
LLM Evaluation: G-Eval · LLM-as-a-judge · guardrails

About

What I do
  • Build retrieval and LLM pipelines with evaluation loops (offline + human / judge).
  • Ship backend APIs (FastAPI), containerized deployments, and demos that run.
  • Focus on reliability: safety checks, regression tests, and measurable wins.
What I’m looking for
  • Applied ML / ML Systems internship roles.
  • Search, recommendations, retrieval, LLM tooling, evaluation.
  • Teams that care about metrics, iteration, and engineering quality.

Experience

Euler AI: Founding ML/Software Engineer · Mar – Jul 2025
  • Built e-commerce conversational AI agent end-to-end: intent classifiers, query reconstruction, reranker for search relevance, PII guardrails + prompt injection prevention.
  • Architected orchestration pipeline: embedding models for semantic catalog matching, knowledge graph with persistent memory, multi-model ensemble routing via FastAPI microservices.
  • Implemented LLM-as-a-judge evaluation (G-Eval) measuring coherence, relevance, and faithfulness in batch, with end-to-end agent tracing for throughput, latency, and cost.
  • Contributed to open-source Mem0 and EmbedChain: added persistent memory for agentic multi-turn workflows.

Featured Projects

AI Resume Agent: Conversational AI

Production AI agent with 10 backend services, 5 LLM-as-a-Judge evaluators, graceful degradation, SSE streaming, and full observability.

  • Raw HTTP/2 to Gemini β€” no SDK, async httpx connection pooling + thread-safe TTLCache.
  • 5 automated evaluators (Hallucination, Relevance, Conciseness, Helpfulness, Toxicity) via Langfuse at 100% sampling.
FastAPI · Gemini · React · SSE · Mem0 · Langfuse

FunctionGemma: Hybrid AI Router

2nd Place / $5K at Cactus × Google DeepMind Hackathon. 3-tier hybrid inference: 0.99 F1 at 548 ms, 70% on-device ratio.

  • FunctionGemma-270M local + Gemini 2.5 Flash Lite cloud with lexical pre-router.
  • Intelligent routing between local and cloud models for optimal cost/latency.
Python · Gemini · LLM Routing · Hackathon

PathGuard: Spatial Safety Intelligence

Real-time hazard detection pipeline: Grounding DINO + SAM2 + Depth Anything V2 with VLM scene understanding. UMD × Ironsite Hackathon.

  • VLM generates scene-specific prompts dynamically for zero-shot object detection from images.
  • State machine + telemetry pipeline, deployable on RPi 5, iPhone, or Android.
Python · VLM · Computer Vision · Hackathon

English2SQL: NL → SQL Assistant

Production-style NL→SQL over Postgres. Achieved 9% accuracy lift via schema-aware prompting.

  • Evaluated multiple LLMs on accuracy, latency, and cost.
  • Hardened execution with schema-aware prompting + validation.
FastAPI · Postgres · LLM Eval · Docker

AlphaFoundry: FF5 Factor Strategy

Rolling-window factor modeling. Backtested strategy outperformed SPY benchmark (Sharpe 0.95 vs 0.85).

  • FF5 + XGBoost Learning-to-Rank with walk-forward evaluation.
  • Production FastAPI inference + reproducible runs.
Python · XGBoost · FastAPI · Backtesting

Domain-Specific LLMs Research

NeurIPS 2023 Workshop paper. Proposed novel evaluation methodologies for faithful LLMs.

  • Co-authored research on LLM alignment and domain adaptation.
  • Published at NeurIPS Muslims in ML Workshop.
NeurIPS · Research · LLM

More projects

Explore more of my work on GitHub.

Skills

ML / LLM
RAG · Embeddings · Retrieval · Evaluation · Guardrails · LangChain · Mem0 · PyTorch · Transformers
Agent Frameworks
LangGraph · CrewAI · Agentic Workflows · Google ADK · MCP
Backend / Infra
Python · TypeScript · FastAPI · Docker · SQL · GCP · AWS · CI/CD · Gemini
CV & Vision
Grounding DINO · SAM2 · Depth Anything V2 · VLMs · OpenCV

Certifications

DeepLearning.AI

Honors & Awards
  • Cactus × Google DeepMind Hackathon: 2nd Place, $5K
  • UMD × Ironsite Hackathon: Team Lead, PathGuard
  • Pear VC + OpenAI Hackathon (SF): Finalist
  • NeurIPS 2023: MiML Workshop paper

Contact

If you’re hiring for applied ML / retrieval / LLM evaluation roles, email is best.

I typically respond within 24 hours ☕