MANISH YADAV

AI Engineer

https://talent.gravityer.com/manish-yadav-680195

AI Engineer with less than a year of experience in Agentic AI systems, LLM fine-tuning, RAG pipelines, and end-to-end MLOps.

Key Strengths

Deep expertise in Agentic AI systems, LLM fine-tuning, and RAG pipelines, directly aligning with the AI Engineer role.
Strong practical experience in MLOps, including CI/CD, model lifecycle management (MLflow, DVC), and real-time monitoring (Grafana).
Proficiency in building low-latency, high-throughput inference services using FastAPI, WebSockets, and Kubernetes.
Demonstrated ability to optimize deep learning models at a kernel level (Flash Attention, CUDA kernels, BF16 AMP) for significant performance gains.
Experience with multimodal retrieval systems and advanced NLP techniques.
Solid foundation in core programming (Python, Java, SQL) and data structures/algorithms (LeetCode, GATE DA qualification).

Cultural & Operational Fit

Cultural Fit Analysis

The candidate's project portfolio demonstrates a strong alignment with cutting-edge AI/ML development, particularly in agentic systems and MLOps, which are highly relevant to an AI Engineer role. The diversity of projects, from distributed orchestrators to kernel optimization and MLOps pipelines, indicates a broad technical curiosity and adaptability. Their academic background with an AI & ML specialization and national-level certification (GATE DA) further reinforces a dedicated interest in the field. The emphasis on production-grade systems and performance optimization suggests a pragmatic and impact-driven approach.

Soft Skills & Operational Fit

The candidate's project descriptions indicate a strong problem-solving aptitude, evidenced by optimizing LLM latency, eliminating race conditions, and resolving memory/compute bottlenecks. Their role as a 'Technical Peer Mentor' suggests leadership potential and a collaborative mindset. The focus on production-grade systems and real-time performance implies a results-oriented and detail-conscious approach.

About

Production-focused AI/ML Engineer specialising in Agentic AI systems, LLM fine-tuning, RAG pipelines, and end-to-end MLOps. Built multi-agent orchestration frameworks (LangGraph), multimodal retrieval systems, and low-latency FastAPI inference services on GCP/AWS. Reproduced GPT-2 (124M) from scratch, achieving an 11x training-throughput gain via Flash Attention and CUDA kernel optimisation. Strong in PyTorch, Transformers, NLP, and production deep learning systems. GATE DA 2026 qualified; 300+ LeetCode DSA problems.

Top Skills

Python, TensorFlow, REST APIs, Streamlit

Projects

NexusStream: Distributed Cross-Device Multi-Agent AI Orchestrator

June 1, 2024 – June 1, 2026

Architected a single-pass LangGraph agentic core consolidating NLP classification, expert routing, and self-criticism into one execution cycle - reducing local 7B LLM latency by 66% through pipeline-level optimisation. Built a full-duplex FastAPI + WebSocket streaming backend broadcasting real-time AI outputs to cross-device clients; implemented async concurrency controls eliminating race conditions under high-frequency event streams.

JARVIS: Multi-Tenant Distributed Autonomous Agent Platform

June 1, 2024 – June 1, 2026

Engineered a multi-tenant agentic orchestration layer using LangGraph and LiteLLM with concurrent browser automation via CDP; built a Semantic DOM Parser coupling LLM reasoners with VLMs to eliminate hardcoded selector fragility. Implemented an on-the-fly contextual RAG memory engine using PyMuPDF to ingest unstructured PDFs and compile user context into prompt windows for real-time asset generation with zero-trust data privacy.

Automated Sales Forecasting - End-to-End MLOps Pipeline

June 1, 2024 – June 1, 2026

Orchestrated a full MLOps pipeline with Airflow DAGs and batch inference, achieving 95% accuracy (MAPE-based) with zero manual intervention; managed the model lifecycle via MLflow + DVC, with real-time Grafana observability and a regression test suite.

GPT-2 (124M) Production Training & Kernel Optimisation

June 1, 2024 – June 1, 2026

Reproduced GPT-2 (124M) Transformer from scratch; resolved memory/compute bottlenecks via Flash Attention, fused CUDA kernels, and BF16 AMP - boosting training throughput 11x to 2,743 tok/sec on NVIDIA T4 with loss 10.94 → 0.75. Outperformed original GPT-2 checkpoint on HellaSwag benchmark through systematic kernel-level optimisation; built production-grade mixed-precision training pipeline with scalable configuration.

Automated US Visa Approval - Real-Time ML Inference Microservice

June 1, 2024 – June 1, 2026

Engineered REST API inference microservice at <1 ms latency, 96.8% accuracy, 7x faster than baseline; deployed via Kubernetes HPA + CI/CD; integrated Evidently AI for automated model evaluation and drift detection in production.

Multi-Agent Reasoning System (LangGraph + RAG + GCP)

June 1, 2024 – June 1, 2026

Designed cyclic multi-agent workflow with autonomous tool-use; deployed FastAPI backend on GCP with WebSockets; implemented Multimodal RAG pipeline on MongoDB Atlas achieving 0.9 Answer Relevance / 0.65 Context Precision via LLM-as-judge eval.

Certifications

GATE DA 2026 Qualified - Data Science & AI (National-level; validates ML, stats, analytical reasoning)

June 1, 2026 – Present

300+ LeetCode - DSA: Arrays, Trees, Graphs, DP, Hashing (Python & Java)

LeetCode

June 1, 2026 – Present
