DevOps Engineer with 3+ years in CI/CD, Kubernetes & Cloud
Results-driven DevOps Engineer with 3+ years of experience in automating CI/CD pipelines, container orchestration (Kubernetes, Docker), GPU infrastructure management, and infrastructure monitoring (Prometheus, Grafana). Proven track record of improving system uptime, performance, and deployment efficiency in production environments across government, research, and enterprise clients. Skilled in Jenkins, Ansible, NVIDIA GPU stack, and Linux systems. Seeking a challenging DevOps role in a high-growth tech company to drive automation, stability, and innovation.
Amity University, Noida
Bachelor of Computer Applications · Cloud and Security
N/A – June 30, 2022
CCS, Delhi
Kubernetes Engineer
August 1, 2025 – Present
Delhi, Delhi, India
Keen & Able Computers Pvt. Ltd
DevOps Engineer
March 1, 2023 – July 1, 2025
Noida, Uttar Pradesh, India
NIC Shastri Park - AI Workload Deployment on Kubernetes
June 10, 2026 – Present
Designed, configured, and managed production-grade Kubernetes clusters for AI-based applications, ensuring high availability and scalability. Performed end-to-end Kubernetes cluster setup, including node configuration, networking, and workload orchestration. Deployed and managed AI/ML workloads on Kubernetes using scalable deployment strategies (Deployments, Services, Autoscaling).
Installed and configured NVIDIA drivers and CUDA toolkit to enable GPU acceleration for AI workloads. Set up and managed the NVIDIA BCM cluster on DGX servers for high-performance AI application deployment. Integrated GPU-enabled infrastructure with Kubernetes for efficient resource utilization.
Implemented container image management using Harbor registry with secure access control and image versioning.
Configured Kubernetes monitoring using Metrics Server for resource utilization tracking (CPU/Memory). Deployed and configured Prometheus, Node Exporter, and Grafana for infrastructure and cluster monitoring. Built custom Grafana dashboards using PromQL queries to visualize system metrics and application performance. Configured Alertmanager with custom alerting rules to proactively monitor system health and trigger notifications. Ensured end-to-end observability and reliability of AI applications and infrastructure.
IIT Jodhpur - Kubernetes & GPU Infrastructure Setup
June 10, 2026 – Present
Set up an end-to-end Kubernetes cluster at the IIT Jodhpur campus for AI workload deployment, including node configuration, networking, and orchestration. Installed and configured NVIDIA drivers and CUDA toolkit on GPU servers to enable GPU-accelerated AI computing. Deployed and configured Prometheus and Grafana for comprehensive monitoring of the Kubernetes cluster and GPU servers. Set up monitoring for the Slurm HPC cluster, enabling visibility into job scheduling, node health, and resource utilization. Configured Alertmanager with custom alerting rules for proactive issue detection and notifications. Deployed AI workloads on Kubernetes and provided ongoing client support for production issues and infrastructure queries.
NIC (National Informatics Centre) - UPSC Portal
June 10, 2026 – Present
Designed and implemented end-to-end CI/CD pipelines using Jenkins, automating build, test, and deployment processes on Kubernetes clusters and reducing manual effort by 70%. Integrated Gitea for source control, enabling build-and-deploy workflows from Jenkins pipelines directly to Kubernetes environments. Deployed Grafana and Prometheus across Production, Staging, and Dev environments for real-time system and Redis monitoring, and built custom dashboards to track app-level metrics. Created a custom Prometheus exporter to monitor a MongoDB Sharded Cluster, improving observability and reducing troubleshooting time by 40%. Configured centralized logging with rsyslog across multiple environments for system-wide log management. Set up Logstash pipelines to parse and forward logs to OpenSearch, and built searchable dashboards to monitor application behavior and system errors.
NIC - Shiksha Portal
June 10, 2026 – Present
Deployed and managed Java Struts applications on Apache Tomcat containers. Implemented real-time application monitoring with Prometheus, JMX Exporter, and Grafana, ensuring high system availability.
Thomas Cook - Database Support
June 10, 2026 – Present
Provided MySQL database support, including standard, cascading, and GTID-based replication. Wrote MySQL queries and automated MySQL backups using Ansible with email alert notifications.
E-Prison - Database Migration
June 10, 2026 – Present
Migrated enterprise-level data from Microsoft SQL Server to PostgreSQL using pgloader with thorough data validation. Automated PostgreSQL backups with monitoring checks and email alert notifications for backup failures.
Identity Management System - SCT Project
June 10, 2026 – Present
Deployed Keycloak in containerized environments and integrated LDAP for secure user authentication. Built a custom Prometheus exporter to monitor Keycloak metrics, enhancing observability with Grafana dashboards.
Linux System Administration - Sheela Foam
June 10, 2026 – Present
Managed Linux-based infrastructure including mail server setup (Postfix, Dovecot), system optimization, and troubleshooting.
Cultural Fit Analysis
The candidate's project experience spans government, research, and enterprise clients, indicating adaptability to diverse organizational cultures. The focus on automation, efficiency, and observability aligns well with modern DevOps practices. The breadth of technologies and problem domains (AI/ML, databases, CI/CD, monitoring) suggests a proactive and continuous learning mindset, which is a strong cultural fit for dynamic tech environments.
Soft Skills & Operational Fit
The candidate's resume lists 'Communication', 'Time Management', 'Problem-Solving', and 'Leadership' as soft skills. The project descriptions show problem-solving applied in practice, both in setting up complex systems and in troubleshooting them. The experience with client support and managing production environments suggests good operational fit and reliability.