Data Science with less than a year in Data Validation & AI/ML
Aspiring Data Operations Analyst and B.Tech Computer Science graduate (2026) with a strong foundation in data validation, SQL-based data analysis, ETL pipeline development, and process quality assurance. Experienced in identifying and resolving data and process-related discrepancies, managing structured workloads to meet tight delivery schedules, and applying systematic analytical thinking to improve data quality and coding accuracy. Proficient in Python and SQL with hands-on experience processing large-scale datasets (100K+ records) in a professional environment. Adept at cross-functional collaboration, SOP adherence, and translating complex data into actionable, client-ready outputs.
ITM Vocational University, Vadodara
B.Tech · Computer Science & Engineering (Data Science Specialisation)
August 1, 2022 – June 30, 2026
Apollo Tyres Ltd.
Data Science Intern
July 1, 2025 – September 1, 2025
India
AI-Powered Adaptive Testing System
June 1, 2026 – Present
Built a real-time adaptive assessment engine that processed 10,000+ student responses across 500+ question items, using Item Response Theory (IRT) to calibrate difficulty dynamically; all response data was logged to MongoDB under a structured schema. Ensured the accuracy and consistency of test session records through validation at the API layer (FastAPI), enforcing data type checks, range constraints, and mandatory field validation on every incoming request. Handled structured datasets of user performance metrics, question metadata, and session logs across 3 normalised MongoDB collections, with validation checks that prevented corrupt or incomplete records from affecting the adaptive scoring logic.
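A minimal sketch of the two ideas named above, request validation and IRT scoring, using only the Python standard library. The field names, the range limits, and the choice of the two-parameter logistic (2PL) model are assumptions made for illustration; the project description does not state which IRT model or schema was used, and the actual system validated requests in FastAPI.

```python
import math

# Hypothetical field spec: name -> (expected type, (min, max) or None).
RESPONSE_FIELDS = {
    "student_id": (str, None),
    "item_id": (str, None),
    "correct": (bool, None),
    "response_time_s": (float, (0.0, 3600.0)),
}


def validate_response(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the record is accepted."""
    errors = []
    for name, (expected_type, bounds) in RESPONSE_FIELDS.items():
        if name not in payload:
            errors.append(f"{name}: missing mandatory field")
            continue
        value = payload[name]
        if not isinstance(value, expected_type):
            errors.append(f"{name}: expected {expected_type.__name__}")
            continue
        if bounds is not None and not (bounds[0] <= value <= bounds[1]):
            errors.append(f"{name}: outside range {bounds}")
    return errors


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL IRT: probability that a student with ability theta answers an item
    with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

Under this model, an adaptive engine would typically pick the next item whose difficulty `b` is close to the current ability estimate `theta`, where `p_correct` is near 0.5.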
Multi-Source ETL Data Pipeline & Validation Framework
June 1, 2026 – Present
Designed and implemented a production-grade, multi-source ETL pipeline that processes data from 4 source types (CSV, JSON, APIs, relational databases), automating ingestion, transformation, and Star Schema warehouse loading with Apache Airflow DAGs. Developed a data processing system for automated data profiling, validation, anomaly detection, and structured reporting, executing 6 sequential quality checks per data batch, including null detection, type validation, range checks, and referential integrity. Ensured data accuracy and consistency across all pipeline stages by enforcing schema compliance rules and automated QA gates, reducing manual data correction effort by over 40% compared to baseline processing. Handled structured datasets of 50,000-200,000 records per pipeline run, performed validation checks at each stage, and generated audit-ready summary reports by category, replicating client deliverable workflows used in FMCG/CPG data operations.
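The four checks named above (null detection, type validation, range checks, referential integrity) could be sketched as follows. The column names, the quantity limits, and the `valid_product_ids` lookup are hypothetical; the description does not list the remaining two checks or the real schema, so they are not shown.

```python
def run_quality_checks(rows, valid_product_ids):
    """Return {check_name: [indices of rows that failed]} for one batch."""
    failures = {"null": [], "type": [], "range": [], "referential": []}
    required = ("order_id", "product_id", "quantity")  # assumed columns
    for i, row in enumerate(rows):
        # Null detection: skip further checks if a required value is absent.
        if any(row.get(col) is None for col in required):
            failures["null"].append(i)
            continue
        # Type validation, then range check on the same column.
        if not isinstance(row["quantity"], int):
            failures["type"].append(i)
        elif not (1 <= row["quantity"] <= 10_000):
            failures["range"].append(i)
        # Referential integrity against a dimension-table key set.
        if row["product_id"] not in valid_product_ids:
            failures["referential"].append(i)
    return failures
```

A QA gate in the sense described above would then fail the batch, or route the flagged rows to a quarantine table, whenever any list in the result is non-empty.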
AI Data Intelligence Platform
June 1, 2026 – Present
Architected an AI-powered data platform that processes structured business datasets from 3 source types (CSV, SQL databases, REST APIs) and automatically generates validated, category-level analytical insights, mirroring client deliverable workflows in data operations environments. Developed a data processing system for automated data profiling, validation, anomaly detection, and structured reporting, replacing manual analysis steps with a repeatable, SOP-aligned pipeline that reduced insight generation time by ~70%. Ensured data accuracy and consistency through multi-layer validation checks (completeness, type integrity, outlier detection) on datasets of up to 500,000 rows before any analytical output was produced. Handled structured datasets across 10+ business categories, performing validation checks and generating category-level summary reports, directly replicating the data coding and output delivery workflow central to data operations roles.
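As an illustration of the outlier detection layer, the sketch below flags values outside the conventional 1.5 x IQR fences. The IQR rule is an assumption; the description does not say which outlier method the platform used.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]
```

Calling `iqr_outliers` per numeric column, after the completeness and type checks have passed, keeps extreme values from skewing category-level summaries.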
Dynamic RAG Chatbot with Automatic Knowledge Base Updates
June 1, 2026 – Present
Engineered a Retrieval-Augmented Generation (RAG) system handling 1,000+ document chunks across a dynamic knowledge base, with automated ingestion, embedding, and vector-store upsert to keep retrieved context accurate and consistent. Validated retrieved outputs with a quality benchmarking harness across 200+ test queries, requiring response accuracy to exceed a defined 85% threshold before client-facing deployment. Handled structured datasets of document metadata, embeddings, and query logs stored in a vector database, and maintained data integrity through automated consistency checks on each knowledge base update cycle.
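The deployment gate described above can be sketched as a simple accuracy check. How each response was judged correct is not stated in the description, so the sketch assumes one boolean verdict per test query; the function name is hypothetical.

```python
def passes_accuracy_gate(verdicts, threshold=0.85):
    """Return (accuracy, passed) for a list of per-query boolean verdicts.

    The gate passes only when accuracy strictly exceeds the threshold.
    """
    if not verdicts:
        raise ValueError("benchmark produced no verdicts")
    accuracy = sum(verdicts) / len(verdicts)
    return accuracy, accuracy > threshold
```

Running this after each knowledge base update cycle would block a release whenever the refreshed index degrades answer quality on the benchmark set.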
Applied Artificial Intelligence: Practical Implementations
TechSaksham (Microsoft & Edunet)
June 1, 2026 – Present
Generative AI Professional
Oracle
June 1, 2026 – Present
Full Stack Developer Bootcamp
GeeksforGeeks
June 1, 2026 – Present
Generative AI
LinkedIn Learning
June 1, 2026 – Present
Cultural Fit Analysis
The candidate's academic projects showcase a diverse range of applications for data science, from ETL pipelines and AI data platforms to RAG chatbots and adaptive testing systems. This breadth of interest and application, coupled with participation in hackathons and ideathons, suggests a proactive, problem-solving mindset. The internship experience at Apollo Tyres Ltd. demonstrates an ability to work in a structured corporate environment and contribute to real-world data quality initiatives. The target role of Data Science aligns well with their demonstrated skills and project focus on data processing, validation, and analytical insights.
Soft Skills & Operational Fit
The candidate demonstrates strong operational fit through their experience in developing and maintaining data validation SOPs, managing high-volume workloads, and consistently meeting deadlines. Their project descriptions highlight an ability to translate complex data into actionable insights and collaborate cross-functionally. The emphasis on structured reporting, quality checks, and audit-ready summaries indicates a methodical and detail-oriented approach to data operations.