Data Engineer with less than a year in real-time ETL pipelines, data warehousing, and predictive mod
Highly motivated and results-oriented student pursuing a Bachelor of Engineering in Artificial Intelligence & Machine Learning, graduating in June 2026. Proficient in data analytics and engineering tools, including SQL, Kafka, Spark, and Flink, and in cloud platforms such as AWS Redshift and Azure Synapse. Demonstrated ability to design and implement robust data pipelines, optimize queries, and build predictive models for analytical projects. Eager to apply strong technical and problem-solving skills to impactful data-driven challenges.
BMS College Of Engineering
Bachelor of Engineering · Artificial Intelligence & Machine Learning
January 1, 2022 – June 1, 2026
Data Processing Pipeline for Goodreads Analytics
January 1, 2025 – June 1, 2026
Built a real-time ETL pipeline for distributed data processing of 1.6 TB/day with Apache Kafka and Apache Spark. Optimized querying in Redshift and implemented UPSERT operations, which reduced data redundancy and ensured data integrity. Built a CI/CD pipeline with Docker and Airflow to automate testing, deployment, and versioning of data workflows.
Spotify User Trend Analysis
December 1, 2024 – June 1, 2026
Designed ETL pipelines to build a data warehousing solution, with Power BI reports for analyzing user data trends over time. Automated data quality checks (schema validation, integrity, and null-value checks) for 1 TB+ of user data, resulting in 95% accuracy. Optimized SQL transformations by refactoring queries and adding indexes, reducing query execution time by 80%.
Customer Purchase Prediction Analysis
November 1, 2024 – June 1, 2026
Designed a multi-layer medallion data architecture in PostgreSQL and Redis to optimize a customer purchase prediction model. Implemented schema validation jobs using Flink, ensuring data quality and routing invalid events to Elasticsearch for alerting. Developed a Kafka-Spark Streaming ETL pipeline to process 10,000+ events/second in real time for data analysis and ML workflows.
Customer Behavior Analytics Project
October 1, 2024 – June 1, 2026
Built predictive models using regression and statistical modeling, improving forecasting accuracy of customer purchases by 18% compared to baseline. Performed hypothesis testing and A/B testing, identifying marketing strategies that drove a 12% increase in engagement. Conducted exploratory data analysis (EDA) with correlation analysis, PCA, and outlier detection, reducing noise and improving data quality by 15%.
Credit Risk Analysis & Loan Default Prediction
June 1, 2024 – June 1, 2026
Conducted data wrangling and exploratory data analysis (EDA) using Python and SQL, identifying trends in default behavior across income groups, loan amounts, and credit history. Delivered actionable insights that reduced the default rate by 10% through optimized risk scoring and revised loan approval criteria.
Lead - Cancer Awareness Drive
December 1, 2023 – December 1, 2023
Organized and led a community health initiative to raise awareness about cancer prevention and early detection. Coordinated volunteers and partnered with local organizations to host interactive sessions and distribute educational materials. Reached 200+ participants, driving stronger engagement in preventive healthcare practices.
Cultural Fit Analysis
The candidate's academic projects demonstrate a strong interest in data engineering and analytics and practical experience applying them, aligning well with a Data Engineer role. The diversity of projects, from real-time ETL to predictive modeling and data warehousing, shows broad technical curiosity. The 'Lead - Cancer Awareness Drive' entry indicates a proactive and socially conscious individual, which can be a positive cultural asset. However, the candidate is still pursuing a Bachelor's degree and has no professional experience, which might affect immediate cultural integration into a senior role.
Soft Skills & Operational Fit
The candidate's project descriptions indicate an ability to work on complex problems and deliver measurable results. The 'Lead - Cancer Awareness Drive' project suggests leadership potential and community engagement, which could translate to good teamwork and initiative. However, without direct work experience, it's difficult to fully assess operational fit and professional communication in a corporate setting.