Data Science with less than a year of experience in Machine Learning & NLP
Data Science enthusiast skilled in Python, SQL, Machine Learning, NLP, and LLMs, with expertise in tools like Excel, Power BI, Tableau, SPSS, QuickBooks, and Stata for data analysis and visualization. Passionate about leveraging data-driven insights and AI solutions to optimize operations, support decision-making, and drive measurable business impact.
National University of Computer and Emerging Sciences
Bachelor of Science · Business Analytics
August 1, 2022 – June 30, 2026
Punjab Information Technology Board (PITB)
Data Science Intern
June 1, 2025 – August 31, 2025
Lahore, Punjab, Pakistan
Quora Duplicate Question Detection using NLP
July 1, 2025 – July 31, 2025
Built an end-to-end NLP pipeline to identify duplicate questions using the Quora dataset. Implemented preprocessing, feature engineering, and multiple feature extraction methods including n-grams, Bag-of-Words (BoW), TF-IDF, and Word2Vec. Applied stratified k-fold cross-validation and optimized hyperparameters to achieve 79.34% accuracy. Evaluated model performance using a confusion matrix for detailed error analysis.
AI Chatbot Assistant (Text-to-SQL using LLMs & Streamlit)
July 1, 2025 – August 31, 2025
Developed an interactive chatbot in Streamlit using two Large Language Models. Used Qwen to convert natural language queries into SQL for seamless database exploration. Integrated contextual conversation memory to support follow-up questions and maintain coherent multi-turn interactions. Automated SQL execution and used Llama to summarize query results, providing clear and user-friendly insights. Enhanced usability with a responsive interface, enabling both technical and non-technical users to query databases without writing SQL, potentially reducing query time.
Customer Churn Prediction using Machine Learning
June 1, 2025 – June 30, 2025
Developed an end-to-end ML pipeline incorporating 10 feature selection methods, 10 oversampling techniques, and 10 classifiers to handle class imbalance and enhance model accuracy. Applied cross-validation and hyperparameter tuning to optimize model performance, achieving 80.91% accuracy with SelectFpr + MDO SMOTE and CatBoost Classifier. Conducted thorough evaluation and validation to ensure model robustness and generalizability.
Food Sector Risk Review
April 1, 2024 – May 31, 2025
Conducted an in-depth analysis of the food sector, identifying and quantifying over 10 key risks affecting business operations and profitability. Utilized Excel for data collection, ratio analysis, and CAPM regression modeling to quantify risk factors. Created interactive dashboards in Power BI, improving data accessibility and decision-making speed by approximately 35%. Prepared detailed reports using Microsoft Word, presenting findings and recommending effective risk mitigation strategies to stakeholders.
Harnessing the Power of Data with Power BI
Coursera
January 1, 2025 – Present
Preparing Data for Analysis with Microsoft Excel
Coursera
January 1, 2025 – Present
AI for Everyone
Coursera
January 1, 2025 – Present
Cultural Fit Analysis
The candidate's project diversity, ranging from AI chatbots to churn prediction and risk analysis, indicates a broad interest and adaptability. Their academic background in Business Analytics combined with practical data science projects suggests a good fit for roles that require both technical depth and business understanding. The internship experience at a government board (PITB) also shows exposure to structured work environments. However, the limited professional experience (internship only) means cultural fit beyond technical alignment is less established.
Soft Skills & Operational Fit
The candidate demonstrates strong analytical thinking and problem-solving skills through their project descriptions, particularly in optimizing ML pipelines and identifying risks. Their experience in creating interactive dashboards and reports suggests good communication of technical insights to non-technical stakeholders. The focus on data-driven decision-making aligns well with operational needs.