Data Analyst with less than a year in Predictive Modeling & Python
Statistics graduate, Python Developer, and Data Analyst specializing in predictive modeling, time-series analysis, and automated pipeline engineering. Proven experience managing large datasets, optimizing relational databases, and designing multi-tiered tracking frameworks to drive evidence-based organizational decisions.
Jomo Kenyatta University of Agriculture and Technology
Bachelor of Science · Statistics
August 1, 2020 – June 30, 2024
Philanthropy University
Professional Certificate · Planning for Monitoring and Evaluation
N/A – June 30, 2026
DecodeLabs Estonia
Data Science Intern
March 1, 2026 – June 1, 2026
Estonia
Querying with SQL & Python Engine
June 18, 2026 – Present
Built an analytical pipeline using SQLite and Python to audit, clean, and map thousands of transactional retail records. Wrote optimized SQL queries using JOIN and GROUP BY clauses to isolate sales patterns, aggregate business indicators, and identify revenue leakage. Converted zero-based structural layouts into clean, customized Pandas DataFrames to deliver automated performance reports for stakeholders.
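A minimal sketch of the kind of JOIN plus GROUP BY aggregation described above, using only Python's built-in sqlite3 module. The table names, column names, and sample rows (products, sales, category, unit_price) are hypothetical and chosen for illustration; they are not taken from the actual project.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, made-up retail tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE sales (
            sale_id INTEGER PRIMARY KEY,
            product_id INTEGER,
            quantity INTEGER,
            unit_price REAL
        );
        INSERT INTO products VALUES (1, 'grocery'), (2, 'electronics');
        INSERT INTO sales VALUES
            (1, 1, 2, 5.0),
            (2, 1, 1, 5.0),
            (3, 2, 1, 100.0);
        """
    )
    return conn


def revenue_by_category(conn):
    """Join each sale to its product, then total revenue per category."""
    query = """
        SELECT p.category,
               SUM(s.quantity * s.unit_price) AS revenue
        FROM sales AS s
        JOIN products AS p ON p.product_id = s.product_id
        GROUP BY p.category
        ORDER BY revenue DESC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for category, revenue in revenue_by_category(build_demo_db()):
        print(category, revenue)
```

The resulting rows could then be loaded into a Pandas DataFrame for reporting; that step is omitted here to keep the sketch dependency-free.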
Undergraduate Research Thesis: Survival Analysis for Heart Failure Patients
June 18, 2026 – Present
Conducted a biomedical analytics study to model time-to-event survival probabilities for patients diagnosed with chronic heart failure using hospital records. Applied survival analysis methods including Kaplan-Meier survival curves, log-rank tests, and Cox proportional hazards models to estimate hazard ratios and isolate key clinical predictors. Produced hazard tables and survival curve graphics to communicate complex clinical indicators to academic and review teams.
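For readers unfamiliar with the Kaplan-Meier estimator named above, here is a small pure-Python sketch of the product-limit calculation. It is an illustration under simple assumptions (one duration and one event flag per patient), not the thesis code, and the function name kaplan_meier is hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events:    1 if the event was observed, 0 if the patient was censored
    """
    pairs = sorted(zip(durations, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    for time, prob in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
        print(time, prob)
```

Censored patients never lower the curve directly; they only shrink the risk set, which is why the second drop in the demo data is larger than the first.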
Netflix Dataset Cleaning Pipeline
June 18, 2026 – Present
Engineered an automated preprocessing script to filter and clean a noisy dataset of over 8,000 unique records. Extracted hidden content metrics using regular expressions and applied imputation to missing categorical metadata. Standardized inconsistent date strings into unified datetime fields to transform corrupted inputs into audit-ready datasets.
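The three cleaning steps mentioned above (regex extraction, categorical imputation, date standardization) might look roughly like the sketch below. The field formats ("90 min", "2 Seasons", "September 25, 2021") and all function names are assumptions for illustration, and most-frequent-value imputation stands in for whatever imputation approach the project actually used.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed duration formats: "90 min" for films, "2 Seasons" for series.
DURATION_RE = re.compile(r"(\d+)\s*(min|Seasons?)")
# Assumed date layouts; extend this tuple as new variants turn up.
DATE_FORMATS = ("%B %d, %Y", "%d-%b-%y", "%Y-%m-%d")


def parse_duration(text):
    """Extract (amount, unit) from a duration string, or None if absent."""
    match = DURATION_RE.search(text or "")
    if not match:
        return None
    unit = "min" if match.group(2) == "min" else "season"
    return int(match.group(1)), unit


def parse_date(text):
    """Try each known layout in turn; return a date or None."""
    cleaned = (text or "").strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None


def impute_mode(values):
    """Fill missing or empty categorical values with the most common one."""
    present = [v for v in values if v]
    if not present:
        return list(values)
    mode = Counter(present).most_common(1)[0][0]
    return [v if v else mode for v in values]


if __name__ == "__main__":
    print(parse_duration("2 Seasons"))
    print(parse_date(" September 25, 2021"))
    print(impute_mode(["TV-MA", None, "TV-MA", "PG", ""]))
```

Returning None for unparseable values, rather than raising, keeps bad rows visible for a later audit pass instead of halting the whole script.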
Cultural Fit Analysis
The candidate's academic background in Statistics and diverse projects (biomedical analytics, retail data, Netflix dataset) indicate a broad interest in data applications. The internship experience with an international team suggests adaptability and a willingness to work in diverse environments. The focus on data cleaning and reporting aligns well with operational needs in many organizations.
Soft Skills & Operational Fit
The candidate demonstrates an ability to collaborate in a team setting (international development team, cross-time zone collaboration) and has experience in automating tasks to improve efficiency, indicating a proactive and problem-solving mindset. The project descriptions suggest an organized approach to data challenges.