
Pursuing a Master of Science degree in Business Analytics at Simon Business School, University of Rochester. Interested in data analytics projects.
Pricing-Analytics-Choice-model-and-segmentation-
May 19, 2020 – May 19, 2020
This pricing analytics project uses choice data from the soft drink market and covers several techniques. I started with a choice model without segmentation, predicted demand, and chose the optimal price. I then segmented customers in two ways: 1) k-means clustering on demographic data, estimating the model on each subsample; 2) using gmnl to estimate latent segments from customers' purchase histories. I also used the plotly and rgl packages to visualise the segments.
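The unsegmented step (choice model, demand prediction, price search) can be sketched roughly as follows. This is a minimal Python illustration, not the project's own code, which appears to rely on R packages. It assumes a multinomial logit with an outside option of utility zero; the intercepts, price coefficient, unit cost, and price grid are all hypothetical values chosen for the example.

```python
import math

def choice_probs(prices, intercepts, price_coef):
    """Multinomial logit shares; the outside option has utility 0."""
    utils = [a + price_coef * p for a, p in zip(intercepts, prices)]
    denom = 1.0 + sum(math.exp(u) for u in utils)
    return [math.exp(u) / denom for u in utils]

def best_price(own_index, prices, intercepts, price_coef, unit_cost, grid):
    """Grid search for the price that maximises expected profit per customer."""
    best = None
    for candidate in grid:
        trial = list(prices)
        trial[own_index] = candidate
        share = choice_probs(trial, intercepts, price_coef)[own_index]
        profit = (candidate - unit_cost) * share
        if best is None or profit > best[1]:
            best = (candidate, profit)
    return best

# Hypothetical inputs: three products, product 0 is the one being priced.
prices = [2.0, 2.2, 1.8]
intercepts = [1.0, 0.8, 0.5]
price_coef = -1.2
grid = [1.0 + 0.05 * i for i in range(41)]  # candidate prices 1.00 .. 3.00

best_p, best_profit = best_price(0, prices, intercepts, price_coef,
                                 unit_cost=0.8, grid=grid)
print(f"best price: {best_p:.2f}, expected profit per customer: {best_profit:.3f}")
```

With segmentation, the same search would be run on segment-weighted shares, using one set of coefficients per segment.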
Airlines-Tweets-content-detection-Topic-Modeling-
May 10, 2020 – May 10, 2020
The purpose of this analysis is to train a model that identifies non-complaint tweets among thousands of tweets about different airlines. This is a classification problem, so using the training datasets complaint1700.csv and noncomplaint1700.csv, I tried three classification algorithms (SVM, rpart, and naïve Bayes) and chose SVM as my classifier. To vectorize the tweets, I tried LDA for topic modeling and SVD for two-dimensional coordinates, and finally chose a DocumentTermMatrix with 182 terms. The cross-validation F1 score (2/(1/precision + 1/recall)) of the SVM model is about 0.676. The accuracy on the test dataset from the SQL database was about 0.65.
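The F1 formula quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the confusion counts below are hypothetical and are not the project's results.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2 / (1/precision + 1/recall)."""
    if precision <= 0 or recall <= 0:
        return 0.0
    return 2.0 / (1.0 / precision + 1.0 / recall)

# Hypothetical counts for the non-complaint class.
p, r = precision_recall(tp=60, fp=30, fn=20)
score = f1_score(p, r)
print(f"precision={p:.3f} recall={r:.3f} F1={score:.3f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is often reported alongside accuracy.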
Catholic-Family-Center
May 9, 2020 – May 9, 2020
Catholic-Family-Center — GitHub repository
New-Yogurt-Flavour-Project-Turf-Analysis-and-Marketing-Simulation-
May 6, 2020 – May 6, 2020
This project helps a local grocery store launch a new yogurt flavor, using TURF analysis on customer survey data.
WineRetailer-Project-Experiment-Analysis-and-Causal-Forest-
May 6, 2020 – May 6, 2020
Insight from experimental data
ToyHorse-Project-Cluster-Segmentation-and-Market-simulation-
May 6, 2020 – May 6, 2020
Product line optimisation based on conjoint analysis
Bayesian-Maching-Learning-A-B-Testing
March 18, 2020 – March 18, 2020
Here I share notes and code on what I've learned about Bayesian machine learning. First is a brief review of the t-test and the chi-square test, covering the concepts and formulas behind both. Second, in Bayesian_bandit, I create a class and define functions to simulate a bandit machine and to see how the distribution changes as the number of trials increases. Finally, Thompson_Sampling converge is a simulation of Thompson Sampling in which I calculate the cumulative average click-through rate to see how Thompson Sampling leads to the best outcome.
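A simulation of this kind might look like the sketch below. It is not the repository's code: the class and function names are hypothetical, the true click-through rates are made up, and it assumes Bernoulli rewards with a Beta(1, 1) prior on each arm.

```python
import random

class Bandit:
    """One arm with a hidden click-through rate and a Beta posterior."""

    def __init__(self, true_ctr: float):
        self.true_ctr = true_ctr
        self.alpha = 1  # prior successes + 1
        self.beta = 1   # prior failures + 1

    def pull(self, rng: random.Random) -> int:
        """Simulate one impression: 1 for a click, 0 otherwise."""
        return 1 if rng.random() < self.true_ctr else 0

    def sample(self, rng: random.Random) -> float:
        """Draw a plausible click-through rate from the current posterior."""
        return rng.betavariate(self.alpha, self.beta)

    def update(self, reward: int) -> None:
        self.alpha += reward
        self.beta += 1 - reward

def run_thompson(true_ctrs, n_trials, seed=0):
    """Play n_trials rounds and return the arms and the cumulative average CTR."""
    rng = random.Random(seed)
    bandits = [Bandit(p) for p in true_ctrs]
    total = 0
    running_avg = []
    for t in range(1, n_trials + 1):
        # Thompson Sampling: play the arm with the largest posterior draw.
        chosen = max(bandits, key=lambda b: b.sample(rng))
        reward = chosen.pull(rng)
        chosen.update(reward)
        total += reward
        running_avg.append(total / t)
    return bandits, running_avg

bandits, running_avg = run_thompson([0.2, 0.5, 0.75], n_trials=2000)
for b in bandits:
    print(f"true CTR {b.true_ctr}: pulled {b.alpha + b.beta - 2} times")
print(f"cumulative average CTR after 2000 trials: {running_avg[-1]:.3f}")
```

As trials accumulate, the posterior of the best arm tends to narrow around its true rate, so that arm is usually chosen more often and the cumulative average tends to drift toward its click-through rate.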
Credit-risk-project
March 12, 2020 – May 9, 2020
A predictive model of credit risk. Data Cleaning: 1. Checked for and deleted columns where missing values make up more than 90%. 2. Imputed the median for each numerical column and the most frequent value for each categorical column. 3. Scaled the numerical columns with a standard scaler in the pipeline. Model Building: 1. Separated the data into a training set and a testing set. 2. Created a function that takes different models as input and returns the results. 3. Ran predictions with 13 models (SVC, Logit, Decision Tree, Random Forest, MLP, Decision Tree Classifier, KNN, Ada Boost, Gaussian NB, Quadratic Discriminant Analysis, Gradient Boosting, Bagging, Extra Trees), compared their performance, and chose the 3 best-performing models (Gradient Boosting, SVC, Logit). Interface Building with Streamlit: 1. Created an interface that presents the predicted credit risk for each customer, along with the accuracy and confusion matrix of the different predictive models.
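The three data cleaning steps can be illustrated with a small standard-library sketch. The project itself presumably uses a library pipeline; the function name, column names, and values below are hypothetical and only show the logic of each step.

```python
from collections import Counter
from statistics import mean, median, pstdev

def clean_columns(columns, max_missing=0.9):
    """columns maps a column name to a list of values; None marks a missing value."""
    cleaned = {}
    for name, values in columns.items():
        # Step 1: drop columns where more than 90% of the values are missing.
        missing_share = sum(v is None for v in values) / len(values)
        if missing_share > max_missing:
            continue
        present = [v for v in values if v is not None]
        if all(isinstance(v, (int, float)) for v in present):
            # Step 2a: numerical column, impute the median.
            fill = median(present)
            filled = [fill if v is None else v for v in values]
            # Step 3: standardise to zero mean and unit variance.
            mu, sigma = mean(filled), pstdev(filled)
            cleaned[name] = [(v - mu) / sigma if sigma else 0.0 for v in filled]
        else:
            # Step 2b: categorical column, impute the most frequent value.
            fill = Counter(present).most_common(1)[0][0]
            cleaned[name] = [fill if v is None else v for v in values]
    return cleaned

# Hypothetical 11-row table.
data = {
    "income": [10.0, None, 30.0, 20.0, 40.0, None, 50.0, 25.0, 35.0, 15.0, 45.0],
    "purpose": ["car", None, "car", "home", "car", "home", None, "car",
                "education", "car", "home"],
    "mostly_empty": [None] * 10 + [1.0],
}
cleaned = clean_columns(data)
print(sorted(cleaned))
```

In a real pipeline the median, mode, mean, and standard deviation would be fitted on the training set only and then applied to the testing set, to avoid leaking test information.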
Cultural Fit Analysis
The candidate has a diverse portfolio of personal projects covering various domains like retail, finance, and social media analytics. This breadth suggests adaptability and a strong interest in applying data science to different business problems. The focus on personal projects indicates a self-starter mentality. However, the lack of team-based projects or professional experience makes it difficult to fully assess cultural fit in a collaborative work environment.
Soft Skills & Operational Fit
The candidate's project descriptions indicate an ability to tackle complex problems independently. However, without psychometric test results or interview data, it is difficult to assess soft skills like teamwork, communication, or stress handling. The project descriptions are detailed, suggesting good written communication for technical topics.