Ruixuan (Laura) Zhao

Data Scientist

https://talent.gravityer.com/ruixuanlaurazhao

Pursuing a Master of Science in Business Analytics at Simon Business School, University of Rochester. Interested in data analytics projects.

Member since June 29, 2026

Key Strengths

Demonstrated experience in various data science methodologies including predictive modeling, segmentation, and experimental analysis.
Proficiency in Python and R for data analysis and machine learning tasks.
Practical application of machine learning algorithms (SVM, Logit, Random Forest, Gradient Boosting) in projects like credit risk and text classification.
Experience with data preprocessing techniques such as handling missing values, imputation, and scaling.
Exposure to advanced topics like Bayesian Machine Learning and Causal Forest.

Cultural & Operational Fit

Cultural Fit Analysis

The candidate has a diverse portfolio of personal projects covering various domains like retail, finance, and social media analytics. This breadth suggests adaptability and a strong interest in applying data science to different business problems. The focus on personal projects indicates a self-starter mentality. However, the lack of team-based projects or professional experience makes it difficult to fully assess cultural fit in a collaborative work environment.

Soft Skills & Operational Fit

The candidate's project descriptions indicate an ability to tackle complex problems independently. However, without psychometric test results or interview data, it is difficult to assess soft skills like teamwork, communication, or stress handling. The project descriptions are detailed, suggesting good written communication for technical topics.

Projects

Pricing-Analytics-Choice-model-and-segmentation-

May 19, 2020 – May 19, 2020

This pricing analytics project uses choice data from the soft drink market and covers several techniques. I started with a choice model without segmentation, predicted demand, and chose the optimal price. I then segmented customers in two ways: 1) k-means clustering on demographic data, estimating the model on each subsample; 2) using gmnl to estimate latent segments from customers' purchasing history. I also used the plotly and rgl packages to visualise the segments.
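The first step described above (a choice model without segmentation, demand prediction, then price selection) can be sketched roughly as follows. This is only an illustration in Python, not the project's code, which used R: the binary logit form, the coefficients, the unit cost, and the price grid are all made-up assumptions.

```python
import math

def choice_prob(price, intercept, price_coef):
    """Logit probability of buying the product versus an outside option with utility 0."""
    utility = intercept + price_coef * price
    return 1.0 / (1.0 + math.exp(-utility))

def expected_profit(price, intercept, price_coef, unit_cost):
    """Expected profit per customer: margin times purchase probability."""
    return (price - unit_cost) * choice_prob(price, intercept, price_coef)

def optimal_price(intercept, price_coef, unit_cost, grid):
    """Return the grid price with the highest expected profit per customer."""
    return max(grid, key=lambda p: expected_profit(p, intercept, price_coef, unit_cost))

# Hypothetical parameters, for illustration only
INTERCEPT, PRICE_COEF, UNIT_COST = 2.0, -1.5, 0.5
PRICE_GRID = [round(0.5 + 0.01 * i, 2) for i in range(251)]  # 0.50 to 3.00

best = optimal_price(INTERCEPT, PRICE_COEF, UNIT_COST, PRICE_GRID)
print(best, choice_prob(best, INTERCEPT, PRICE_COEF))
```

With segmentation, the same search would use a demand function that sums the segment-level probabilities weighted by segment size.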

Airlines-Tweets-content-detection-Topic-Modeling-

May 10, 2020 – May 10, 2020

The purpose of this analysis is to train a model that identifies non-complaint tweets among thousands of tweets about different airlines. This is a classification problem, so with the training datasets complaint1700.csv and noncomplaint1700.csv I tried three classification algorithms (SVM, rpart, and naïve Bayes) and finally chose SVM as my classifier. To vectorize the tweets, I tried LDA for topic modeling and SVD for a 2-dimensional representation, and finally chose a DocumentTermMatrix with 182 terms. The cross-validation F1-score (2/(1/precision + 1/recall)) of the SVM model is about 0.676. The accuracy on the test dataset from the SQL database was about 0.65.
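As a rough illustration of two pieces mentioned above, the sketch below builds a tiny document-term matrix and computes the F1-score with the same formula, 2/(1/precision + 1/recall). It is written in Python rather than the R tooling the project appears to use. The example tweets, vocabulary, and labels are invented for the demonstration.

```python
from collections import Counter

def document_term_matrix(docs, vocabulary):
    """Count how often each vocabulary term occurs in each document."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[term] for term in vocabulary])
    return rows

def f1_score(y_true, y_pred, positive=1):
    """F1 as the harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 / (1 / precision + 1 / recall)

# Invented examples, for illustration only
docs = ["Thanks for the great flight", "Flight delayed again and bag lost"]
vocab = ["flight", "delayed", "great"]
dtm = document_term_matrix(docs, vocab)
print(dtm)  # [[1, 0, 1], [1, 1, 0]]

f1 = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
print(round(f1, 3))  # 0.667
```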

Catholic-Family-Center

May 9, 2020 – May 9, 2020

Catholic-Family-Center — GitHub repository

New-Yogurt-Flavour-Project-Turf-Analysis-and-Marketing-Simulation-

May 6, 2020 – May 6, 2020

This project helps a local grocery store launch a new yogurt flavor based on customer survey data, using TURF analysis.

WineRetailer-Project-Experiment-Analysis-and-Causal-Forest-

May 6, 2020 – May 6, 2020

Insight from experimental data

ToyHorse-Project-Cluster-Segmentation-and-Market-simulation-

May 6, 2020 – May 6, 2020

Product line optimisation based on conjoint analysis

Bayesian-Maching-Learning-A-B-Testing

March 18, 2020 – March 18, 2020

Here I share notes and code on what I've learned about Bayesian machine learning. First is a brief review of the t-test and the chi-square test, covering the concepts and formulas used in these two tests. Second, in Bayesian_bandit, I create a class and define functions to simulate a bandit machine and to see how the distribution changes as the number of trials increases. Finally, Thompson_Sampling converge is a simulation of Thompson Sampling in which I calculate the cumulative average click-through rate to see how Thompson Sampling leads to the best outcome.
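A minimal sketch of the kind of simulation described above, assuming Bernoulli click rewards and a Beta(1, 1) prior on each arm. The class and function names, the true click-through rates, and the trial count are illustrative assumptions, not taken from the repository.

```python
import random

class Bandit:
    """One arm with a hidden click-through rate and a Beta(a, b) posterior over it."""

    def __init__(self, true_ctr):
        self.true_ctr = true_ctr
        self.a = 1  # 1 + number of observed clicks
        self.b = 1  # 1 + number of observed non-clicks

    def pull(self, rng):
        """Simulate one impression: 1 for a click, 0 otherwise."""
        return 1 if rng.random() < self.true_ctr else 0

    def sample(self, rng):
        """Draw a plausible click-through rate from the current posterior."""
        return rng.betavariate(self.a, self.b)

    def update(self, reward):
        self.a += reward
        self.b += 1 - reward

def thompson_sampling(true_ctrs, n_trials, seed=0):
    """Run Thompson Sampling and return the arms plus the cumulative average CTR."""
    rng = random.Random(seed)
    bandits = [Bandit(p) for p in true_ctrs]
    total_clicks = 0
    cumulative_avg = []
    for t in range(1, n_trials + 1):
        # Play the arm whose posterior sample is highest
        chosen = max(bandits, key=lambda b: b.sample(rng))
        reward = chosen.pull(rng)
        chosen.update(reward)
        total_clicks += reward
        cumulative_avg.append(total_clicks / t)
    return bandits, cumulative_avg

# Hypothetical click-through rates, for illustration only
bandits, avg = thompson_sampling([0.2, 0.5, 0.75], n_trials=2000)
pulls = [b.a + b.b - 2 for b in bandits]
print(pulls, avg[-1])
```

As trials accumulate, the posterior of the best arm narrows and that arm is chosen more often, so the cumulative average click-through rate tends toward the best arm's rate.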

Credit-risk-project

March 12, 2020 – May 9, 2020

A predictive model of credit risk.

Data Cleaning:
1. Checked for and deleted the columns where missing values make up more than 90%.
2. Imputed the median for each numerical column and the most frequent value for each categorical column.
3. Scaled the features with a standard scaler in the pipeline.

Model Building:
1. Separated the data into a training set and a testing set.
2. Created a function that takes different models as input and returns their results.
3. Ran predictions with 13 models (SVC, Logit, Decision Tree, Random Forest, MLP, Decision Tree Classifier, KNN, Ada Boost, Gaussian NB, Quadratic Discriminant Analysis, Gradient Boosting, Bagging, Extra Trees), compared their performance, and chose the 3 best-performing models (Gradient Boosting, SVC, Logit).

Interface Building with Streamlit:
1. Created an interface to present the credit risk prediction for each customer, along with the accuracy and confusion matrix of the different predictive models.
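The data cleaning steps above can be sketched as follows. This is a standard-library illustration of the logic only. The project's actual pipeline is not shown in this profile, and the column names and values here are invented.

```python
from collections import Counter
from statistics import mean, median, pstdev

def drop_sparse_columns(table, threshold=0.9):
    """Drop columns whose share of missing (None) values exceeds the threshold."""
    return {
        name: col
        for name, col in table.items()
        if sum(v is None for v in col) / len(col) <= threshold
    }

def is_numeric(col):
    """True if every non-missing value in the column is a number."""
    return all(isinstance(v, (int, float)) for v in col if v is not None)

def impute(col):
    """Fill missing values: median for numeric columns, most frequent value otherwise."""
    present = [v for v in col if v is not None]
    fill = median(present) if is_numeric(col) else Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in col]

def standardize(col):
    """Scale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(col), pstdev(col)
    return [(v - mu) / sigma if sigma else 0.0 for v in col]

def clean(table):
    """Apply the three cleaning steps in order: drop, impute, scale."""
    cleaned = {}
    for name, col in drop_sparse_columns(table).items():
        numeric = is_numeric(col)
        filled = impute(col)
        cleaned[name] = standardize(filled) if numeric else filled
    return cleaned

# Invented data, for illustration only
table = {
    "income": [40.0, None, 60.0, 80.0],
    "housing": ["rent", "own", None, "rent"],
    "notes": [None, None, None, None],  # entirely missing, so it is dropped
}
cleaned = clean(table)
print(cleaned)
```

In practice the imputation and scaling statistics would be fitted on the training set only and then applied to the test set, which is what placing these steps inside a pipeline helps ensure.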

LauraZhao

February 23, 2020 – February 26, 2020

LauraZhao — GitHub repository

