As a data analyst, I have managed a diverse range of responsibilities, including automating data processing workflows, building a centralized dataset from multiple sources, enhancing the customer lifetime value (LTV) model, performing retention analysis, and creating dashboards for cross-functional teams. My work generates analytical insights that drive improvements in business performance. I have also developed a thorough understanding of the subscription-based customer lifecycle, from enrollment to renewal, and I am proficient in analyzing and interpreting growth metrics such as activation, churn, and retention rates, along with other key performance indicators.
In my free time, I love playing sports, trying out new recipes, and getting lost in a good book. Weekends are all about hanging out with family and friends. Lately, I've been really into ping-pong and board games. Basketball is a big passion of mine, and I'm always up for a game.
Fraudulent transactions in financial payments represent a significant challenge for the global financial industry. In this project, I performed exploratory analysis to understand the fraudulent transactions in a simulated bank dataset, and I applied machine learning methods to detect fraudulent activity.
Electric vehicles have become very popular in the automotive market, and the global electric vehicle market has been expanding quickly in recent years. To understand how this market has developed, this project studies data on electric vehicles registered in WA from 1997 to 2023.
Customer reviews are valuable assets for most companies. In this project, I applied natural language processing techniques to cluster customer reviews from an e-commerce site and identify the latent topics in the review texts.
In this project, I developed supervised learning models for customer churn prediction. The labelled data in this dataset is imbalanced, so I applied SMOTE for oversampling. I also applied encoding and standardization techniques to transform the features. Logistic Regression, KNN, and Random Forest algorithms are used for modeling. Model evaluation uses metrics such as F1-score, ROC curves, and AUC.
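To illustrate the idea behind SMOTE, here is a minimal sketch that creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. It is a simplified, standard-library illustration and not the code used in the project; the function name smote_like_oversample, the parameter k, and the toy points are all assumptions made for this example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from minority-class feature tuples.

    Hypothetical helper for illustration only: a simplified version of
    the SMOTE idea, not a library implementation.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority-class points (made-up values, not project data).
minority_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like_oversample(minority_points, n_new=6)
```

Because each synthetic point lies between two real minority points, the oversampled class stays inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that the evaluation data keeps its original class balance.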
In this project, I applied my ML skills to build models that detect fraudulent credit card transactions in a highly imbalanced dataset, in which fewer than 1% of transactions are labelled as fraud. Random downsampling is used to handle the class imbalance.
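A minimal sketch of random downsampling, using only the Python standard library: every minority-class row is kept, and an equally sized random sample is drawn from the majority class. The function name random_downsample, the is_fraud field, and the toy transactions are assumptions for illustration, not the project's actual code or data.

```python
import random


def random_downsample(rows, label_key="is_fraud", seed=0):
    """Balance two classes by randomly sampling the larger one.

    Hypothetical helper: rows is a list of dicts with a 0/1 label
    stored under label_key.
    """
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    if len(positives) <= len(negatives):
        minority, majority = positives, negatives
    else:
        minority, majority = negatives, positives
    # Draw as many majority rows as there are minority rows, without replacement.
    sampled_majority = rng.sample(majority, len(minority))
    balanced = minority + sampled_majority
    rng.shuffle(balanced)
    return balanced


# Toy data: 5 fraud rows among 1,000 transactions (0.5% fraud).
transactions = [
    {"amount": float(i), "is_fraud": 1 if i % 200 == 0 else 0}
    for i in range(1000)
]
balanced = random_downsample(transactions)
```

The trade-off is that most majority-class rows are discarded, which keeps training fast and balanced but throws away information about legitimate transactions.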
This project develops an interactive map that visualizes flight paths in the United States. You can choose any airport in the U.S., and the map will show all the destinations you can fly to from the selected airport.
This analysis aims to develop a machine learning model that classifies heart disease using data collected through non-invasive procedures. The final model achieves 84% accuracy and has a false positive rate of 18%.
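For reference, accuracy and false positive rate can both be computed from the four confusion-matrix counts. The sketch below shows the arithmetic on a handful of made-up labels; the function names and the toy data are assumptions for illustration and do not reproduce the project's results.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means disease."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    # Share of all predictions that are correct.
    return (tp + tn) / (tp + fp + tn + fn)


def false_positive_rate(tp, fp, tn, fn):
    # Share of truly negative cases that were flagged as positive.
    return fp / (fp + tn)


# Toy labels (made up for this example).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)             # 0.75 on this toy data
fpr = false_positive_rate(*counts)  # 0.25 on this toy data
```

In a screening setting, the false positive rate describes how often healthy patients would be flagged for follow-up, so it is worth reporting alongside accuracy.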
[yaohong010@gmail.com]