Hi, my name is Yaohong Liang
I'm currently living in Chicago and working as a Data Analyst @ GoHealth. I view myself as a lifelong learner and data science enthusiast.

Know more

About me

I have a passion for discovering new ideas and applying my analytical mindset to tackle challenges. During my studies, I worked on several research projects, including firm characteristics analysis, computer vision (specifically human action detection), and reinforcement learning techniques for stock trading. I also served as a teaching assistant for QMSS 5072 (Modern Data Structure). Before my studies at Columbia, I worked with a European research institute, where I wrote scripts to improve data processing. Currently, as a Data Analyst at GoHealth, I am responsible for automating data processing workflows, refining the customer lifetime value (LTV) model, ensuring the integrity of SQL databases, and generating analytical insights to improve business performance.

In my free time, I enjoy playing sports, experimenting with new recipes, and reading. Weekends are reserved for time with my family and friends. Lately, I have also started exploring museums and playing board games with my friends. Basketball holds a special place in my heart, and I'm keen to connect with fellow enthusiasts in the city to play some games together.

Projects

Electric Vehicle Market Analysis

Electric vehicles have become very popular in the automotive market, and the global electric vehicle market has been expanding quickly in recent years. To understand how this market has developed, this project studies data on electric vehicles registered in WA from 1997 to 2023.

See Live Source Code

Document Clustering and Topic Modeling

Customer reviews are valuable assets for most companies. In this project, I applied natural language processing techniques to cluster customer reviews from an e-commerce site and identify the latent topics in the review texts.

See Live Source Code

Customer Churn Prediction

In this project, I developed supervised learning models for customer churn prediction. The labeled data is imbalanced, so I applied SMOTE for oversampling. I also applied encoding and standardization to transform the features. Logistic regression, KNN, and random forest algorithms are used for modeling. Model evaluation uses metrics such as F1-score, the ROC curve, and AUC.

See Live Source Code

Credit Card Fraud Detection

In this project, I applied my ML skills to build models that detect fraudulent credit card transactions on a highly imbalanced dataset, in which fewer than 1% of transactions are labeled as fraud. Random downsampling is used to handle the class imbalance.
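
For readers unfamiliar with random downsampling, here is a minimal sketch of the idea. It is not the project's code: the function name `random_downsample`, the `is_fraud` field, and the toy data are assumptions made for illustration only.

```python
import random


def random_downsample(rows, label_key="is_fraud", seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    `rows` is a list of dicts; `label_key` (hypothetical name) holds a 0/1 label.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]

    # Identify which class is smaller; keep all of it.
    minority, majority = sorted([positives, negatives], key=len)

    # Sample (without replacement) as many majority rows as minority rows.
    sampled_majority = rng.sample(majority, k=len(minority))

    balanced = minority + sampled_majority
    rng.shuffle(balanced)  # avoid leaving the classes grouped together
    return balanced


# Toy data: 5 fraud rows among 1,000 transactions (0.5% positive).
transactions = [{"id": i, "is_fraud": 1 if i < 5 else 0} for i in range(1000)]
balanced = random_downsample(transactions)
print(len(balanced), sum(r["is_fraud"] for r in balanced))
```

Downsampling discards most of the majority class, so it is usually applied to the training split only, with evaluation done on data that keeps the original class ratio.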

See Live Source Code

Flight Paths Visualization

This project develops an interactive map to visualize flight paths in the United States. You can choose any airport in the U.S., and the map will show all the destinations you can fly to from the selected airport.

See Live Source Code

Heart Disease Detection

This analysis aims to develop a machine learning model to classify heart disease using data collected through non-invasive procedures. The final model achieves 84% accuracy with a false positive rate of 18%.
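
As a reminder of how these two metrics are defined, the sketch below computes accuracy and false positive rate from true and predicted labels. The function name and the toy labels are hypothetical and are not taken from the project's data or results.

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    accuracy = (tp + tn) / len(pairs)
    # FPR = share of actual negatives that were predicted positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Toy labels: 1 = disease, 0 = no disease (made up for illustration).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
accuracy, fpr = accuracy_and_fpr(y_true, y_pred)
print(accuracy, fpr)
```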

See Live Source Code

Contact

[yaohong010@gmail.com]

Send Email