Yihua Liu, Data Science Developer in Orlando, FL, United States

Member since July 28, 2021
Yihua is a lead data scientist with over a decade of experience across various companies and teams. With several industry journal publications, speaking engagements, and extensive client-facing experience, he enjoys sharing and discussing his work with audiences of all backgrounds, including C-suite executives and non-technical stakeholders.


  • SimIS
    SQL, Machine Learning, Artificial Intelligence (AI), Python
  • Accenture
    Tableau, SQL






Preferred Environment

Python 3, SQL, Machine Learning, Analytics, Artificial Intelligence (AI), Tableau

The most amazing and highest-impact project I've worked on is Covered California, the state of California's health insurance marketplace.


  • Senior Data Scientist, SimIS

    2018 - 2021
    • Improved learner behavior prediction accuracy from a 21% baseline (recommended next action) to 66% on unseen test data via a long short-term memory recurrent neural network (LSTM RNN) model.
    • Predicted course completion with a Matthews correlation coefficient of 0.51 using Experience API (xAPI) student log data and built the corresponding explanatory model via factor analysis.
    • Co-authored an e-learning metadata analytics strategy distributed across the Department of Defense and chaired the stakeholder working group on its adoption and implementation.
    Technologies: SQL, Machine Learning, Artificial Intelligence (AI), Python
  • Business Analyst, Accenture

    2012 - 2013
    • Eliminated test case redundancies via Excel data analysis, cutting testing time by nearly 15%.
    • Led deliverable review sessions with high-level stakeholders across multiple teams to ensure business requirement compliance.
    • Performed ad hoc defect analysis to facilitate efficient prioritization of cross-functional effort.
    Technologies: Tableau, SQL


  • Educational Outcomes Prediction

    This project examines educational data—including student demographic information and academic records, school attributes, and teacher data—from kindergarten through third grade for a diverse cohort of students.

    In the exploratory phase, we look for ways to narrow the minority achievement gap and improve outcomes for all students. Next, we predict future test scores with several regression models. Finally, we predict whether students will graduate from high school and whether they will take a college entrance examination (SAT or ACT).

    Although these events occur nearly a decade after third grade for most students, the classification models perform relatively well, with ROC AUC (area under the receiver operating characteristic curve) scores between 0.7 and 0.8 on unseen test data.
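
For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen positive case (for example, a student who graduates) receives a higher predicted score than a randomly chosen negative case. The following is a minimal Python sketch of that pairwise definition. The function name `roc_auc`, the labels, and the scores are hypothetical illustrations, not the project's data or code.

```python
def roc_auc(labels, scores):
    """Pairwise ROC AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = graduated, 0 = did not graduate.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.6, 0.2, 0.55, 0.5]

print(roc_auc(labels, scores))  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so values between 0.7 and 0.8 indicate that the model orders most positive and negative pairs correctly.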


  • Languages

    SQL, Python 3, Python
  • Other

    Machine Learning, Analytics, Artificial Intelligence (AI), Mathematics, Applied Mathematics, Statistics, Big Data
  • Tools



  • Master's Degree in Statistics
    2015 - 2016
    University of Central Florida - Orlando, FL
  • Bachelor's Degree in Mathematics
    2008 - 2011
    University of California, Berkeley - Berkeley, CA
