Peiyao Li, Developer in Beijing, China

Peiyao Li

Verified Expert in Engineering

Data Scientist and Developer

Location
Beijing, China
Toptal Member Since
July 30, 2022

Peiyao has more than a decade of experience in data science as a project manager, data engineer, and machine learning engineer. He holds a master's degree in electrical engineering, and his strengths include Python and PostgreSQL. In recent years, he has worked for top hospitals and biotech companies in the US and China. Peiyao's interdisciplinary background makes him a leading candidate for the healthcare and pharmaceutical industries and associated domains.

Portfolio

BeiGene
Data Engineering, ETL, GIS, TensorFlow, Classification, Data Pipelines...
Global Health Drug Discovery Institute (GHDDI)
Artificial Intelligence (AI), Data Engineering, Machine Learning, R
PLA General Hospital
Data Engineering, Time Series Analysis, Kalman Filtering

Availability

Full-time

Preferred Environment

Linux, Python 3

The most amazing...

...project I've managed is creating the first large-scale clinical data warehouse for critical care in China and developing an early disease warning system.

Work Experience

Senior Research Manager

2021 - PRESENT
BeiGene
  • Developed a dashboard for drug target identification by leveraging commercial and public data.
  • Built a virtual screening platform using cutting-edge big data analytics.
  • Designed and implemented the data curation pipeline for in-house experiment datasets.
Technologies: Data Engineering, ETL, GIS, TensorFlow, Classification, Data Pipelines, Text Classification, Language Models, GPT

Principal Data Scientist

2020 - 2021
Global Health Drug Discovery Institute (GHDDI)
  • Led the data science team to curate public and commercial chemical, biomedical, and clinical databases to support in-house AI applications.
  • Managed the development of a ligand-based virtual screening dashboard.
  • Collaborated with Peking Union Medical College Hospital to develop precision medicine applications in rare diseases with natural language processing techniques.
Technologies: Artificial Intelligence (AI), Data Engineering, Machine Learning, R

Research Engineer

2016 - 2020
PLA General Hospital
  • Led the development of a machine learning-based algorithm platform for wearable devices. The platform supports sleep and cardiac status monitoring.
  • Collaborated with clinicians in the cardiac department to build a myocardial infarction death prediction model with an area under the curve (AUC) of 0.86 and a stent restenosis model with an AUC of 0.82.
  • Managed the construction of a clinical data warehouse for the emergency department, covering 30,000 patients, to support clinical model development.
Technologies: Data Engineering, Time Series Analysis, Kalman Filtering

Research Engineer

2012 - 2016
Washington University in St. Louis
  • Analyzed super-resolution microscopy images with customized Python scripts.
  • Optimized the analysis method through compressed sensing to improve image resolution.
  • Built the protein spatial structure from collected image data using the k-means algorithm, which pushed the resolution to 40 nanometers.
  • Implemented a regression model to quantify the calcium channel composition using the step photobleaching technique.
Technologies: MATLAB, Python 3, Computer Vision

Scientific Application Developer Co-op

2012 - 2012
Monsanto
  • Improved the web service response time by 50% after converting the point-of-sale service framework to the Apache CXF framework using Spring and Hibernate technologies.
  • Collected the Apache CXF framework's performance metrics and analyzed its performance.
  • Summarized key results and methodologies to prepare for the presentation.
  • Created 20 Apache CXF web service interfaces for the experiment curation service.
  • Collaborated with other co-ops to contribute to refactoring over 1,600 files in the field trial service.
Technologies: Java

Compound ADMET Properties Prediction and Visualization Platform

Compound ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties are crucial in drug design and discovery.

In this project, I used deep neural networks to build a multi-task model that predicts compound ADMET properties, and I visualized the results using the Plotly framework.

AI-based Sleep Stage Classification and Apnea Detection Wearable System

Sleep stage classification and apnea detection are crucial for sleep disorder diagnosis. For this project, I worked as the principal data scientist for system architecture design, data pipeline construction, and algorithm development. The system monitors patients' sleep status in real time using physiological features and helps clinicians decide on treatment. It has since been deployed in a top hospital in China.

Critical Care Data Warehouse Construction

In this project, we curated clinical data from eight critical care departments and constructed China's first-ever critical care data warehouse. The warehouse includes structured data from electronic health records and time series physiological signals. Several early warning systems for deterioration prediction have been developed based on this warehouse.

Languages

Python 3, Python, R, SQL, Java

Paradigms

Data Science, ETL, Agile Software Development

Storage

PostgreSQL, Data Pipelines

Other

Machine Learning, Microsoft Office, Data Visualization, Data Engineering, Artificial Intelligence (AI), Data Analysis, Data Analytics, Time Series Analysis, Classification, Text Classification, Kalman Filtering, Language Models, GPT, Signal Processing, Computer Vision

Libraries/APIs

PyTorch, Keras, TensorFlow

Tools

MATLAB, GIS

Platforms

Linux

2010 - 2012

Master's Degree in Electrical Engineering

Washington University in Saint Louis - Saint Louis, Missouri, United States

2006 - 2010

Bachelor's Degree in Electrical Engineering

Chongqing University of Posts and Telecommunications - Chongqing, China
