Yilong Li
Verified Expert in Engineering
Data Scientist and Developer
Cambridge, MA, United States
Toptal member since August 12, 2021
Yilong is a seasoned data scientist specializing in oncology and cancer genomics research. He completed his bioinformatics PhD at the University of Cambridge and has several publications in top scientific journals such as Nature, Science, and Cell. He then worked in various industry R&D roles, focusing on genomics algorithm development, bioinformatics data analysis, and machine learning. Yilong follows best programming practices in his coding and data science projects.
Experience
- Genomics - 12 years
- Bioinformatics - 12 years
- Transcriptomics - 12 years
- Oncology & Cancer Treatment - 12 years
- Computational Biology - 12 years
- R - 12 years
- CRISPR/Cas9 - 8 years
- Python - 5 years
Preferred Environment
Python, R, Linux
The most amazing...
...study I've published, as part of a large international cancer genome sequencing project, involved analyzing several terabytes of cancer genome sequencing data.
Work Experience
Senior Scientist | Bioinformatics
AbbVie Inc.
- Studied cell type changes in immunological diseases using single-cell RNA sequencing.
- Analyzed in vivo genome-wide CRISPR/Cas9 knock-out screening data.
- Explored differential gene expression data in clinical datasets.
VP | Platform and Collaborations
Totient
- Used large-scale genomics to identify two new target genes that were entered into drug development programs.
- Designed and led the development of a cloud-based data infrastructure for harmonizing and storing genomic data.
- Led the development of machine learning algorithms to deconvolute gene expression data from bulk tissue into its constituent cell types.
Principal Scientist | R&D
Seven Bridges Genomics
- Developed novel bioinformatics algorithms for identifying different genomic patterns (see projects under the Experience section).
- Created a suite of quality control algorithms for production-grade analysis of whole-genome sequencing data.
- Built an algorithm for memoizing scientific data analysis workflows (see "Detection of Insufficient Homology Regions in a Reference Sequence" project under the Experience section).
Research Assistant | Bioinformatics
University of Helsinki
- Developed an early somatic exome sequencing pipeline for analyzing cancer samples.
- Performed somatic structural variation analysis using cancer whole-genome sequencing data during my master's degree project.
- Analyzed gene expression microarray and somatic copy number data in cancer samples.
Experience
Algorithm for Detecting Repeated Genomic Regions
https://patents.google.com/patent/US20190214110A1/
A patent for the method has been filed (see the link mentioned above).
Detection of Insufficient Homology Regions in a Reference Sequence
https://patents.google.com/patent/US10545792B2/
I designed an algorithm that uses a Merkle tree-like data structure to track the provenance of a workflow's intermediate files and final results. The memoization algorithm allows interrupted workflows to be restarted rapidly. Furthermore, because object hashes uniquely identify intermediate and final data files, those files can be stored platform-wide, so redundant computational steps are avoided in a completely transparent fashion.
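The idea of hash-based workflow memoization can be illustrated with a minimal Python sketch. This is not the patented method, only a generic illustration under stated assumptions: `step_key`, `MemoStore`, `put`, and `run` are hypothetical names, results are kept in an in-memory dictionary rather than a platform-wide store, and each step's key is derived from its name, its parameters, and the keys of its inputs, so a key implicitly encodes the full provenance of a result.

```python
import hashlib
from typing import Any, Callable, Dict, Sequence


def step_key(step_name: str, params: str, input_keys: Sequence[str]) -> str:
    """Derive a key from a step's name, parameters, and input keys.

    Because each input key was derived the same way, the key covers the
    whole upstream history of the result (the Merkle tree-like property).
    """
    h = hashlib.sha256()
    for part in (step_name, params, *input_keys):
        data = part.encode("utf-8")
        # Length-prefix each part so different splits cannot collide.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


class MemoStore:
    """Hypothetical in-memory store mapping keys to computed results."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}
        self.executions = 0  # number of steps actually computed

    def put(self, data: bytes) -> str:
        """Register a raw input file; its key is the hash of its content."""
        key = hashlib.sha256(data).hexdigest()
        self._results[key] = data
        return key

    def get(self, key: str) -> Any:
        return self._results[key]

    def run(self, step_name: str, params: str,
            input_keys: Sequence[str], func: Callable[..., Any]) -> str:
        """Run a step only if no result with the same provenance exists."""
        key = step_key(step_name, params, input_keys)
        if key not in self._results:
            inputs = [self._results[k] for k in input_keys]
            self._results[key] = func(*inputs)
            self.executions += 1
        return key


# Illustrative usage: the second call is served from the store.
store = MemoStore()
raw = store.put(b"ACGTACGT")
first = store.run("count_base", "base=A", [raw], lambda d: d.count(b"A"))
second = store.run("count_base", "base=A", [raw], lambda d: d.count(b"A"))
```

In this sketch, a restarted workflow would recompute the same keys for its already-finished steps and skip them, while changing a parameter or an upstream input changes every downstream key and forces recomputation.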
Education
Ph.D. in Bioinformatics and Cancer Genomics (Conferred by the University of Cambridge)
The Wellcome Sanger Institute - Cambridge, UK
Master's Degree in Bioinformatics
University of Helsinki - Helsinki, Finland
Skills
Industry Expertise
Bioinformatics, Transcriptomics
Languages
Python, R, Perl
Platforms
Linux
Other
Biology, Genomics, Computational Biology, Molecular Biology, Data Science, Data Analysis, Single-cell RNA Sequencing, CRISPR/Cas9, Machine Learning, Algorithms, Oncology & Cancer Treatment, Workflow, Statistical Analysis, Statistical Data Analysis, Statistical Modeling