Zhuyi Xue, Machine Learning Developer in Vancouver, BC, Canada
Member since March 5, 2015
Zhuyi is a skilled Python developer with over seven years of experience. He is also proficient in JavaScript and Scala. He has gained experience across a wide array of technologies, with a strong focus on the PyData stack, Django, and cloud computing. He is detail-oriented and proactive, with great communication skills.

Preferred Environment

Tmux, Git, Emacs, Linux

The most amazing...

...web app I've built tracks usage from over eight supercomputers across Canada for my research group.


  • Computational Biologist

    2014 - 2019
    Genome Sciences Centre
    • Developed multiple pipelines in Python for analyzing large-scale genomics data.
    • Wrote tests with over 80% coverage for an analysis pipeline and developed a progress-tracking dashboard in Django.
    • Leveraged Google Cloud Platform to deploy large-scale computations on 15,000 cores.
    Technologies: PyData, Google Cloud Platform (GCP), Django, Machine Learning
  • Lead Developer

    2014 - 2015
    • Developed both the back end, using the webapp2 framework, and the front end, in AngularJS.
    • Designed the database in Google Cloud Datastore.
    • Developed a daily task pipeline that fetches data from Google Search Console and exports it to Google Cloud Datastore and Google BigQuery tables.
    • Designed and built the front-end dashboard with NVD3.
    • Developed the build-deploy-test workflow with Gulp.
    Technologies: Google App Engine


  • Answer Set for Stanford CS229 (Development)

    Worked out and provided answers to all problem sets in one of the most popular machine learning courses online, CS229 by Stanford.

    90 GitHub stars

    1.5k views biweekly

  • Sutton-barto-rl-exercises (Development)

    Learning reinforcement learning by implementing the algorithms from Reinforcement Learning: An Introduction.

    30 GitHub stars

  • Ncbitax2lin (Development)

    Convert the whole NCBI taxonomy into lineages of all known organisms.

    For example, the taxonomic lineage of human beings is Eukaryota > Chordata > Mammalia > Primates > Hominidae > Homo > Homo sapiens.

    37 GitHub stars

  • SamFormat (Development)

    SamFormat interprets genomics code on the client side in a highly responsive manner.

    Ranked second on Google for the search term "sam flag" as of this entry.

  • RLjs (Development)

    Reinforcement learning algorithms implemented in JavaScript and React, demonstrated with a Gridworld toy example.

  • TPR Parser (Development)

    TPR Parser closes a five-year-old feature-request ticket for MDAnalysis, a software package for analyzing molecular dynamics (MD) data in Python. TPR is the GROMACS file that contains all the structural topology information and run parameters of an MD system, encoded using the XDR standard (RFC 1014).

  • A Comprehensive Introduction To Your Genome With the SciPy Stack (Publication)
    Genome data is among the most widely analyzed data in bioinformatics. The SciPy stack offers a suite of popular Python packages for numerical computing, data transformation, analysis, and visualization, which makes it well suited to many bioinformatics analyses. In this tutorial, Toptal Software Engineer Zhuyi Xue walks us through some of the capabilities of the SciPy stack. He also answers some interesting questions about the human genome, including: How much of the genome is incomplete? How long is a typical gene?


  • Languages

    Python, JavaScript, HTML, SQL, Scala, CSS
  • Frameworks

    webapp2, Scrapy, Django, Flask, Django REST Framework, Apache Spark, Bootstrap, Redux, AngularJS
  • Libraries/APIs

    Pandas, Scikit-learn, SciPy, Matplotlib, NumPy, TensorFlow, PyData, D3.js, Google Maps SDK, jQuery, React
  • Tools

    Emacs, Pytest, IPython, BigQuery, Git, Tmux
  • Platforms

    Jupyter Notebook, Google App Engine, Linux, Heroku, Google Cloud Platform (GCP), Google Cloud Engine, Amazon Web Services (AWS), Meteor
  • Storage

    Google Cloud Storage, Google Cloud Datastore, MySQL, PostgreSQL, MongoDB
  • Other

    Machine Learning, Software Development


  • Master of Science degree in Data Science
    2010 - 2013
    University of Toronto - Toronto


  • Google Cloud Certified Professional Data Engineer
    August 2018 - September 2020
    Google Cloud
