Hire the top 3% of freelance developers
Zhuyi Xue

Vancouver, BC, Canada
Member since March 5, 2015
Zhuyi is a skilled Python developer with over seven years of experience. He is also proficient in JavaScript and Scala. He has gained experience in a wide array of technologies, with a strong focus on the PyData stack, Django, and cloud computing. He is very detail-oriented and proactive, with great communication skills.
Zhuyi is now available for hire
Portfolio
Experience
  • Python, 7 years
  • Bioinformatics, 5 years
  • Django, 4 years
  • Pandas, 4 years
  • Scrapy, 4 years
  • Google App Engine, 3 years
  • webapp2, 3 years
  • Django REST Framework, 2 years
Availability
Part-time
Preferred Environment
Linux, Emacs, Git, Screen
The most amazing...
...web app I've built tracks usage from over eight supercomputers across Canada for my research group.
Employment
  • Data Scientist
    Canada's Michael Smith Genome Sciences Centre
    2014 - PRESENT
    • Developed multiple pipelines for analyzing massive genomics data in Python.
    • Wrote tests with over 80% coverage for an analysis pipeline and developed a progress tracking dashboard in Django.
    • Created a web application for interpreting genomics code in AngularJS.
    • Leveraged Google Cloud Platform to deploy massive computation with 15k cores running across three continents.
    Technologies: Machine Learning, PyData Stack, Django, Google Cloud Platform
  • Lead Developer
    TotalWebmaster
    2014 - 2015
    • Developed both the back end, using the webapp2 framework, and the front end, in AngularJS.
    • Designed the database in Google Cloud Datastore.
    • Developed the daily pipeline that fetches data from Google Search Console and exports it to Google Cloud Datastore and Google BigQuery tables.
    • Designed and built a front-end dashboard with NVD3.
    • Developed the build-deploy-test workflow in Gulp.
    Technologies: Google App Engine
Experience
  • Answer Set for Stanford CS229 (Development)
    https://github.com/zyxue/stanford-cs229/

    Worked out and provided answers to all problem sets in one of the most popular machine learning courses online, CS229 by Stanford.

    10 GitHub stars

    1.2k views biweekly

  • Ncbitax2lin (Development)
    https://github.com/zyxue/ncbitax2lin

    Convert the whole NCBI taxonomy into lineages of all known organisms.

    For example, the taxonomic lineage of human beings is Eukaryota > Chordata > Mammalia > Primates > Hominidae > Homo > Homo sapiens.

    27 GitHub stars

  • RLjs (Development)
    https://rljs.herokuapp.com/

    Reinforcement learning algorithms implemented in JavaScript and React, demonstrated with Gridworld toy example.

  • Sutton-barto-rl-exercises (Development)
    https://github.com/zyxue/sutton-barto-rl-exercises

    Learning reinforcement learning by implementing the algorithms from "Reinforcement Learning: An Introduction."

    17 GitHub stars

  • SamFormat (Development)
    http://www.samformat.info/

    SamFormat interprets genomics code on the client side in a highly responsive manner.

    Ranked second on Google for the search "sam flag" as of this entry.

  • Rsempipeline (Development)
    https://github.com/bcgsc/rsempipeline

    Rsempipeline analyzes massive RNA-Seq data from the Gene Expression Omnibus (GEO) database in an efficient and economical manner. It automatically handles communication between the local host and a remote HPC system, as well as adaptive resource consumption. It also comes with a web-based visualization component for monitoring analysis progress.

  • TPR Parser (Development)
    https://github.com/MDAnalysis/mdanalysis/blob/master/package/MDAnalysis/topology/tpr/utils.py

    TPR Parser closes a five-year-old feature-request ticket for MDAnalysis, a software package for analyzing molecular dynamics (MD) data in Python. TPR is the Gromacs file that contains all the structural topology information and run parameters of an MD system, encoded in the XDR standard (RFC 1014).

  • A Comprehensive Introduction To Your Genome With the SciPy Stack (Publication)
    Genome data is one of the most widely analyzed datasets in the realm of Bioinformatics. The SciPy stack offers a suite of popular Python packages designed for numerical computing, data transformation, analysis and visualization, which is ideal for many bioinformatic analysis needs. In this tutorial, Toptal Software Engineer Zhuyi Xue walks us through some of the capabilities of the SciPy stack. He also answers some interesting questions about the human genome, including: How much of the genome is incomplete? How long is a typical gene?
Skills
  • Languages
    JavaScript, Python, SQL, HTML, Scala, CSS
  • Frameworks
    webapp2, Scrapy, Django, Machine Learning, Google Cloud Engine, Flask, Bootstrap, Apache Spark, Django REST Framework, Redux, AngularJS
  • Libraries/APIs
    Pandas, NumPy, Scikit-learn, SciPy, Matplotlib, TensorFlow, React, jQuery, D3.js, Google Maps SDK
  • Tools
    Git, pytest, IPython, BigQuery, Emacs
  • Platforms
    Google App Engine, Google Cloud Platform, Jupyter Notebook, Linux, Heroku, Amazon Web Services (AWS), Meteor
  • Storage
    Google Cloud Storage, Google Cloud Datastore, PostgreSQL, MySQL, Google Cloud, MongoDB
  • Other
    Bioinformatics
Education
  • Master of Science degree in Computational Biology
    University of Toronto - Toronto
    2010 - 2013