William Zhu, Developer in Shenzhen, Guangdong Province, China

William Zhu

Verified Expert in Engineering

Data Scientist and AI Developer

Shenzhen, Guangdong Province, China

Toptal member since November 18, 2021

Bio

William has three years of professional experience in data science and artificial intelligence. Key projects include text classification to identify hate speech in social media and fraud detection applications. He specializes in data analysis, data visualization, and predictive modeling, and his strongest programming language is Python. William is diligent and obsessed with quality.

Portfolio

Koe Koe Tech
Python 3, NumPy, Pandas, Matplotlib, Scikit-learn, Flask, Python, Algorithms...
Koe Koe Tech
Python 3, NumPy, Pandas, Matplotlib, Scikit-learn, Python, Predictive Modeling...

Experience

  • Python 3 - 4 years
  • NumPy - 3 years
  • Pandas - 3 years
  • Matplotlib - 3 years
  • Scikit-learn - 2 years
  • Predictive Modeling - 2 years
  • Text Classification - 2 years
  • Natural Language Processing (NLP) - 2 years

Availability

Full-time

Preferred Environment

Linux, Vim Text Editor, Jupyter Notebook, Python 3

The most amazing...

...thing I've developed is the Python client for Myanmar Tools, a library used by Google, Facebook, and others to detect Zawgyi encoding.

Work Experience

Lead AI Engineer

2019 - PRESENT
Koe Koe Tech
  • Led the development of algorithms to detect hate speech on social media.
  • Built and maintained a web API for the hate speech detector.
  • Provided technical assistance in labeling hate speech.
Technologies: Python 3, NumPy, Pandas, Matplotlib, Scikit-learn, Flask, Python, Algorithms, APIs, Data Science, Artificial Intelligence (AI), Text Classification, Jupyter Notebook, Vim Text Editor

Data Scientist

2019 - 2019
Koe Koe Tech
  • Analyzed data from different data sources for regular and ad hoc reporting.
  • Tuned the performance of more than 10 database tables and documented them.
  • Built a predictive model of referral fraud based on user behavior.
Technologies: Python 3, NumPy, Pandas, Matplotlib, Scikit-learn, Python, Predictive Modeling, Performance Tuning, Databases, Data Reporting, Data Science, Jupyter Notebook, Vim Text Editor

Experience

Hate Speech Detector for Social Media Comments

An NLP algorithm that detects hate speech in comments on social media. The training data is mostly in Burmese, so the project involves research on Burmese NLP, where the challenging part is tokenization. I led the development of the algorithm and worked with data engineers and web developers to integrate it into a social media monitoring app through a web API.
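
The tokenizer used in the project is not described here. Purely as an illustration of why tokenization is hard for Burmese, which does not put spaces between words, the sketch below shows one common workaround: overlapping character n-grams as features instead of word tokens. The function names `char_ngrams` and `bag_of_ngrams` and the default n-gram range are assumptions made for this example, not part of the project.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Return overlapping character n-grams of length n_min..n_max.

    N-grams that contain whitespace are skipped, so features never
    span the spaces that separate phrases.
    """
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            gram = text[i:i + n]
            if not any(ch.isspace() for ch in gram):
                grams.append(gram)
    return grams


def bag_of_ngrams(text, n_min=1, n_max=3):
    """Count character n-grams; the counts can feed a text classifier."""
    return Counter(char_ngrams(text, n_min, n_max))


# Hypothetical usage: bigram counts for a short Burmese string.
features = bag_of_ngrams("မြန်မာစာ", n_min=2, n_max=2)
```

Character n-grams avoid the need for a word segmenter, at the cost of a larger and noisier feature space. A syllable-level or dictionary-based tokenizer would be a reasonable alternative.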

Referral Fraud Detector for a Health App

An algorithm that detected referral fraud for a health app based on user behavior. I proposed the project to deal with fraud and worked with other data scientists and the marketing team to develop the algorithm. It was used briefly to check payments and later replaced with another solution.

Python Client for Myanmar Tools

https://github.com/google/myanmar-tools/tree/master/clients/python
Myanmar Tools is an open-source project created by a Google engineer to solve text encoding issues in Myanmar. Its main feature is a detector for Zawgyi, a legacy encoding used in Myanmar that is incompatible with Unicode. I ported the client code from Java to Python and published the package on PyPI.

Education

2009 - 2016

Bachelor's Degree in Medicine

University of Medicine, Mandalay - Mandalay, Myanmar

Certifications

DECEMBER 2017 - PRESENT

Learning from Data (Introductory Machine Learning) (CS115x)

edX

MAY 2017 - PRESENT

Introduction to Probability - The Science of Uncertainty (6.041x)

edX

MARCH 2017 - PRESENT

Introduction to Computer Science (CS50)

edX

Skills

Libraries/APIs

NumPy, Pandas, Matplotlib, Scikit-learn, Web API

Tools

Vim Text Editor, PyPI

Languages

Python 3, Python, SQL, C, JavaScript

Platforms

Linux, Jupyter Notebook

Frameworks

Flask

Storage

Databases

Other

Predictive Modeling, Data Science, Medicine, Algorithms, APIs, Performance Tuning, Data Reporting, Artificial Intelligence (AI), Text Classification, Tokenization, Natural Language Processing (NLP), Computer Science, Data Structures, Probability Theory, Statistics, Machine Learning, Generative Pre-trained Transformers (GPT)
