Khoa Nguyen, Developer in Melbourne, Victoria, Australia

Khoa Nguyen

Verified Expert in Engineering

Data Scientist Developer

Location
Melbourne, Victoria, Australia
Toptal Member Since
July 1, 2021

Khoa is a data scientist specializing in providing businesses with high-quality machine learning solutions. He helped deploy AI modules that assisted many advertising campaigns in optimizing their marketing strategies. He also contributed to several PoC projects that assessed the feasibility of solving practical business problems. Khoa collaborates well with others and also works effectively on his own.

Availability

Part-time

Preferred Environment

TensorFlow, Pandas, NumPy, SQL, Jupyter Notebook, Machine Learning, Python, Artificial Intelligence (AI), Data Mining, PyTorch

The most amazing...

...research I have done was using Artificial Intelligence prediction models to understand the interaction between cross-reactive T-cells and various pathogens.

Work Experience

Python Engineer

2021 - 2022
Captario
  • Optimized CPU and memory utilization for drug database infrastructure.
  • Developed Python code to optimize the modeling of drug projects.
  • Distributed computations using cloud infrastructure and Kubernetes.
Technologies: Python, Azure, Kubernetes, Pandas, Dask

Data Engineer

2021 - 2021
Yedda Co. Ltd.
  • Developed a module that handles data collection and database management for customers.
  • Provided UML diagrams and solutions for database architecture.
  • Performed research to optimize SQL queries and database performance.
Technologies: UML Diagrams, Visual Studio Code (VS Code)

Data Scientist

2020 - 2021
Knorex Co., Ltd.
  • Developed a bid landscape model for predicting optimal bidding prices in real-time display advertising, reaching a ROC AUC score of up to 80%.
  • Implemented the first version of the audience segmentation module to identify similar user groups for the ad targeting scheme.
  • Provided a comprehensive proof of concept (POC) of federated learning for predicting the click-through rate (CTR) of online advertising; the trade-off was a 15-20% decrease in AUC score in exchange for data privacy.
  • Provided a machine learning architecture that handles training and serving of the bid landscape model for multiple advertising campaigns.
  • Provided a proof of concept of a feature store using Feast to give data scientists a better data source for extracting quality features.
  • Analyzed data to extract relevant information for customers to improve their marketing strategies.
Technologies: Python 3, Pandas, TensorFlow, Plotly, Matplotlib, Data Mining, SQL, Anaconda, Data Analysis, Multiprocessing, Feast, Google Cloud, Visual Studio Code (VS Code), Machine Learning, Neural Networks

AI Engineer

2019 - 2020
Viralint Co. Ltd.
  • Analyzed metadata and lyrics of different songs to determine the user's configuration for optimal song generation.
  • Built a data system for crawling and labeling data for music generation.
  • Created a deep learning model for lyrics segmentation and semantic analysis.
  • Designed a multiprocessing system for deep learning tasks.
  • Provided a PoC of a module that automatically detects lens faults using OpenCV.
Technologies: Python 3, OpenCV, Boost.Python, TensorFlow, Pandas, NumPy, PyTorch, Visual Studio Code (VS Code), Machine Learning, Neural Networks, Computer Vision

Data Engineer Intern

2018 - 2018
Younet Media Social Enterprise
  • Supported building a database of users' social network information.
  • Implemented artificial intelligence models that detect human faces and estimate ages.
  • Assisted in building modules to extract data from the Facebook API.
Technologies: Python 3, TensorFlow, Visual Studio Code (VS Code)

Artificial Intelligence to Predict How T-cells Recognize Diverse Pathogens

https://github.com/crimsonthinker/crimson_research
A research project to understand how well existing ML models predict how T-cells recognize pathogens. I developed a framework that improves current models by generating precise data for training. The framework also accounts for cross-reactive T-cells, which provide immunity against multiple pathogens with similar structures.

Smart Bid Recommendation of Knorex's KAIROS Engine

An automated bidding price module for ad slot auctions written in Python. I took charge of preprocessing the training dataset, integrating and improving the machine learning model for the bidding price forecasting tasks, and providing a complete training and serving architecture for production. The module suggests an optimal bidding price for each ad slot auction while keeping the number of ad slots available for advertisers' creative displays as high as possible.

POC of Federated Learning for CTR Prediction

A proof of concept (POC) project assessing the capability of federated learning to preserve data privacy while training neural networks. I carried out the research and ran the experiments to check that neural network performance holds up when data is not exposed between organizations.

Automatically Finding the Number of Clusters for Large Datasets Based on Coresets

https://dl.acm.org/doi/10.1145/3421537.3421538
I was a member of the research team that investigated a visual-based method for automatically determining the number of clusters in a dataset. Results show that the approach effectively determines the correct number of clusters and recognizes irregularly shaped clusters during estimation.

Languages

C++, Python, Python 3, SQL, R, Java, Scala

Libraries/APIs

TensorFlow, Pandas, NumPy, OpenCV, PyTorch, Matplotlib, TensorFlow Deep Learning Library (TFLearn), Keras, Scikit-learn, Dask

Platforms

Visual Studio Code (VS Code), Anaconda, Jupyter Notebook, Azure, Kubernetes

Other

Machine Learning, Data Mining, Programming, Predictive Analytics, Predictive Modeling, Boost.Python, Artificial Intelligence (AI), Deep Learning, Linear Algebra, Big Data, Data Analysis, Multiprocessing, Data Preparation, Exploratory Data Analysis, Data Visualization, Streaming, Ray, Neural Networks, Computer Vision, Statistics, Feast, Calculus, UML Diagrams, Signal Processing, Biology, Research, Algorithms

Tools

Plotly, Scikit-image

Paradigms

Data Science

Storage

Google Cloud, Databases

Education

2022 - 2023

Master's Degree in Computer Science

University of Melbourne - Melbourne, VIC

2016 - 2020

Engineer's Degree in Computer Science

Ho Chi Minh University of Technology, Vietnam National University - Vietnam

Certifications

APRIL 2020 - PRESENT

Advanced Data Science with IBM Specialization

Coursera

SEPTEMBER 2018 - PRESENT

Global Project-based Learning

Shibaura Institute of Technology and Ho Chi Minh University of Technology
