Khoa Nguyen, Developer in Ho Chi Minh City, Ho Chi Minh, Vietnam

Khoa Nguyen

Verified Expert in Engineering

Data Scientist Developer

Ho Chi Minh City, Ho Chi Minh, Vietnam
Toptal Member Since
July 1, 2021

Khoa is a data scientist who specializes in providing high-quality machine learning solutions to businesses. Specifically, he has helped deploy AI modules that assisted many advertising campaigns in optimizing their marketing strategies, and he has also contributed several PoC projects outlining the feasibility of such solutions for practical business problems. Khoa is also a well-rounded individual who collaborates well with others and works well independently.


Yedda Co. Ltd.
UML Diagrams, Visual Studio Code (VS Code)
Knorex Co., Ltd.
Python 3, Pandas, TensorFlow, Plotly, Matplotlib, Data Mining, SQL, Anaconda...
Viralint Co. Ltd.
Python 3, OpenCV, Boost.Python, TensorFlow, Pandas, NumPy, PyTorch...




Preferred Environment

TensorFlow, Pandas, NumPy, SQL, Jupyter Notebook, Visual Studio Code (VS Code), Machine Learning, Python, Artificial Intelligence (AI), Data Mining

The most amazing...

...project I have developed is an automated bid landscape module of Knorex's KAIROS machine learning engine for digital advertising.

Work Experience

Data Engineer

2021 - 2021
Yedda Co. Ltd.
  • Developed a module that manages data collection and database management for customers.
  • Provided UML diagrams and solutions for database architecture.
  • Performed research to optimize SQL queries and database performance.
Technologies: UML Diagrams, Visual Studio Code (VS Code)

Data Scientist

2020 - 2021
Knorex Co., Ltd.
  • Predicted optimal bidding prices in a bid landscape model for real-time bidding display advertising with an AUC score of up to 80%.
  • Implemented the first version of the audience segmentation module to identify similar user groups for the ad targeting scheme.
  • Provided a comprehensive proof-of-concept (POC) of federated learning for predicting the click-through rate (CTR) of online advertising (trading a 15-20% decrease in AUC score for data privacy).
  • Provided a machine learning architecture handling training and serving sessions of the bid landscape model for multiple advertising campaigns.
  • Provided a proof-of-concept of a feature store using Feast to give data scientists a better data source for extracting quality features.
  • Analyzed data to extract relevant information for customers to improve their marketing strategies.
Technologies: Python 3, Pandas, TensorFlow, Plotly, Matplotlib, Data Mining, SQL, Anaconda, Data Analysis, Multiprocessing, Feast, Google Cloud, Visual Studio Code (VS Code), Machine Learning, Neural Networks

AI Engineer

2019 - 2020
Viralint Co. Ltd.
  • Analyzed metadata and lyrics of different songs to determine the user's configuration for optimal song generation.
  • Built a data system for crawling and labeling data for music generation.
  • Created a deep learning model for lyrics segmentation and semantic analysis.
  • Designed a multiprocessing system for deep learning tasks.
  • Provided a PoC of a module that automatically detects lens faults using OpenCV.
Technologies: Python 3, OpenCV, Boost.Python, TensorFlow, Pandas, NumPy, PyTorch, Visual Studio Code (VS Code), Machine Learning, Neural Networks, Computer Vision

Data Engineer Intern

2018 - 2018
Younet Media Social Enterprise
  • Supported building a database of users' social network information.
  • Implemented artificial intelligence models for detecting human faces and estimating ages.
  • Assisted in building modules to extract data from the Facebook API.
Technologies: Python 3, TensorFlow, Visual Studio Code (VS Code)

Smart Bid Recommendation of Knorex's KAIROS Engine

An automated bidding price module for ad slot auctions, written in Python. I took charge of preprocessing the training dataset, integrating and improving the machine learning model for the bidding price forecasting tasks, and providing a complete training and serving architecture for production. The purpose of this module is to suggest an optimal bidding price for different ad slot auctions while keeping the number of ad slots won for the advertiser's creative displays as high as possible.
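As an illustration only, and not the KAIROS implementation, the idea of recommending a bid from a bid landscape can be sketched as follows. The sketch assumes the landscape is a function mapping a bid price to a predicted win probability; the logistic curve, its parameters, the candidate bids, and the names `win_probability` and `recommend_bid` are all hypothetical. It picks the lowest candidate bid whose predicted win rate meets a target.

```python
import math


def win_probability(bid, midpoint=2.0, steepness=3.0):
    """Hypothetical bid landscape: logistic win probability as a function of bid."""
    return 1.0 / (1.0 + math.exp(-steepness * (bid - midpoint)))


def recommend_bid(candidate_bids, target_win_rate, landscape=win_probability):
    """Return the lowest candidate bid whose predicted win rate meets the target.

    Falls back to the highest candidate bid if no candidate meets the target.
    """
    if not candidate_bids:
        raise ValueError("candidate_bids must not be empty")
    ordered = sorted(candidate_bids)
    for bid in ordered:
        if landscape(bid) >= target_win_rate:
            return bid
    return ordered[-1]


# Illustrative values only: candidate bid prices and a target win rate.
candidate_bids = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
print(recommend_bid(candidate_bids, target_win_rate=0.8))
```

In a production setting the landscape would presumably be a trained model per campaign rather than a fixed curve; the selection step stays the same.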

Automatically Finding the Number of Clusters for Large Datasets Based on Coresets

I was a member of the research team that investigated a visual-based method for automatically determining the number of clusters in a dataset. Results show that the approach effectively determines the correct number of clusters and recognizes irregularly shaped clusters during estimation.

POC of Federated Learning for CTR Prediction

A proof of concept (POC) project assessing the capability of federated learning to preserve data privacy while training neural networks. I took charge of performing the research and conducting the experiments to measure how well neural network performance holds up when data is not exposed between organizations.
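For readers unfamiliar with the technique, here is a minimal sketch of one common federated learning scheme, federated averaging (FedAvg). It is an assumption that this is close to what the POC used; the project description does not say. Each organization trains on its own data and shares only model weights, which are averaged in proportion to local dataset size. The tiny logistic-regression click model, the toy data, and all function names are hypothetical.

```python
import math


def local_update(weights, data, lr=0.5, epochs=5):
    """Train a tiny logistic-regression click model on one client's private data."""
    w = list(weights)
    for _ in range(epochs):
        for features, clicked in data:
            z = sum(wi * xi for wi, xi in zip(w, features))
            pred = 1.0 / (1.0 + math.exp(-z))
            # Stochastic gradient step on the log loss.
            for i, xi in enumerate(features):
                w[i] -= lr * (pred - clicked) * xi
    return w


def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def federated_round(global_weights, clients):
    """One round: clients train locally; only their weights reach the server."""
    updates = [local_update(global_weights, data) for data in clients]
    sizes = [len(data) for data in clients]
    return federated_average(updates, sizes)


# Toy data: features are [bias, signal]; raw rows never leave a client.
client_a = [([1.0, 1.0], 1), ([1.0, 0.0], 0)]
client_b = [([1.0, 1.0], 1), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]

weights = [0.0, 0.0]
for _ in range(3):
    weights = federated_round(weights, [client_a, client_b])
print(weights)
```

The server only ever sees weight vectors, which is the privacy property being traded against model quality.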


C++, Python, Python 3, SQL, R, Java, Scala


TensorFlow, Pandas, NumPy, OpenCV, PyTorch, Matplotlib, TensorFlow Deep Learning Library (TFLearn), Keras


Visual Studio Code (VS Code), Anaconda, Jupyter Notebook


Machine Learning, Data Mining, Programming, Predictive Analytics, Predictive Modeling, Boost.Python, Artificial Intelligence (AI), Deep Learning, Linear Algebra, Big Data, Data Analysis, Multiprocessing, Data Preparation, Exploratory Data Analysis, Data Visualization, Streaming, Ray, Neural Networks, Computer Vision, Statistics, Feast, Calculus, UML Diagrams, Signal Processing


Plotly, Scikit-image


Data Science


Google Cloud

2016 - 2020

Engineer's Degree in Computer Science

Ho Chi Minh University of Technology, Vietnam National University - Vietnam


Advanced Data Science with IBM Specialization



Global Project-based Learning

Shibaura Institute of Technology and Ho Chi Minh University of Technology