Khoa Nguyen, Data Scientist Developer in Ho Chi Minh City, Ho Chi Minh, Vietnam

Member since July 1, 2021
Khoa is a data scientist who specializes in providing high-quality machine learning solutions to businesses. Specifically, he has helped deploy AI modules that assisted many advertising campaigns in optimizing their marketing strategies, and he has contributed several proof-of-concept (PoC) projects demonstrating how such solutions can solve practical business problems. Khoa is also a well-rounded individual who collaborates well with others and can work independently.
Portfolio

  • Yedda Co. Ltd.
    PostgreSQL, Go, Software Engineering, UML Diagrams, Scrum, Visual Studio Code...
  • Knorex Co., Ltd.
    Python 3, Pandas, TensorFlow, Plotly, Matplotlib, Data Mining, SQL, Anaconda...
  • Viralint Co. Ltd.
    Python 3, OpenCV, Boost.Python, TensorFlow, Pandas, NumPy, PyTorch...

Location

Ho Chi Minh City, Ho Chi Minh, Vietnam

Availability

Part-time

Preferred Environment

TensorFlow, Pandas, NumPy, SQL, Jupyter Notebook, Visual Studio Code, Machine Learning, Python, Artificial Intelligence (AI), Data Mining

The most amazing...

...project I have developed is an automated bid landscape module of Knorex's KAIROS machine learning engine for digital advertising.

Employment

  • Data Engineer

    2021 - 2021
    Yedda Co. Ltd.
    • Developed a module that manages data collection and database management for customers.
    • Provided UML diagrams and solutions for database architecture.
    • Performed research to optimize SQL queries and database performance.
    Technologies: PostgreSQL, Go, Software Engineering, UML Diagrams, Scrum, Visual Studio Code, Back-end
  • Data Scientist

    2020 - 2021
    Knorex Co., Ltd.
    • Predicted optimal bidding prices in a bid landscape model for real-time bidding display advertising with an AUC score of up to 80%.
    • Implemented the first version of the audience segmentation module to identify similar user groups for the ad targeting scheme.
    • Provided a comprehensive proof of concept (PoC) of federated learning for predicting the click-through rate (CTR) of online advertising, trading a 15-20% decrease in AUC score for data privacy.
    • Provided a machine learning architecture that handles training and serving sessions of the bid landscape model for multiple advertising campaigns.
    • Provided a proof of concept of a feature store using Feast to give data scientists a better data source for extracting quality features.
    • Analyzed data to extract relevant information for customers to improve their marketing strategies.
    Technologies: Python 3, Pandas, TensorFlow, Plotly, Matplotlib, Data Mining, SQL, Anaconda, Data Analysis, Multiprocessing, Feast, Google Cloud, Visual Studio Code, PyMongo, Machine Learning, Neural Networks, MongoDB, Media, Back-end
  • AI Engineer

    2019 - 2020
    Viralint Co. Ltd.
    • Analyzed metadata and lyrics of different songs to determine the user's configuration for optimal song generation.
    • Built a data system for crawling and labeling data for music generation.
    • Created a deep learning model for lyrics segmentation and semantic analysis.
    • Designed a multiprocessing system for deep learning tasks.
    • Provided a PoC of a module that automatically detects lens faults using OpenCV.
    Technologies: Python 3, OpenCV, Boost.Python, TensorFlow, Pandas, NumPy, PyTorch, Visual Studio Code, Machine Learning, Neural Networks, Computer Vision, Back-end
  • Data Engineer Intern

    2018 - 2018
    Younet Media Social Enterprise
    • Supported building a database of users' social network information.
    • Implemented artificial intelligence models that detect human faces and estimate ages.
    • Assisted in building modules to extract data from the Facebook API.
    Technologies: Python 3, Flask, TensorFlow, Visual Studio Code, Back-end

Experience

  • Smart Bid Recommendation of Knorex's KAIROS Engine

    An automated bidding price module for ad slot auctions written in Python. I took charge of preprocessing the training dataset, integrating and improving the machine learning model for bidding price forecasting, and providing a complete training and serving architecture for production. The module suggests an optimal bidding price for different ad slot auctions while ensuring that the number of ad slots for advertisers' creative displays is as high as possible.

  • Automatically Finding the Number of Clusters for Large Datasets Based on Coresets
    https://www.researchgate.net/publication/346101078_Automatically_Finding_the_Number_of_Clusters_for_Large_Datasets_based_on_Coresets

    I was a member of the research team that investigated a visual-based method for automatically determining the number of clusters in a dataset. Results show that the approach effectively determines the correct number of clusters and recognizes irregularly shaped clusters during estimation.

  • POC of Federated Learning for CTR Prediction

    A proof of concept (PoC) project assessing the capability of federated learning to preserve data privacy while training neural networks. I took charge of the research and experiments, which assessed how well neural network performance holds up when data is not exposed between organizations.

Skills

  • Languages

    C++, Python, Python 3, SQL, R, Go, Java, Scala
  • Libraries/APIs

    Pandas, NumPy, TensorFlow, OpenCV, PyMongo, Matplotlib, PyTorch
  • Platforms

    Visual Studio Code, Anaconda, Jupyter Notebook
  • Other

    Machine Learning, Data Mining, Data Structures, Programming, Predictive Analytics, Predictive Modeling, Boost.Python, Artificial Intelligence (AI), Deep Learning, Linear Algebra, Big Data, Software Engineering, Data Analysis, Multiprocessing, Data Preparation, Exploratory Data Analysis, Data Visualization, Data Engineering, Streaming, Ray, Neural Networks, Computer Vision, Back-end, Statistics, Feast, Calculus, UML Diagrams, Signal Processing
  • Tools

    Plotly, Scikit-image, Apache Airflow
  • Paradigms

    Data Science, Scrum
  • Storage

    Google Cloud, MongoDB, PostgreSQL
  • Frameworks

    Flask, gRPC

Education

  • Engineer's Degree in Computer Science
    2016 - 2020
    Ho Chi Minh University of Technology, Vietnam National University - Vietnam

Certifications

  • Advanced Data Science with IBM Specialization
    APRIL 2020 - PRESENT
    Coursera
  • Global Project-based Learning
    SEPTEMBER 2018 - PRESENT
    Shibaura Institute of Technology and Ho Chi Minh University of Technology
