Yen Lok, Developer in Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia

Yen Lok

Verified Expert in Engineering

Data Scientist and Machine Learning Developer

Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia

Toptal member since June 30, 2022

Bio

Yen is a data scientist, a Python enthusiast, and a machine learning (ML) developer. He has worked as a data scientist at several large multinational companies, where he developed data pipelines and ML models. His combination of business and scientific domain knowledge with technical skills allows him to provide his clients with actionable, predictive, and explanatory models.

Portfolio

Freelance
Artificial Intelligence (AI), BigQuery, Data Analytics, Python 3, Docker...
AirAsia Group Berhad
Python 3, Machine Learning, Big Data, Google Cloud Platform (GCP), SQL...
Neural Technologies
Python 3, Machine Learning, Deep Learning, Analytics, GRAPH, Python...

Experience

  • Python - 5 years
  • Time Series Analysis - 5 years
  • Machine Learning - 5 years
  • Data Science - 5 years
  • Data Modeling - 4 years
  • Data Analysis - 4 years
  • Financial Markets - 3 years
  • Data Analytics - 3 years

Availability

Part-time

Preferred Environment

PyCharm, Slack, Python 3, SQL, Google Cloud Platform (GCP), Machine Learning, Artificial Intelligence (AI), Analytics, Time Series Analysis, Predictive Modeling

The most amazing...

...thing I've developed is a sales propensity model for campaign ad targeting that achieved more than four times the uplift of the standard model.

Work Experience

Independent Data Science Consultant

2022 - PRESENT
Freelance
  • Built streaming data pipelines to index blockchain smart contracts via Pub/Sub and Apache Beam.
  • Trained and optimized deep learning (DL) models, such as LSTM and Time2Vec transformer models, to make time-series predictions for a ski tracking app.
  • Engineered features and trained ML models for multiple use cases, such as predicting the churn rate for an insurance product and the likelihood of an employee changing jobs within a year.
  • Built and deployed a dynamic pricing model for an eCommerce product using GA data.
  • Designed and automated a dashboard in Google Data Studio, including data preparation automation in BigQuery/Python and web scraping for external data sources.
  • Deployed content generation APIs using the OpenAI GPT-3 model and text-to-image models such as DALL·E.
Technologies: Artificial Intelligence (AI), BigQuery, Data Analytics, Python 3, Docker, Google Cloud Platform (GCP), Machine Learning, Time Series Analysis, Deep Learning, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Pandas, NumPy, Data Scraping, Data Engineering, Data Pipelines, Predictive Analytics, Dashboards, Quantitative Analysis, REST APIs, TensorFlow, Keras, Python, Amazon Web Services (AWS), Scikit-learn, Data Visualization, Google BigQuery, Data Analysis, Data Modeling, Financial Forecasting, Computational Finance, Image Generation

Data Scientist

2021 - 2022
AirAsia Group Berhad
  • Developed an automatic machine learning (ML) propensity model for marketing ad-targeting.
  • Applied time series analytics to determine the department’s budget allocation.
  • Built a fraud detection system to identify account takeover fraud.
  • Constructed a central feature store in BigQuery for predictive modeling.
  • Built an app banner recommendation model to improve banner click-through rate.
Technologies: Python 3, Machine Learning, Big Data, Google Cloud Platform (GCP), SQL, Recommendation Systems, Python, Docker, APIs, Data Science, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Natural Language Toolkit (NLTK), SpaCy, Pandas, NumPy, GIS, Data Scraping, Jupyter Notebook, Automation, Scikit-learn, Data Engineering, Data Warehousing, Data Architecture, ETL, Data Pipelines, ETL Tools, Infrastructure as Code (IaC), Predictive Analytics, Dashboards, Quantitative Analysis, Quantitative Modeling, REST APIs, Google BigQuery, Data Analysis, Data Modeling, Marketing

Machine Learning (ML) Engineer

2020 - 2021
Neural Technologies
  • Utilized a mix of supervised, unsupervised, and categorical modeling techniques to deliver the best predictive performance.
  • Built a full suite of customized plots to analyze and interpret model outcomes.
  • Set up a full software development life cycle, including code reviews, version management, model building, and testing.
Technologies: Python 3, Machine Learning, Deep Learning, GRAPH, Analytics, Python, Data Science, Pandas, NumPy, Jupyter Notebook, Automation, Scikit-learn, Predictive Analytics, Quantitative Analysis, Quantitative Modeling, REST APIs, TensorFlow, Keras, Neo4j, Risk Analysis

Data Scientist

2019 - 2020
AIA Group
  • Performed a deep-dive analysis on company vitality using pseudo control matching to identify actionable insights.
  • Profiled vitality member behavior using graph sequencing and segmentation techniques to improve the vitality member engagement rate.
  • Developed a straight-through process for reimbursement claims using predictive analytics to improve claims settlement turnaround time.
  • Built an anomaly detection model to identify outlier claims to aid the team in devising strategies to reduce overall claim costs.
Technologies: Python 3, Machine Learning, Analytics, Data Analytics, Predictive Modeling, SQL, Python, Data Science, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Natural Language Toolkit (NLTK), SpaCy, Pandas, NumPy, Data Scraping, Jupyter Notebook, Automation, Scikit-learn, Data Engineering, Data Warehousing, Data Architecture, ETL, Data Pipelines, ETL Tools, Predictive Analytics, Dashboards, Quantitative Analysis, Quantitative Modeling, Data Visualization, Microsoft Power BI, Data Analysis, Data Modeling, Marketing

Data Scientist

2018 - 2019
Logistic Worldwide Express
  • Extracted, manipulated, and interpreted data from different data sources, such as the local postal service address database and OpenStreetMap data, to create data pipelines for building dashboards.
  • Developed Power BI dashboards to help the management team track business performance and make data-driven decisions.
  • Applied graph algorithms using Neo4j to optimize shipment routing, with an estimated delivery time savings of 10% to 20%.
Technologies: Python, Neo4j, Python 3, SQL, Machine Learning, Artificial Intelligence (AI), Microsoft Power BI, Data Analysis, Data Modeling

Post-doctoral Researcher in Time Series ML

2017 - 2018
Aberdeen Standard Investments
  • Applied time-series modeling and ML techniques to improve the multi-asset investment decision process.
  • Built a backtesting framework for ML using MATLAB.
  • Developed a novel approach to select predictive models.
Technologies: MATLAB, Machine Learning, Time Series Analysis, Research, Financial Markets, Data Science, Quantitative Analysis, Quantitative Finance, Quantitative Modeling, Quantitative Risk Analysis, Financial Forecasting, Risk Analysis, Computational Finance

Experience

Auto Machine Learning (ML) Propensity Model

I developed an end-to-end predictive pipeline to optimize marketing ad-targeting that has achieved over four times the standard model uplift. It includes the following components:

• A feature store to cater to the data requirements of this and many other predictive models.
• A model validation framework to track past and ongoing model performance.
• Scheduled model retraining to reduce the effects of data drift.
• Model deployment via Dataflow in Google Cloud Platform (GCP).

Banner Recommender System

I was the developer of a banner recommendation system in an eCommerce app. The code was built in GCP; it derived banner features and then matched them to users based on their behavior. It was deployed as a Docker container in Google Cloud Run to ensure scalability.

Fraud Detection System

I was the machine learning engineer who created a fraud detection system for the telecommunications industry. I utilized a mix of supervised, unsupervised, and categorical modeling techniques to deliver the system. I also applied explainable AI to interpret the model's outcomes and used Matplotlib to build a set of customized plots to display the results.

Education

2013 - 2017

Ph.D. Degree in Actuarial Mathematics

Heriot-Watt University - Edinburgh, United Kingdom

2012 - 2013

Master's Degree in Actuarial Science

Heriot-Watt University - Edinburgh, United Kingdom

2007 - 2010

Bachelor's Degree in Actuarial Science

University of Kent - Canterbury, United Kingdom

Skills

Libraries/APIs

Pandas, NumPy, Scikit-learn, Natural Language Toolkit (NLTK), SpaCy, REST APIs, TensorFlow, Keras, Matplotlib

Tools

BigQuery, Microsoft Power BI, PyCharm, MATLAB, Apache Beam, Excel 2013, Amazon SageMaker, GIS, AWS Glue

Languages

Python 3, SQL, Python, R

Paradigms

Automation, ETL

Platforms

Jupyter Notebook, Google Cloud Platform (GCP), RStudio, Docker, Amazon Web Services (AWS)

Storage

Data Pipelines, Google Cloud, Neo4j

Industry Expertise

Marketing

Frameworks

Flask

Other

Machine Learning, Artificial Intelligence (AI), Time Series Analysis, Predictive Modeling, Data Analytics, Data Science, Data Engineering, Data Warehousing, Data Architecture, ETL Tools, Predictive Analytics, Google BigQuery, Statistics, Financial Modeling, Mathematics, Research, Financial Markets, Deep Learning, Big Data, Recommendation Systems, Natural Language Processing (NLP), Data Scraping, Data Visualization, Infrastructure as Code (IaC), Dashboards, Quantitative Analysis, Quantitative Modeling, Data Analysis, Data Modeling, Financial Forecasting, Risk Analysis, Statistical Modeling, Generative Pre-trained Transformers (GPT), Computational Finance, Risk Models, Enterprise Risk Management (ERM), Stochastic Modeling, Probability Theory, Bayesian Inference & Modeling, APIs, Algorithmic Trading, Algorithms, Kalman Filtering, Explainable Artificial Intelligence (XAI), Quantitative Finance, Quantitative Risk Analysis, Analytics, GRAPH, Image Generation
