Joslyn Lim, Developer in Kuala Lumpur Federal Territory of Kuala Lumpur, Malaysia

Joslyn Lim

Verified Expert in Engineering

Data Developer

Location
Kuala Lumpur Federal Territory of Kuala Lumpur, Malaysia
Toptal Member Since
September 28, 2022

Joslyn is a seasoned data practitioner with demonstrated experience across multiple industries, including technology consulting and customer service. With her academic background in applied statistics and a skillset in machine learning, data analytics, Python, and SQL, Joslyn has delivered numerous projects with positive business impacts on customers.

Portfolio

Dattel
Agile, Amazon Machine Learning, Amazon QuickSight, Flutter, .NET 3...
Kognitiv
Python 3, PostgreSQL, Tableau, Microsoft SQL Server, Bitbucket, Jira, FastAPI...
Stop the Traffik
Amazon SageMaker, Python 3, PyTorch, Docker, IBM Cloud, MongoDB, Programming...

Availability

Part-time

Preferred Environment

Visual Studio Code (VS Code), Python 3, SQL

The most amazing...

...project I've worked on was building a data lake, data warehouse, dashboards, and four machine learning use cases within 12 months with over 20 team members.

Work Experience

Chief Technology Officer

2023 - PRESENT
Dattel
  • Led product development for an ad copy and ad creative generator with a team of five using GPT-3.5 and Stable Diffusion. The MVP was showcased at a regional marketing event, which brought in around 100 uses in the first week.
  • Built the IT policy and data governance framework from scratch with GCR personnel to safeguard company assets from cyber threats.
  • Upgraded the development cycle for both the engineering and data teams to adopt CI/CD practices, saving approximately 20% of development time in the first stage.
Technologies: Agile, Amazon Machine Learning, Amazon QuickSight, Flutter, .NET 3, Software Architecture, Product Consultant, Marketing, Digital Advertising, Strategy, OpenAI

Data Science Manager

2023 - 2023
Kognitiv
  • Performed pre- and post-campaign analysis for both ATL and BTL campaigns using SQL, Python, and R to assess the feasibility and effectiveness of the campaigns, which achieved an average uplift of 2-3% during non-festive seasons.
  • Led the analytics team in post-implementation migration and integration testing of data and machine learning models for regional clients using Python, SQL, and X-Ray, which saved the effort of up to two FTEs (manual testers).
  • Advised both internal and external stakeholders on data analytics methodologies, campaign A/B testing, and technology architecture to achieve desirable outcomes.
Technologies: Python 3, PostgreSQL, Tableau, Microsoft SQL Server, Bitbucket, Jira, FastAPI, Unit Testing, Data Migration, SQL, Software Architecture
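The campaign uplift figure above can be illustrated with a minimal sketch. It assumes the uplift is a relative lift in conversion rate of a treated group over a control group, which is one common reading and not confirmed by the profile. The function names and all numbers are hypothetical.

```python
import math


def relative_uplift(control_conv, control_n, treated_conv, treated_n):
    """Relative lift of the treated conversion rate over the control rate."""
    control_rate = control_conv / control_n
    treated_rate = treated_conv / treated_n
    return (treated_rate - control_rate) / control_rate


def two_proportion_z(control_conv, control_n, treated_conv, treated_n):
    """Z statistic for the difference of two conversion rates (pooled standard error)."""
    pooled = (control_conv + treated_conv) / (control_n + treated_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / treated_n))
    return (treated_conv / treated_n - control_conv / control_n) / std_err


# Hypothetical campaign: 10,000 customers in each group.
uplift = relative_uplift(1000, 10_000, 1025, 10_000)
z_score = two_proportion_z(1000, 10_000, 1025, 10_000)
print(f"relative uplift: {uplift:.1%}, z = {z_score:.2f}")
```

Reporting the z statistic next to the uplift helps show whether a lift of this size is distinguishable from noise at the given sample size.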

Data Scientist

2023 - 2023
Stop the Traffik
  • Conducted a user experience discovery workshop to identify user pain points and needs.
  • Developed an entity sentiment model to predict article sentiments for business users using transfer learning.
  • Collaborated with the client and other Toptalers to bring the machine learning model into production using IBM Cloud.
Technologies: Amazon SageMaker, Python 3, PyTorch, Docker, IBM Cloud, MongoDB, Programming, Scikit-learn, Large Language Models (LLMs), OpenAI GPT-4 API, Modeling, Exploratory Data Analysis, EDA, Unstructured Data Analysis, Spreadsheets, APIs, Amazon EC2

Data Scientist | Management Consulting Services

2023 - 2023
D3 Management LLC
  • Integrated third-party LMS data into a SharePoint list and PostgreSQL for insight generation using APIs, Power Automate, and Airflow.
  • Built ETL pipelines to load on-premises core data into PostgreSQL using Power Automate.
  • Created Power BI dashboards for end users to shorten the time to insight and support informed decisions.
Technologies: Microsoft Power BI, SharePoint, Apache Airflow, Microsoft Power Automate, Heroku, PostgreSQL, Learning Management Systems (LMS), BI Reporting, Integration, Programming, Data Cleaning, APIs, Amazon EC2, Software Architecture

Lead Analyst

2021 - 2022
AXA Group
  • Introduced analytics into the digital sales team to deploy a retention machine learning model for motor policy, digital sales dashboard, and policy benefit package optimization, which improved the annual gross premium with a lower loss ratio.
  • Led the implementation of MLOps practices using AWS to monitor, track, maintain, and improve existing and newly productionized machine learning models, shortening the development cycle from six to three months with greater transparency and visibility.
  • Joined forces with the finance and actuarial team to implement International Financial Reporting Standard (IFRS) 17 using a data lake and reporting tools like SAP Webi to provide timely and granular insights to stakeholders.
  • Headed data literacy initiatives across the enterprise by curating learning programs, assessments, mentoring programs, hiring, retention, and various engagement activities that raised the overall data literacy index from 33 to 50 (out of 100).
Technologies: Python 3, Redshift, SQL, AWS Glue, SAP Business Intelligence (BI), Databases, GitHub, Microsoft Flow, Microsoft Power BI, SharePoint 365, Python, Data Analysis, Amazon Web Services (AWS), Amazon Machine Learning, Amazon S3 (AWS S3), Project Management, Analytics, Business Analysis, ETL, Excel 365, SharePoint, Pandas, NumPy, Data Wrangling, Jupyter, Machine Learning Operations (MLOps), Amazon SageMaker, Dashboards, Business Intelligence (BI), Amazon QuickSight, Data Pipelines, Regression, Classification, Linux, XGBoost, Data Scientist, Reports, Data Reporting, Cloud, AWS Fargate, Excel 2010, BI Reporting, Integration, Programming, Scikit-learn, Modeling, Exploratory Data Analysis, EDA, Consumer Behavior, Data Cleaning, Large Data Sets, Data Gathering, Amazon EC2

Data Science Manager

2019 - 2021
EY
  • Led, managed, and won more than $1 million in data analytics projects with the regional DnA team.
  • Headed and launched multiple analytics projects, including attrition modeling, language modeling, optimization, and dashboards, with teams of between 2 and 10 people that delivered positive economic impact to society and organizations.
  • Was in charge of people management for the regional team and was responsible for establishing and sustaining learning programs, events, and mentorship to support the team's continuous personal and career growth.
Technologies: Python 3, Spark ML, Spark SQL, Azure Databricks, Azure, Microsoft Power BI, Tableau, IT Consulting, Agile Project Management, Ubuntu 16.04, R, Jupyter Notebook, Python, Data Analysis, Data Visualization, Project Management, Analytics, Business Analysis, ETL, Pandas, NumPy, Data Wrangling, Jupyter, Data Engineering, Web Scraping, Dashboards, Business Intelligence (BI), Big Data, Data Pipelines, Artificial Neural Networks (ANN), Regression, Deep Learning, Classification, Neural Networks, Keras, Linux, XGBoost, Data Scraping, Data Scientist, Reports, Heatmaps, Cloud, Text Classification, Excel 2010, BI Reporting, Integration, Programming, User Interface (UI), Scikit-learn, Modeling, Exploratory Data Analysis, EDA, Data Cleaning, Large Data Sets, Unstructured Data Analysis, Data Gathering, Spreadsheets, Azure Machine Learning, TensorFlow, APIs, Amazon EC2, BERT, Custom BERT, Software Architecture, Strategy, Large Language Models (LLMs)

Senior Associate – Data Science

2018 - 2019
EY
  • Developed a machine learning optimization model using ensemble models (XGBoost with Random Forests) for a refinery, estimated to save up to $100,000 for regional factories.
  • Created a social listening dashboard using text analysis for a utility company to improve digital reputation score (DRS) by enabling prompt responses to negative sentiments on social media platforms.
  • Built a face recognition and tracking MVP for intruder detection using transfer learning for a smart city prototype.
Technologies: Python 3, R, Brandwatch, Microsoft Power BI, RStudio, Python, Data Analysis, Data Extraction, Analytics, Business Analysis, Pandas, NumPy, Data Wrangling, Jupyter, Web Scraping, Dashboards, Business Intelligence (BI), Artificial Neural Networks (ANN), Regression, Deep Learning, Classification, Neural Networks, Keras, Linux, XGBoost, Data Scraping, Data Scientist, Reports, Heatmaps, Cloud, Excel 2010, BI Reporting, Programming, Scikit-learn, Modeling, Exploratory Data Analysis, EDA, Data Cleaning, Data Gathering, Amazon EC2

Head of Research and Analytics

2016 - 2018
Dattel
  • Led a two-way function translating business strategy into analytics terms for key counterparts across lines of business and functional areas, such as software engineering and marketing, after the merger between True Vox Asia and Dattel.
  • Designed, managed, and delivered data experiments, collections, and analytics services on a consumer intelligence platform as part of the product offerings to small and medium-sized businesses.
  • Managed multiple projects hands-on, including customer segmentation, brand advocacy, psychometric profiling, and socioeconomic study publications, to enable strategy formation for consumer brands to acquire, engage, and retain their customers.
Technologies: Python 3, PostgreSQL, Agile Project Management, SQL, R, Git, Design Thinking, Python, Analytics, Pandas, NumPy, Data Wrangling, Jupyter, A/B Testing, Product Development, Regression, Linux, XGBoost, Data Scientist, Heatmaps, Excel 2010, Programming, Modeling, Exploratory Data Analysis, EDA, Consumer Behavior, Data Cleaning, Data Gathering, Product Consultant, Marketing, Digital Advertising, Strategy

Motor Claim Part Price Prediction Model

A machine learning regression model to estimate the price of motor spare parts for the claim assessor, accelerate claim processing, and systematize the part price assessment procedure. Collaborating with business users and the AWS team, I started from a user persona, process understanding, and a brainstorming session. The team devised a solution that used machine learning with a web app running on AWS architecture to recommend predicted prices for spare parts across different car makes and models. The final model was an ensemble of gradient boosted tree (GBT) regressors. This initiative was expected to improve customer satisfaction, as it also shortened the claim processing time and reduced the impact of inflation on spare part costs.
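As an illustration of the gradient boosting idea behind this kind of regressor, here is a minimal sketch that boosts one-split trees (stumps) on a single feature under squared-error loss. It is a simplified stand-in, not the production model: the feature (vehicle age), the prices, and all function names are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) for the best single split."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_rounds=100, learning_rate=0.3):
    """Fit each new stump to the current residuals and add a shrunken copy of it."""
    base = mean(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (lv if x <= t else rv) for t, lv, rv in stumps)


def mean_squared_error(model, xs, ys):
    return mean((predict(model, x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data: vehicle age in years vs. spare part price.
part_ages = [1, 2, 3, 4, 5, 6, 7, 8]
part_prices = [900.0, 850.0, 800.0, 700.0, 600.0, 550.0, 500.0, 480.0]

model = fit_boosted_stumps(part_ages, part_prices)
print(f"predicted price for a 3-year-old vehicle: {predict(model, 3):.0f}")
```

In practice, a library implementation such as XGBoost or scikit-learn would handle multiple features, deeper trees, and regularization.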

Customer Retention Model for Insurance Policies

A classification machine learning model that predicts which policies will likely be discontinued in the upcoming renewal cycle. Regional retail general insurance has a very competitive landscape, so it is imperative to monitor the churn rate and retain existing customers wherever possible. The development team collaborated with business users to understand the essential features that would be beneficial for churn prediction. As part of the data science team, I used past years' data to support the claims. The team built a random forest (RF) model to help end users retain the fence-sitters, i.e., indifferent customers and likely-to-churn customers, with incentives or targeted promotions.
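One way to turn a classifier's churn probabilities into retention segments is sketched below. It assumes the model outputs a churn probability per policy. The thresholds, segment names, and policy IDs are hypothetical and not taken from the project.

```python
def segment_by_churn_probability(probabilities, low=0.35, high=0.65):
    """Group policy IDs into retention segments by predicted churn probability.

    `low` and `high` are hypothetical cut-offs; real values would be tuned
    with business users against retention budgets.
    """
    segments = {"likely_to_stay": [], "fence_sitter": [], "likely_to_churn": []}
    for policy_id, probability in probabilities.items():
        if probability < low:
            segments["likely_to_stay"].append(policy_id)
        elif probability <= high:
            segments["fence_sitter"].append(policy_id)
        else:
            segments["likely_to_churn"].append(policy_id)
    return segments


# Hypothetical model outputs for three policies.
scores = {"P001": 0.10, "P002": 0.50, "P003": 0.90}
segments = segment_by_churn_probability(scores)
print(segments)
```

Keeping the cut-offs as parameters lets the business widen or narrow the fence-sitter band without retraining the model.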

Natural Language Processing and Modeling

The project was designed to predict named entities, document and entity sentiment, locality, and emotion with higher accuracy. In 2019-2020, BERT was the state of the art (SoTA) for NLP. Still, it did not give satisfactory results for ASEAN languages (e.g., Indonesian, ASEAN Chinese, and Malay) because the underlying models were trained mostly on US corpora. The team aimed to beat the SoTA for ASEAN languages. The team of 10 delivered the work end to end for our client, from data collection, curation, annotation, QA, and data cleaning to model training, evaluation, and testing. I owned one of the language pillars while also helping other language pillars with context understanding for regional languages. The team ran multiple ML experiments and found that the transformer model worked best for a few language corpora and a mixture of them, and therefore applied the large embedding to multiple language models. As the end deliverable, the team containerized the models and Python pipeline using Docker. The models were able to beat SoTA with an F1 score of around 0.7.
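For readers unfamiliar with the F1 metric cited above, here is a minimal sketch of a macro-averaged F1 score over sentiment labels. The profile does not say which F1 variant was reported, so macro averaging is an assumption, and the labels and predictions are hypothetical.

```python
def f1_per_class(y_true, y_pred, label):
    """F1 for one label, computed as 2*TP / (2*TP + FP + FN)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denominator = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denominator if denominator else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical entity-sentiment labels and model predictions.
gold = ["pos", "neg", "neu", "pos", "neg"]
predicted = ["pos", "neg", "pos", "pos", "neu"]
score = macro_f1(gold, predicted)
print(f"macro F1: {score:.3f}")
```

Macro averaging weights every class equally, which matters when one sentiment class dominates the corpus.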

Employee Retention Prediction

A regional outsourcing and shared services company experienced high employee turnover in the past year and wished to reduce attrition, especially among high performers. A team of five conducted qualitative and quantitative analytics to understand the employee pain points and proposed a solution to the client. I acted as a data scientist, conducting exploratory analytics on employee attendance and leave, appraisal and performance, and bonus and pay, along with keyword extraction from interview sessions, using Python, R, and Power BI. The insights were then matched with attrition data to predict attrition likelihood with a classification algorithm and to summarize the attrition factors (clustering and PCA) so that the client could prioritize and act on high-impact factors.

Education

2014 - 2016

Master's Degree in Statistics

University of Malaya - Kuala Lumpur, Malaysia

2008 - 2011

Bachelor's Degree in Mathematics

University of Science, Malaysia - Penang, Malaysia

Certifications

MARCH 2023 - MARCH 2026

AWS Certified Machine Learning

AWS

MAY 2022 - PRESENT

Enterprise Design Thinking Co-creator

IBM

DECEMBER 2021 - JANUARY 2024

Power Platform Solution Architect Expert

Microsoft

SEPTEMBER 2021 - SEPTEMBER 2023

Power Platform Functional Consultant Associate

Microsoft

JULY 2021 - JULY 2023

Google Cloud Certified Professional – Machine Learning Engineer

Google Cloud

JULY 2021 - JULY 2023

GCP Professional Cloud Architect

GCP

JANUARY 2021 - JANUARY 2024

Azure AI Engineer Associate

Microsoft

DECEMBER 2020 - DECEMBER 2023

Azure Data Science Associate

Microsoft

AUGUST 2020 - AUGUST 2023

Azure Solution Architect Expert

Microsoft

NOVEMBER 2019 - PRESENT

Professional Scrum Master I

Scrum.org

Libraries/APIs

Pandas, NumPy, Keras, XGBoost, Scikit-learn, Spark ML, PyTorch, TensorFlow

Tools

Microsoft Power BI, Microsoft Flow, Tableau, Git, Amazon SageMaker, Spreadsheets, Azure Machine Learning, AWS Glue, GitHub, Spark SQL, Microsoft Power Apps, Named-entity Recognition (NER), Amazon QuickSight, AWS Fargate, Apache Airflow, Excel 2010, Bitbucket, Jira

Languages

Python 3, Python, SQL, R

Paradigms

Agile Project Management, Data Science, Business Intelligence (BI), Design Thinking, Agile, ETL, Unit Testing

Platforms

Azure, Jupyter Notebook, Microsoft Power Automate, SharePoint 365, Amazon Web Services (AWS), Linux, Amazon EC2, Microsoft Power Platform, Docker, Ubuntu, RStudio, SharePoint, Google Cloud Platform (GCP), Visual Studio Code (VS Code), Heroku

Storage

PostgreSQL, Databases, Amazon S3 (AWS S3), Redshift, Data Pipelines, MongoDB, Microsoft SQL Server

Industry Expertise

Project Management, Marketing

Frameworks

Flutter, .NET 3

Other

Ubuntu 16.04, Machine Learning, Data Analytics, Analytics, Data Wrangling, Jupyter, Data Scientist, Modeling, Exploratory Data Analysis, EDA, Statistics, SAP Business Intelligence (BI), IT Consulting, Natural Language Processing (NLP), Data Analysis, Data Visualization, Amazon Machine Learning, Business Analysis, Dashboards, Big Data, A/B Testing, Product Development, Artificial Neural Networks (ANN), Regression, Deep Learning, Classification, Neural Networks, Reports, Heatmaps, Data Reporting, Cloud, Text Classification, Programming, Consumer Behavior, Data Cleaning, Large Data Sets, Unstructured Data Analysis, Data Gathering, APIs, BERT, Custom BERT, Software Architecture, Azure Databricks, Scrum Master, Artificial Intelligence (AI), Solution Design, Solution Architecture, IT Project Management, Communication, Root Cause Analysis, Sentiment Analysis, Classification Algorithms, Principal Component Analysis (PCA), Multivariate Statistical Modeling, Brandwatch, Data Extraction, Cost Reduction & Optimization (Cost-down), Excel 365, Data Engineering, Language Models, Machine Learning Operations (MLOps), Google Cloud ML, Text Analytics, Web Scraping, Data Scraping, Generative Pre-trained Transformers (GPT), Learning Management Systems (LMS), IBM Cloud, BI Reporting, Integration, User Interface (UI), Large Language Models (LLMs), OpenAI GPT-4 API, FastAPI, Data Migration, Product Consultant, Digital Advertising, Strategy, OpenAI, Transformer Models, Social Listening, Education Technology (Edtech)
