Joslyn Lim, Developer in Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia

Joslyn Lim

Verified Expert in Engineering

Data Developer

Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia

Toptal member since September 28, 2022

Bio

Joslyn is a seasoned data practitioner with demonstrated experience across multiple industries, including technology consulting and customer service. With her academic background in applied statistics and a skillset in machine learning, data analytics, Python, and SQL, Joslyn has delivered numerous projects with positive business impacts on customers.

Portfolio

Dattel
Agile Development, Amazon Machine Learning, Amazon QuickSight, Flutter UI, .NET...
Kognitiv
Python, PostgreSQL, Tableau Development, SQL Server, Bitbucket, Jira, FastAPI...
Stop the Traffik
SageMaker, Python, PyTorch, Docker, IBM Cloud, MongoDB, Freelance Programming...

Availability

Part-time

Preferred Environment

Visual Studio Code (VS Code), Python 3, SQL

The most amazing...

...project I've worked on was building a data lake, data warehouse, dashboards, and four machine learning use cases within 12 months with over 20 team members.

Work Experience

Chief Technology Officer

2023 - PRESENT
Dattel
  • Led product development for an ad copy and ad creative generator with a team of five using GPT-3.5 and Stable Diffusion. The MVP was showcased at a regional marketing event, which drew around 100 uses in the first week.
  • Built the IT policy and data governance framework from scratch with GCR personnel to safeguard company assets from cyber threats.
  • Upgraded the development cycle for both the engineering and data teams to adopt CI/CD practices, saving approximately 20% of development time in the first stage.
Technologies: Agile Development, Amazon Machine Learning, Amazon QuickSight, Flutter UI, .NET, Software Architecture, Product Consultants, Marketing Design, Digital Advertising, Design Strategy, OpenAI

Data Science Manager

2023 - 2023
Kognitiv
  • Performed pre- and post-campaign analysis for both ATL and BTL campaigns using SQL, Python, and R to assess the campaigns' feasibility and effectiveness. The campaigns achieved an average uplift of 2-3% during non-festive seasons.
  • Led the analytics team in migration and integration testing after the data and machine learning model migration for regional clients using Python, SQL, and X-Ray, saving the effort of up to two FTEs (manual testers).
  • Advised both internal and external stakeholders on data analytics methodologies, campaign A/B testing, and technology architecture to achieve desirable outcomes.
Technologies: Python, PostgreSQL, Tableau Development, SQL Server, Bitbucket, Jira, FastAPI, Unit Testing, Data Migration, SQL, Software Architecture
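For readers unfamiliar with the uplift figure quoted in the campaign analysis above, the following is a minimal sketch of one common way to compute it. It assumes uplift means the relative difference in conversion rate between a targeted group and a control group; the profile does not state which definition was used, and the function name, parameters, and numbers are hypothetical.

```python
def relative_uplift(treated_conversions, treated_size,
                    control_conversions, control_size):
    """Relative uplift of the treated group's conversion rate over control.

    Assumption: uplift = (treated_rate - control_rate) / control_rate.
    Other definitions (e.g., absolute percentage-point difference) exist.
    """
    if treated_size <= 0 or control_size <= 0:
        raise ValueError("group sizes must be positive")
    treated_rate = treated_conversions / treated_size
    control_rate = control_conversions / control_size
    if control_rate == 0:
        raise ValueError("control conversion rate is zero; uplift is undefined")
    return (treated_rate - control_rate) / control_rate


# Hypothetical numbers, chosen only to illustrate the arithmetic.
uplift = relative_uplift(
    treated_conversions=1030, treated_size=10_000,
    control_conversions=1000, control_size=10_000,
)
print(f"Relative uplift: {uplift:.1%}")
```

With these made-up inputs, the treated group converts at 10.3% against 10.0% for control, which is a relative uplift of about 3%.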

Data Scientist

2023 - 2023
Stop the Traffik
  • Conducted a user experience discovery workshop to identify user pain points and needs.
  • Developed an entity sentiment model to predict article sentiments for business users using transfer learning.
  • Collaborated with the client and other Toptalers to bring the machine learning model into production using IBM Cloud.
Technologies: SageMaker, Python, PyTorch, Docker, IBM Cloud, MongoDB, Freelance Programming, Scikit-Learn, LLM, GPT-4, Modeling, Data Science, EDA, Unstructured Data Analysis, Spreadsheets, APIs, Amazon EC2

Data Scientist | Management Consulting Services

2023 - 2023
D3 Management LLC
  • Integrated third-party LMS data into a SharePoint list and PostgreSQL for insight generation using APIs, Power Automate, and Airflow.
  • Built ETL pipelines from on-premises core data to PostgreSQL using Power Automate.
  • Created Power BI dashboards for end users to shorten time to insight and support informed decisions.
Technologies: Business Intelligence Development, SharePoint Design, Apache Airflow, Microsoft Power Automate, Heroku, PostgreSQL, Learning Management Systems (LMS), BI Reporting, Integration, Freelance Programming, Data Cleaning, APIs, Amazon EC2, Software Architecture

Lead Analyst

2021 - 2022
AXA Group
  • Introduced analytics into the digital sales team to deploy a retention machine learning model for motor policy, digital sales dashboard, and policy benefit package optimization, which improved the annual gross premium with a lower loss ratio.
  • Led the implementation of MLOps practices using AWS to monitor, track, maintain, and improve machine learning models that were in production or being productionized, shortening the development cycle from six to three months with greater transparency and visibility.
  • Joined forces with the finance and actuarial team to implement International Financial Reporting Standard (IFRS) 17 using a data lake and reporting tools like SAP Webi to provide timely and granular insights to stakeholders.
  • Headed data literacy initiatives across the enterprise by curating learning programs, assessments, mentoring programs, hiring, retention, and various engagement activities that lifted the overall data literacy index from 33 to 50 (out of 100).
Technologies: Python, Redshift, SQL, AWS Glue, SAP Business Intelligence (BI), Database, GitHub, Microsoft Flow, Business Intelligence Development, SharePoint Development, Data Analysis, AWS, Amazon Machine Learning, Amazon S3, Virtual Coaching, Analytics Development, Business Analysis Consulting, ETL, Excel 365, SharePoint Design, Pandas, NumPy, Data Science, Jupiter, Machine Learning Operations (MLOps), SageMaker, Dashboard, Amazon QuickSight, Regression, Classification, Linux, XGBoost, Reports, Cloud Engineering, AWS Fargate, Excel Development, BI Reporting, Integration, Freelance Programming, Scikit-Learn, Modeling, EDA, Consumer Behavior, Data Cleaning, Large Data Sets, Data Gathering, Amazon EC2

Data Science Manager

2019 - 2021
EY
  • Led, managed, and won more than $1 million in data analytics projects with the regional DnA team.
  • Headed and launched multiple analytics projects, including attrition modeling, language modeling, optimization, and dashboards, with teams of between 2 and 10 people that delivered positive economic impact to society and organizations.
  • Was in charge of people management for the regional team and was responsible for establishing and sustaining learning programs, events, and mentorship to support the team's continuous personal and career growth.
Technologies: Python, Apache, Spark, Azure Databricks, Azure Design, Business Intelligence Development, Tableau Development, IT Consultant, Agile Development, Linux, R, Jupyter Notebook, Data Analysis, Data Visualization, Virtual Coaching, Analytics Development, Business Analysis Consulting, ETL, Pandas, NumPy, Data Science, Jupiter, Data Engineering, Web Scraping, Dashboard, Big Data Architecture, Database, Artificial Neural Networks (ANN), Regression, Deep Learning, Classification, Neural Network, Keras, XGBoost, Data Scraping, Reports, Heatmaps, Cloud Engineering, Text Classification, Excel Development, BI Reporting, Integration, Freelance Programming, UI Development, Scikit-Learn, Modeling, EDA, Data Cleaning, Large Data Sets, Unstructured Data Analysis, Data Gathering, Spreadsheets, Azure Machine Learning, TensorFlow, APIs, Amazon EC2, BERT, Custom BERT, Software Architecture, Design Strategy, LLM

Senior Associate – Data Science

2018 - 2019
EY
  • Developed a machine learning optimization model using ensemble models (XGBoost with Random Forests) for a refinery. The model was estimated to save up to $100,000 for regional factories.
  • Created a social listening dashboard using text analysis for a utility company to improve its digital reputation score (DRS) by enabling prompt responses to negative sentiment on social media platforms.
  • Built a face recognition and tracking MVP for intruder detection using transfer learning for a smart city prototype.
Technologies: Python, R, Brandwatch, Business Intelligence Development, RStudio, Data Analysis, Data Extraction, Analytics Development, Business Analysis Consulting, Pandas, NumPy, Data Science, Jupiter, Web Scraping, Dashboard, Artificial Neural Networks (ANN), Regression, Deep Learning, Classification, Neural Network, Keras, Linux, XGBoost, Data Scraping, Reports, Heatmaps, Cloud Engineering, Excel Development, BI Reporting, Freelance Programming, Scikit-Learn, Modeling, EDA, Data Cleaning, Data Gathering, Amazon EC2

Head of Research and Analytics

2016 - 2018
Dattel
  • Led a two-way function that expressed business strategy in analytics terms to key counterparts across lines of business and functional areas, such as software engineering and marketing, after the merger between True Vox Asia and Dattel.
  • Designed, managed, and delivered data experiments, collections, and analytics services on a consumer intelligence platform as part of the product offerings to small and medium-sized businesses.
  • Managed multiple projects hands-on, including customer segmentation, brand advocacy, psychometric profiling, and socioeconomic study publications, to enable strategy formation for consumer brands to acquire, engage, and retain their customers.
Technologies: Python, PostgreSQL, Agile Development, SQL, R, Git, Design Thinking, Analytics Development, Pandas, NumPy, Data Science, Jupiter, A/B Testing, Product Development, Regression, Linux, XGBoost, Heatmaps, Excel Development, Freelance Programming, Modeling, EDA, Consumer Behavior, Data Cleaning, Data Gathering, Product Consultants, Marketing Design, Digital Advertising, Design Strategy

Motor Claim Part Price Prediction Model

A machine learning regression model that estimates the price of motor spare parts for claim assessors, accelerating claim processing and systematizing the part price assessment procedure. Collaborating with business users and the AWS team, I started with user personas, process understanding, and a brainstorming session. The team devised a solution that used machine learning with a web app running on AWS architecture to recommend predicted prices for spare parts across different car makes and models. The final model was an ensemble of gradient-boosted tree (GBT) regressors. This initiative was expected to improve customer satisfaction, as it shortened the claim processing time and reduced the impact of inflation on spare part costs.

Customer Retention Model for Insurance Policies

A classification machine learning model that predicts which policies are likely to be discontinued in the upcoming renewal cycle. Regional retail general insurance is a very competitive landscape, so it is imperative to monitor the churn rate and retain existing customers. The development team collaborated with business users to understand which features would be most useful for churn prediction. As part of the data science team, I used past years' data to support their claims. The team built a random forest (RF) model to help end users retain the fence-sitters, i.e., indifferent customers and likely-to-churn customers, with incentives or targeted promotions.

Natural Language Processing and Modeling

The project was designed to predict named entities, document and entity sentiment, locality, and emotion with higher accuracy. In 2019-2020, BERT was state of the art (SoTA) for NLP. Still, it was not giving satisfying results for ASEAN languages (e.g., Indonesian, ASEAN Chinese, Malay, etc.), as the underlying models were trained mostly on US corpora. The team aimed to beat the SoTA for ASEAN languages. The team of 10 handled end-to-end delivery for our client, from data collection, curation, annotation, QA, and data cleaning to model training, evaluation, and testing. I was the owner of one of the language pillars while also helping other language pillars with context understanding for regional languages. The team ran multiple ML experiments and found that the transformer model worked best for a few language corpora and a mixture of them, hence applying the large embedding to multiple language models. As the end deliverable, the team containerized the models and Python pipeline using Docker. The models were able to beat SoTA with around a 0.7 F1 score.
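As a point of reference for the F1 figure above, here is a minimal sketch of how an F1 score is computed for a single class from true and predicted labels. It is not the project's evaluation code. The function name and the toy labels are hypothetical, and a multi-class NLP task would typically average per-class scores (the profile does not say which averaging was used).

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy sentiment labels, invented for illustration only.
y_true = ["pos", "neg", "pos", "pos", "neg"]
y_pred = ["pos", "pos", "pos", "neg", "neg"]
score = f1_for_class(y_true, y_pred, positive="pos")
print(round(score, 3))
```

In this toy example there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1 score is 2/3.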

Employee Retention Prediction

A regional outsourcing and shared service experienced high employee turnover in the past year and wished to reduce attrition, especially among high performers. A team of five conducted qualitative and quantitative analytics to understand the employee pain points and proposed a solution to the client. I acted as a data scientist, conducting exploratory analytics on employee attendance and leave, appraisal and performance, bonus and pay, and keywords extracted from interview sessions using Python, R, and Power BI. The insights were then matched with attrition data to predict attrition likelihood with a classification algorithm and to summarize the attrition factors (using clustering and PCA) so that the client could prioritize and act on high-impact attrition factors.

Education

2014 - 2016

Master's Degree in Statistics

University of Malaya - Kuala Lumpur, Malaysia

2008 - 2011

Bachelor's Degree in Mathematics

University of Science, Malaysia - Penang, Malaysia

Certifications

MARCH 2023 - MARCH 2026

AWS Certified Machine Learning

AWS

MAY 2022 - PRESENT

Enterprise Design Thinking Co-creator

IBM

DECEMBER 2021 - JANUARY 2024

Power Platform Solution Architect Expert

Microsoft

SEPTEMBER 2021 - SEPTEMBER 2023

Power Platform Functional Consultant Associate

Microsoft

JULY 2021 - JULY 2023

Google Cloud Certified Professional – Machine Learning Engineer

Google Cloud

JULY 2021 - JULY 2023

GCP Professional Cloud Architect

GCP

JANUARY 2021 - JANUARY 2024

Azure AI Engineer Associate

Microsoft

DECEMBER 2020 - DECEMBER 2023

Azure Data Science Associate

Microsoft

AUGUST 2020 - AUGUST 2023

Azure Solution Architect Expert

Microsoft

NOVEMBER 2019 - PRESENT

Professional Scrum Master I

Scrum.org

Libraries/APIs

Pandas, NumPy, Keras, XGBoost, Scikit-Learn, Apache, PyTorch, TensorFlow

Tools

Business Intelligence Development, Microsoft Flow, Tableau Development, Git, SageMaker, Spreadsheets, Azure Machine Learning, AWS Glue, GitHub, Spark, Microsoft Power Apps, Named-entity Recognition (NER), Amazon QuickSight, AWS Fargate, Apache Airflow, Excel Development, Bitbucket, Jira

Languages

Python, SQL, R

Paradigms

Agile Development, Business Intelligence Development, Design Thinking, ETL, Unit Testing

Platforms

Azure Design, Linux, Jupyter Notebook, Microsoft Power Automate, SharePoint Development, AWS, Amazon EC2, Microsoft Power Platform, Docker, Ubuntu, RStudio, SharePoint Design, Cloud Engineering, Visual Studio Development, Heroku

Storage

PostgreSQL, Database, Amazon S3, Redshift, MongoDB, SQL Server

Industry Expertise

Virtual Coaching, Marketing Design

Frameworks

Flutter UI, .NET

Other

Data Science, Machine Learning, Analytics Development, Jupiter, Modeling, EDA, Statistics, SAP Business Intelligence (BI), IT Consultant, NLP, Data Analysis, Data Visualization, Amazon Machine Learning, Business Analysis Consulting, Dashboard, Big Data Architecture, A/B Testing, Product Development, Artificial Neural Networks (ANN), Regression, Deep Learning, Classification, Neural Network, Reports, Heatmaps, Cloud Engineering, Text Classification, Freelance Programming, Consumer Behavior, Data Cleaning, Large Data Sets, Unstructured Data Analysis, Data Gathering, APIs, BERT, Custom BERT, Software Architecture, Azure Databricks, Scrum Master Consulting, Artificial Intelligence, Solution Design, Solution Architecture, IT Project Management, Communication Coaching, Root Cause Analysis, Sentiment Analysis, Classification Algorithms, Principal Component Analysis (PCA), Multivariate Statistical Modeling, Brandwatch, Data Extraction, Cost Reduction & Optimization (Cost-down), Excel 365, Data Engineering, Language Models, Machine Learning Operations (MLOps), Google Cloud ML, Text Analytics, Web Scraping, Data Scraping, Generative Pre-trained Transformers (GPT), Learning Management Systems (LMS), IBM Cloud, BI Reporting, Integration, UI Development, LLM, GPT-4, FastAPI, Data Migration, Product Consultants, Digital Advertising, Design Strategy, OpenAI, Transformer Models, Social Listening, Education Technology (Edtech)
