Wenlong Dong, Developer in Sydney, New South Wales, Australia

Wenlong Dong

Verified Expert in Engineering

Database Developer

Sydney, New South Wales, Australia

Toptal member since January 21, 2022

Bio

Wenlong is a senior data engineer with over five years of experience building data and ETL solutions, primarily in SQL and Python. He has extensive experience building data pipelines and works with tools including dbt, Snowflake, Redshift, Python, Airflow, Power BI, Excel VBA, and PowerShell. Wenlong has led projects including fuzzy matching in Python, an end-to-end data pipeline with Dataiku and Anaplan, a Salesforce data migration, and omnichannel models in dbt, Redshift, and Airflow.

Portfolio

AstraZeneca
Business Intelligence Development, Snowflake, Apache Airflow, Python, SQL, SQL...
IBM
Python, Salesforce Design, SQL, IBM Cloud, GitHub, Data Analysis...
University of New South Wales
STATA, R, Excel VBA, Data Analysis, Dashboard, SQL, Data Engineering...

Experience

Availability

Part-time

Preferred Environment

PyCharm, Windows, SQL Server 2016, Visual Studio Code (VS Code), SQL Server Integration Services (SSIS), Snowflake, Redshift, Python 3

The most amazing...

...project I've independently designed and completed is a complex medical data validation platform with built-in validation rules using Excel VBA.

Work Experience

Data Engineer

2022 - PRESENT
AstraZeneca
  • Supported the analytics team for Microsoft Power BI reporting.
  • Created a Power BI data flow and built report templates.
  • Developed and maintained a Snowflake-based data warehouse via dbt.
  • Administered the Snowflake data warehouse and helped data users troubleshoot issues.
  • Built and maintained Apache Airflow schedules. Completed BAU and troubleshooting tasks.
Technologies: Business Intelligence Development, Snowflake, Apache Airflow, Python, SQL, Dataiku, Data Visualization, Data Build Tool (dbt), Data Science, Data Analysis, Redshift, Analytics Development, Database, T-SQL, SQL DML, Performance Tuning, Automated Data Flows, AWS, CI/CD Pipelines, ETL Tools, Business Intelligence (BI) Platforms, Stored Procedure, JSON, PostgreSQL, Amazon S3, Excel Development, Excel 365, MySQL, ELT, BI Reporting, Data Transformation, Dashboard Development, Information Gathering, Relational Databases, Data Manipulation, Query Optimization, Data Warehouse, Microsoft Word, Windows Development

Data Engineer

2021 - 2022
IBM
  • Participated as the primary data engineer in a Salesforce data migration project using Python, SQL, and Salesforce Apex.
  • Completed training and learning activities in Hadoop and MongoDB.
  • Worked in an Agile team that followed a CI/CD development process.
  • Contributed as the primary data engineer for a data migration project with Python-based development.
Technologies: Python, Salesforce Design, SQL, IBM Cloud, GitHub, Data Analysis, Data Engineering, SQL Server, ETL, MongoDB, Database Administration (DBA), T-SQL, Docker, ETL Development, Data Warehouse, Data Architecture, Pandas, Data Modeling, ETL Testing, Database Modeling, Schemas, Excel Development, Data Science, Analytics Development, Database, SQL DML, Performance Tuning, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, CI/CD Pipelines, ETL Tools, Stored Procedure, PostgreSQL, Excel 365, MySQL, BI Reporting, Data Transformation, Information Gathering, Relational Databases, Data Manipulation, Query Optimization, macOS, Microsoft Word

Data Management Officer

2020 - 2021
University of New South Wales
  • Designed and developed a complete data solution with STATA, including data cleansing modules, data validation, and generating statistical reports.
  • Independently designed and developed a medical data collection and validation platform with Excel VBA.
  • Built an R-based model for data cleansing and producing academic reports.
  • Designed and developed SQL Server-based databases and relevant stored procedures.
  • Built a Power BI dashboard on a SQL Server data source to analyze historical genetic test data through interactive reports instead of multiple spreadsheets.
Technologies: STATA, R, Excel VBA, Data Analysis, Dashboard, SQL, Data Engineering, SQL Server, Database Administration (DBA), T-SQL, ETL Development, Data Science, Business Intelligence Development, Data Architecture, Pandas, Data Modeling, Database Modeling, Schemas, Reports, Reporting, Excel Development, Analytics Development, SQL DML, Database, Performance Tuning, ETL Tools, Business Intelligence (BI) Platforms, Stored Procedure, PostgreSQL, Excel 365, BI Reporting, Data Transformation, Dashboard Development, Information Gathering, Relational Databases, Data Manipulation, Query Optimization, Data Warehouse, Visual Basic, macOS, Microsoft Word, Windows Development

PowerShell Developer

2019 - 2020
Macquarie Bank
  • Designed and built SSIS solutions to create an ETL pipeline between the central data warehouse and a financial analysis platform.
  • Developed a file loading system and data processing jobs with Control-M job flows and PowerShell-based functions.
  • Contributed to the data lake project with a Hive data warehouse.
Technologies: Windows PowerShell, SQL Server, Control-M, SourceTree, Jira, SSIS, JSON, YAML, SQL, Data Engineering, ETL, T-SQL, ETL Development, Data Warehouse, Data Modeling, ETL Testing, Database Modeling, Schemas, Excel Development, Data Analysis, Analytics Development, Database, SQL DML, Performance Tuning, AWS, CI/CD Pipelines, ETL Tools, Stored Procedure, PostgreSQL, Amazon S3, Excel 365, ELT, Data Transformation, Data Science, Information Gathering, Relational Databases, Data Manipulation, Query Optimization, Visual Basic, Microsoft Word, Windows Development

Data Developer

2018 - 2019
CoreLogic AU
  • Completed a major upgrade of the data warehouse and data loading pipeline, driven by enhanced business rules for Australian property data.
  • Supported all BAU processes for the entire data team and the property data platform, including troubleshooting SQL agent jobs, AWS environments, and SSIS packages.
  • Performed detailed analysis on geographic data items. Built a data loading and validation process for geographic data types in SQL Server.
  • Created dynamic SQL processes to optimize SQL Server performance on large tables with more than one million records.
Technologies: SQL Server, BIML, XML, Jira, Confluence, Agile Development, Python, Unit Testing, SSIS, Data Analysis, Dashboard, SQL, Data Engineering, ETL, Tableau Development, T-SQL, ETL Development, Data Warehouse, Business Intelligence Development, Pandas, Data Modeling, ETL Testing, Database Modeling, Schemas, Reports, Reporting, Excel Development, Data Science, Analytics Development, Database, SQL DML, Performance Tuning, AWS, CI/CD Pipelines, ETL Tools, Stored Procedure, PostgreSQL, Amazon S3, Excel 365, ELT, Data Transformation, Information Gathering, Relational Databases, Data Manipulation, Query Optimization, Microsoft Word

SyteLine and System Support Officer

2017 - 2018
Le Mac Australia Group
  • Designed and maintained the Infor SyteLine ERP system.
  • Designed Crystal Reports and wrote the relevant SQL Server stored procedures.
  • Analyzed production cost data and performed calculations using SQL Server and Excel PivotTables.
Technologies: SQL Server, Crystal Reports, SyteLine ERP, C#, Pivot Tables, SQL, Database Administration (DBA), T-SQL, Database Modeling, Schemas, Excel Development, SQL DML, Database, Performance Tuning, Stored Procedure, PostgreSQL, Excel 365, Data Transformation, Data Science, Dashboard Development, Information Gathering, Relational Databases, Data Manipulation, Query Optimization, Microsoft Word, Windows Development

Customer Fuzzy Matching Project in Python and Dataiku

The project aimed to map customer data to government-published datasets using the limited fields available: names, occupations, and business addresses. The data sources included Redshift, CSV files, and XML files. The first phase was built exclusively in Python and mapped 60% of all customers. The second phase was built in Dataiku and mapped an additional 20%. I designed and built the solution.
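For illustration only, field-based fuzzy matching of this kind can be sketched in Python as below. This is not the project's code: the field weights, the 0.85 threshold, and the function names are hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for whatever similarity measure was actually used.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; a real project would tune these on labeled data.
WEIGHTS = {"name": 0.5, "occupation": 0.2, "address": 0.3}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not lower the score."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def record_score(customer: dict, reference: dict) -> float:
    """Weighted average of per-field similarities."""
    return sum(
        weight * similarity(customer[field], reference[field])
        for field, weight in WEIGHTS.items()
    )


def best_match(customer: dict, references: list, threshold: float = 0.85):
    """Return the highest-scoring reference record, or None if it scores below the threshold."""
    best = max(references, key=lambda ref: record_score(customer, ref), default=None)
    if best is None or record_score(customer, best) < threshold:
        return None
    return best
```

Comparing every customer against every reference record grows quadratically, so a sketch like this would normally be paired with a blocking step (for example, comparing only records that share a postcode) before scoring.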

Anaplan Data Integration

This project delivered a Redshift-based data model consisting of several tables and views of sales data, built with dbt. The data objects are refreshed daily or monthly in Airflow. As the project designer and builder, I wrote dbt macros that export the tables and views to an S3 bucket as CSV files. We also created Anaplan CloudWorks jobs to consume the CSV files on a regular schedule.
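As a rough illustration of the export step, the Python helper below builds a Redshift `UNLOAD` statement that writes a query result to S3 as a single CSV file with a header row. It is a stand-in written in Python rather than the dbt (Jinja) macros the project used, and the function name, table name, bucket, and IAM role shown are all hypothetical.

```python
def build_unload_sql(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift UNLOAD statement that exports `query` to S3 as CSV.

    All argument values are supplied by the caller; nothing here is taken
    from the original project.
    """
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query are doubled.
    escaped_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "HEADER\n"
        "PARALLEL OFF\n"  # write serially instead of one file per slice
        "ALLOWOVERWRITE;"
    )


if __name__ == "__main__":
    # Hypothetical relation, bucket, and role, for demonstration only.
    print(
        build_unload_sql(
            "SELECT * FROM analytics.monthly_sales WHERE region = 'APAC'",
            "s3://example-bucket/anaplan/monthly_sales_",
            "arn:aws:iam::123456789012:role/example-unload-role",
        )
    )
```

`PARALLEL OFF` is chosen here on the assumption that the downstream consumer expects one file per object; Redshift still splits very large outputs into multiple files.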

Salesforce Data Migration Project

Oversaw, as part of a team, the migration of Salesforce data from the source environment to the target environment. The client wished to separate part of its business into an independent Salesforce environment.

I set up the primary Python framework and built the initial version of the data extraction process, which loads Salesforce data into pandas DataFrames. I created the complete solution for identifying and merging duplicate records. I designed and developed the parallel computing process for comparing large volumes of data, as well as the grouping logic based on graph theory. I also designed and built many SQL Server objects, including views, stored procedures, and functions.
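One common way to express graph-based grouping of duplicates is to treat each record as a node and each detected duplicate pair as an edge, then take the connected components as merge groups. The sketch below shows that idea with a union-find structure. It is an assumption about what such logic could look like, not the project's implementation, and the function name and record IDs are hypothetical.

```python
from collections import defaultdict


def group_duplicates(pairs):
    """Group record IDs into clusters, given pairs flagged as duplicates.

    Each pair is an edge in an undirected graph; each returned group is one
    connected component, so A~B and B~C put A, B, and C in the same group.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in pairs:
        union(a, b)

    groups = defaultdict(set)
    for record_id in list(parent):
        groups[find(record_id)].add(record_id)

    # Sort for deterministic output.
    return sorted((sorted(group) for group in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    # Hypothetical record IDs.
    print(group_duplicates([("A", "B"), ("B", "C"), ("D", "E")]))
```

Because the pairwise comparisons are independent of each other, they can be computed in parallel and the resulting pairs fed to a single grouping pass like this one.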

Excel VBA-based Medical Data Validation Platform

I independently designed and built a medical data validation platform in Excel VBA. I implemented complex validation rules within the Excel modules so that users could validate their data fully and automatically in Excel.

This platform has been accepted and used for the data collection process worldwide.

ETL Solution to Update Existing Real Estate Data

This project adapted an existing property data ETL flow to meet new government requirements. I was one of the primary SQL Server and SSIS solution developers and completed approximately 50% of the development tasks.

Education

2020 - 2021

Graduate Certificate in Health Data Science

University of New South Wales - Sydney, NSW, Australia

2013 - 2014

Master's Degree in Information Systems

The University of Melbourne - Melbourne, Victoria, Australia

2007 - 2011

Bachelor's Degree in Logistics and Supply Chain Management

Huazhong University of Science and Technology - Wuhan, Hubei, China

Certifications

MARCH 2022 - PRESENT

Microsoft Certified: Azure Fundamentals

Microsoft

MARCH 2017 - PRESENT

ITIL Foundation Certificate in IT Service Management

AXELOS

Libraries/APIs

Pandas, Python

Tools

STATA, Business Intelligence Development, Jira, Confluence, Spreadsheets, Excel Development, Microsoft Word, PyCharm, MATLAB, GitHub, Tableau Development, Apache Airflow, MySQL, Control-M, SourceTree, Crystal Reports, CloudWorx

Languages

Python, SQL, Excel VBA, T-SQL, Snowflake, SQL DML, Stored Procedure, Visual Basic, R, SAS, Java, C, YAML, BIML, XML, C#

Paradigms

ETL, Business Intelligence Development, Dimensional Modeling, Agile Development, Unit Testing

Platforms

Visual Studio Development, macOS, Windows Development, AWS, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW), Salesforce Design, Docker, Azure Design, Azure PaaS, Azure, Salesforce Development, Linux, Amazon EC2, Dataiku, Anaplan

Storage

SQL Server, SSIS, Database, SQL, MySQL, Database Administration (DBA), Database Modeling, Redshift, PostgreSQL, Amazon S3, Relational Databases, JSON, Database Performance, Azure Blobs, MongoDB

Frameworks

Windows PowerShell

Other

Data Engineering, Data Warehouse, Data Analysis, Data Cleaning, ETL Development, Data Modeling, ETL Testing, Schemas, Data Science, Analytics Development, Database, Performance Tuning, CI/CD Pipelines, ETL Tools, Excel 365, BI Reporting, Data Transformation, Dashboard Development, Information Gathering, Data Manipulation, Query Optimization, Statistics, Dashboard, Data Architecture, Reports, Reporting, Data Build Tool (dbt), Automated Data Flows, Business Intelligence (BI) Platforms, ELT, Manufacturing Resource Planning (MRP), Knowledge Management, Minitab, Linear Algebra, IBM Cloud, IT Service Management (ITSM), Web Scraping, SyteLine ERP, Pivot Tables, Multiprocessing, Data Visualization, Fuzzy Logic
