
Wenlong Dong
Database Developer
Wenlong is a senior data engineer with over half a decade of experience building data and ETL solutions primarily in SQL and Python. He has strong experience in building data pipelines and is familiar with various tools like DBT, Snowflake, Redshift, Python, Airflow, Power BI, Excel VBA, and PowerShell. Finally, he has led projects including fuzzy mapping in Python, end-to-end data pipeline with Dataiku and Anaplan, Salesforce data migration, and omnichannel models in DBT, Redshift, and Airflow.
Portfolio
Experience
SQL Server 2016 - 5 years
SQL Server Integration Services (SSIS) - 5 years
Visual Studio Code (VS Code) - 2 years
GitHub - 2 years
R - 2 years
Excel VBA - 2 years
Python 3 - 2 years
STATA - 2 years
Preferred Environment
PyCharm, Windows, SQL Server 2016, Visual Studio Code (VS Code), SQL Server Integration Services (SSIS)
The most amazing...
...project I've independently designed and completed is a complex medical data validation platform with built-in validation rules using Excel VBA.
Work Experience
Data Engineer
AstraZeneca
- Supported the analytics team for Microsoft Power BI reporting.
- Created a Power BI data flow and built report templates.
- Developed and maintained a Snowflake-based data warehouse via DBT.
- Administered the Snowflake data warehouse and helped data users troubleshoot issues.
- Built and maintained Apache Airflow schedules. Completed BAU and troubleshooting tasks.
Data Engineer
IBM
- Participated as the primary data engineer in a Salesforce data migration project using Python, SQL, and Salesforce APEX.
- Completed training and learning activities in Hadoop and MongoDB.
- Worked in an Agile team with a CI/CD development method implemented.
- Contributed as the primary data engineer for a data migration project with Python-based development.
Data Management Officer
University of New South Wales
- Designed and developed a complete data solution with STATA, including data cleansing modules, data validation, and generating statistical reports.
- Independently designed and developed a medical data collection and validation platform with Excel VBA.
- Built an R-based model for data cleansing and producing academic reports.
- Designed and developed SQL Server-based databases and relevant stored procedures.
- Built a Power BI dashboard on a SQL Server data source to analyze historical genetic test data, replacing multiple spreadsheets with interactive reports.
PowerShell Developer
Macquarie Bank
- Designed and built SSIS solutions to create an ETL pipeline between the central data warehouse and a financial analysis platform.
- Developed a file loading system and data processing jobs with Control-M job flows and PowerShell-based functions.
- Contributed to the data lake project with a Hive data warehouse.
Data Developer
CoreLogic AU
- Completed a major upgrade of the data warehouse and data loading pipeline to support enhanced business rules for Australian property data.
- Supported all BAU processes for the entire data team and the property data platform, including troubleshooting SQL agent jobs, AWS environments, and SSIS packages.
- Performed detailed analysis on geographic data items. Built a data loading and validation process for geographic data types in SQL Server.
- Created dynamic SQL processes to optimize SQL Server performance on large tables with more than one million records.
SyteLine and System Support Officer
Le Mac Australia Group
- Designed and maintained the Infor SyteLine ERP system.
- Designed Crystal Reports and wrote the relevant SQL Server stored procedures.
- Analyzed production cost data and performed calculations using SQL Server and Excel pivot tables.
Experience
Customer Fuzzy Matching Project in Python and Dataiku
Anaplan Data Integration
Salesforce Data Migration Project
I set up the primary Python framework and built the initial version of the data extraction process, from Salesforce into Python DataFrames. I created the complete solution for identifying and merging duplicate records. I designed and developed the parallel computing process for comparing large volumes of data, as well as the grouping logic based on graph theory. I also designed and built many SQL Server objects, including views, stored procedures, and functions.
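To illustrate the general idea of graph-based duplicate grouping described above, here is a minimal sketch. It is not the project's actual code: the function names, the 0.85 threshold, and the use of the standard library's `difflib` for string similarity are all assumptions made for illustration. Each record is treated as a graph node, an edge joins two records that look alike, and each connected component becomes one duplicate group. The parallel comparison step is left out for brevity.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two lightly normalized strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def group_duplicates(names: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Group names that are linked, directly or transitively, by similarity.

    Hypothetical helper: nodes are names, edges are pairs whose similarity
    meets the threshold, and groups are the graph's connected components.
    """
    # Union-find structure: parent[i] points toward the root of i's component.
    parent = list(range(len(names)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    # Compare every pair; this is the step a real pipeline would parallelize.
    for i, j in combinations(range(len(names)), 2):
        if similarity(names[i], names[j]) >= threshold:
            union(i, j)

    # Collect members by component root, keeping only groups with duplicates.
    groups: dict[int, list[str]] = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    records = ["Acme Pty Ltd", "ACME Pty Ltd", "Acme Pty Ltd.", "Globex Corporation"]
    print(group_duplicates(records))
```

Using connected components means that if A matches B and B matches C, all three land in one group even when A and C fall just below the threshold, which is the usual reason to reach for a graph model when merging duplicates.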
Excel VBA-based Medical Data Validation Platform
This platform has been accepted and used for the data collection process worldwide.
ETL Solution to Update Existing Real Estate Data
Skills
Languages
Python 3, Python, SQL, Excel VBA, T-SQL (Transact-SQL), Snowflake, SQL DML, Stored Procedure, Visual Basic for Applications (VBA), Visual Basic, R, SAS, Java, C, YAML, BIML, XML, C#
Libraries/APIs
Pandas, NetworkX
Tools
STATA, Microsoft Power BI, Jira, Confluence, Spreadsheets, Microsoft Excel, Excel 2010, Excel 2016, PyCharm, MATLAB, GitHub, Tableau, Apache Airflow, MySQL Workbench, Control-M, SourceTree, Crystal Reports
Paradigms
ETL, Business Intelligence (BI), Dimensional Modeling, Data Science, Agile, Unit Testing
Platforms
Visual Studio Code (VS Code), Amazon Web Services (AWS), Salesforce, Docker, Azure, Azure PaaS, Azure IaaS, Salesforce SOQL/SOSL, Linux, Windows Server 2016, Amazon EC2, Dataiku, Anaplan
Storage
SQL Server 2016, SQL Server Integration Services (SSIS), Databases, SQL Stored Procedures, SQL Server DBA, MySQL, Microsoft SQL Server, Database Administration (DBA), Database Modeling, Redshift, Data Pipelines, SQL Performance, PostgreSQL, Amazon S3 (AWS S3), Relational Databases, JSON, Database Performance, Azure SQL, Azure Blobs, MongoDB, DBeaver
Other
Data Engineering, Data Warehousing, Data Analysis, Data Cleaning, ETL Development, Data Modeling, ETL Testing, Schemas, Data Analytics, Analytics, Data Queries, Performance Tuning, CI/CD Pipelines, ETL Tools, Excel 365, BI Reporting, Data Transformation, Data Profiling, Dashboard Development, Data Cleansing, Information Gathering, Data Manipulation, Query Optimization, Data Warehouse Design, Statistics, Dashboards, Data Architecture, Reports, Reporting, Data Build Tool (dbt), Azure SQL Data Warehouse (SQL DW), Automated Data Flows, Business Intelligence (BI) Platforms, ELT, MRP, Knowledge Management, Minitab, Calculus, Linear Algebra, IBM Cloud, IT Service Management (ITSM), Web Scraping, SyteLine ERP, Pivot Tables, Multiprocessing, Data Visualization, Fuzzy Logic
Frameworks
Windows PowerShell
Education
Graduate Certificate in Health Data Science
University of New South Wales - Sydney, NSW, Australia
Master's Degree in Information Systems
The University of Melbourne - Melbourne, Victoria, Australia
Bachelor's Degree in Logistics and Supply Chain Management
Huazhong University of Science and Technology - Wuhan, Hubei, China
Certifications
Microsoft Certified: Azure Fundamentals
Microsoft
ITIL Foundation Certificate in IT Service Management
AXELOS