Sidney Park, Developer in Auckland, New Zealand

Sidney Park

Verified Expert in Engineering

BI and Data Engineer and Developer

Location
Auckland, New Zealand
Toptal Member Since
February 12, 2022

Sidney is an experienced business intelligence and data engineer. His expertise includes BI dimensional modeling, data pipelines, dashboards, reports, and API data integration. He has served as a team Scrum Master, led migration and upgrade projects, mentored team members, and was recognized as the top employee for his ability to build excellent client relationships.

Portfolio

Cognition360 Inc.
SQL, Microsoft SQL Server, Data Engineering, T-SQL (Transact-SQL)...
Greater Western Water
Azure, Azure Data Factory, Azure Databricks, Azure Data Lake, Delta Lake...
Analytical Technologies Group LLC
Microsoft Power BI, ServiceLedger, SQL, Intuit QuickBooks, Python...

Experience

Availability

Part-time

Preferred Environment

SQL, Python, Amazon Web Services (AWS), Azure, Microsoft Power BI, Tableau

The most amazing...

...thing I have done is build an end-to-end data solution, from requirements gathering and data modeling to developing data pipelines and business reports.

Work Experience

Data Engineer

2023 - 2023
Cognition360 Inc.
  • Developed stored procedures with CDC ETL processes, migrating from the legacy data curation process.
  • Analyzed the legacy process and identified gaps and improvements to simplify and optimize the complicated stored procedures.
  • Developed a template to dynamically generate scripts, saving development time (a sketch follows the technology list below).
Technologies: SQL, Microsoft SQL Server, Data Engineering, T-SQL (Transact-SQL), Data Warehousing, Azure SQL, Azure Data Factory, Azure
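
As a flavor of that dynamic script generation, here is a minimal Python sketch that renders one MERGE statement per table from shared metadata; the table names, keys, and template are hypothetical, not Cognition360's actual code.

    # Minimal metadata-driven script generation; the table names,
    # keys, and MERGE template are hypothetical placeholders.
    MERGE_TEMPLATE = """
    MERGE INTO dw.{table} AS tgt
    USING stg.{table} AS src
        ON tgt.{key} = src.{key}
    WHEN MATCHED THEN UPDATE SET tgt.payload = src.payload
    WHEN NOT MATCHED THEN INSERT ({key}, payload) VALUES (src.{key}, src.payload);
    """

    TABLES = [  # hypothetical metadata describing each target table
        {"table": "dim_customer", "key": "customer_id"},
        {"table": "fact_ticket", "key": "ticket_id"},
    ]

    def generate_scripts() -> list:
        """Render one MERGE statement per table from the shared template."""
        return [MERGE_TEMPLATE.format(**meta) for meta in TABLES]

    if __name__ == "__main__":
        for script in generate_scripts():
            print(script)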

Senior Data Engineer

2022 - 2023
Greater Western Water
  • Developed a data quality check framework written in PySpark and reusable across different data pipelines (a sketch follows the technology list below).
  • Built a prototype of a data quality check pipeline using Great Expectations to demonstrate a use case for the team.
  • Designed a data flow architecture for a data quality framework and handled data storage and table modeling.
  • Analyzed the company's source data for different business units, built a data model, designed data integration logic, and developed a data pipeline to consolidate data according to business rules.
Technologies: Azure, Azure Data Factory, Azure Databricks, Azure Data Lake, Delta Lake, Databricks, Python, PySpark, Spark, SQL, Azure SQL, Azure SQL Databases, Azure DevOps, CI/CD Pipelines, Data Quality Management, Data Quality Analysis, Data Quality, Business Analysis, Data Lakes, Apache Spark, Data Pipelines, Data Architecture, Technical Architecture, Agile, Data Engineering, ETL Tools, Data Modeling
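
A minimal sketch of a reusable data quality check in the spirit of that framework, assuming rules are declared as named column predicates; the column names and rules are hypothetical, and the actual prototype used Great Expectations.

    # Reusable PySpark data quality checks; column names are hypothetical.
    from pyspark.sql import DataFrame, SparkSession
    import pyspark.sql.functions as F

    def run_checks(df: DataFrame, checks: dict) -> dict:
        """Return the number of rows violating each named predicate."""
        return {name: df.filter(~predicate).count() for name, predicate in checks.items()}

    if __name__ == "__main__":
        spark = SparkSession.builder.appName("dq-demo").getOrCreate()
        df = spark.createDataFrame([(1, "a@example.com"), (2, None)], ["id", "email"])
        checks = {
            "email_not_null": F.col("email").isNotNull(),
            "id_positive": F.col("id") > 0,
        }
        print(run_checks(df, checks))  # {'email_not_null': 1, 'id_positive': 0}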

Data and Reporting Engineer

2022 - 2022
Analytical Technologies Group LLC
  • Analyzed the company's accounting and operational data to determine how it could support custom reporting requirements beyond the applications' built-in reporting capabilities.
  • Developed a data extract process for the company's accounting data from the QuickBooks Desktop database and the QuickBooks Online API, structuring raw data into formats suited to Power BI reporting (a sketch follows the technology list below).
  • Developed Power BI reports with complex DAX according to the reporting specifications provided, while continuously suggesting report design improvements based on knowledge gained from data analysis.
Technologies: Microsoft Power BI, ServiceLedger, SQL, Intuit QuickBooks, Python, QuickBooks API, QuickBooks Online, Pandas, DAX, Business Analysis, Reports, Communication, API Integration, ODBC, Power Query, Power Query M, Data Architecture, Data Pipelines, Data Modeling, Analytical Dashboards
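
A hedged sketch of the QuickBooks Online extraction step: query one entity through the QBO query endpoint and flatten the nested JSON with Pandas. The realm ID, token, and entity are placeholders, and OAuth token refresh and pagination are omitted.

    # Pull one QuickBooks Online entity into a Pandas DataFrame.
    import pandas as pd
    import requests

    BASE = "https://quickbooks.api.intuit.com/v3/company/{realm_id}/query"

    def fetch_entity(realm_id: str, token: str, entity: str = "Invoice") -> pd.DataFrame:
        """Query one QBO entity and flatten the JSON for reporting-friendly output."""
        resp = requests.get(
            BASE.format(realm_id=realm_id),
            params={"query": f"select * from {entity}"},
            headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
            timeout=30,
        )
        resp.raise_for_status()
        rows = resp.json().get("QueryResponse", {}).get(entity, [])
        return pd.json_normalize(rows)  # nested fields become dotted columns

    # df = fetch_entity("1234567890", "<oauth2-access-token>")  # placeholders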

Data Engineer

2022 - 2022
BFB Pty Limited
  • Developed a Python script and SQL Server stored procedures to scan and pull JSON blobs from Azure Blob Storage and load the JSON files into SQL Server tables, including designing the table schemas and analyzing the JSON data (a sketch follows the technology list below).
  • Designed the data flow for frequently ingesting webhook JSON payloads, leveraging existing technologies and the Azure Python SDK.
  • Documented the solution design and provided instructions for future changes for the business users.
Technologies: SQL, Azure Blob Storage API, JSON, Python, Microsoft SQL Server, SQL Stored Procedures, GitHub, Data Lakes, Azure Data Lake, Technical Architecture, Data Modeling
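
A minimal sketch of the blob-scanning step using the Azure Blob Storage SDK for Python; the connection string, container, and prefix are placeholders, and the SQL Server load itself is left to the stored procedures described above.

    # Scan a container for JSON blobs and yield parsed payloads.
    import json

    from azure.storage.blob import BlobServiceClient

    def pull_json_blobs(conn_str: str, container: str, prefix: str = "webhooks/"):
        """Yield (blob name, parsed JSON) for every blob under the prefix."""
        service = BlobServiceClient.from_connection_string(conn_str)
        client = service.get_container_client(container)
        for blob in client.list_blobs(name_starts_with=prefix):
            payload = client.download_blob(blob.name).readall()
            yield blob.name, json.loads(payload)

    # for name, doc in pull_json_blobs("<connection-string>", "landing"):
    #     print(name, list(doc))  # rows would then be passed to the stored procedures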

Data Engineer

2021 - 2022
Loyalty NZ
  • Developed a data pipeline to handle complex consumer transaction and loyalty data from SFTP and API data feeds in structured and semi-structured formats using PySpark, Amazon Athena, SQL, Python, Airflow, Amazon S3, and AWS Glue (an orchestration sketch follows the technology list below).
  • Built a data pipeline to ingest real-estate market data, apply transformations and address clean-up, then present the data for use in marketing campaigns using Amazon Athena, SQL, Python, Airflow, Amazon S3, and AWS Glue.
  • Facilitated the team's Agile ceremonies as its Scrum Master, collected feedback constantly, and initiated changes to Agile practices so they worked better for the team.
  • Assisted business users with source API changes by testing, identifying, and assessing the impact on the organization's data platform.
Technologies: Amazon S3 (AWS S3), AWS Glue, Apache Hive, PySpark, Apache Airflow, SQL, Python, JSON, GitHub, Scrum Master, Data Engineering, Data Analysis, Amazon Athena, Spark, Amazon Web Services (AWS), Business Analysis, Data Lakes, Data Warehousing, Data Warehouse Design, Apache Spark, Data Architecture, Data Pipelines, Agile, ETL Tools, Data Modeling
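
A hedged sketch of how such a pipeline could be orchestrated as an Airflow DAG; the DAG ID, schedule, and task bodies are assumptions rather than Loyalty NZ's actual code.

    # Three-stage daily pipeline: ingest feeds, transform, present for reporting.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def ingest():     # stand-in for pulling SFTP/API feeds into S3
        ...

    def transform():  # stand-in for the PySpark/Glue business-rule transforms
        ...

    def present():    # stand-in for refreshing Athena tables for reporting
        ...

    with DAG(
        dag_id="transactions_pipeline",  # Airflow 2.x style definition
        start_date=datetime(2022, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t1 = PythonOperator(task_id="ingest", python_callable=ingest)
        t2 = PythonOperator(task_id="transform", python_callable=transform)
        t3 = PythonOperator(task_id="present", python_callable=present)
        t1 >> t2 >> t3  # run the stages in order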

Business Intelligence and Data Engineer

2021 - 2021
EROAD
  • Developed a metadata-driven ETL framework and data pipeline to extract data from D365 BYOD, Salesforce, Amazon S3, and other databases, then transform and load it into the EDW following the Kimball dimensional modeling methodology, using Matillion and Snowflake (a sketch follows the technology list below).
  • Designed the Power BI workspace architecture, considering access and security by business department and the different data use cases for reporting, and developed Power BI datasets and reports per the agreed reporting requirements.
  • Confirmed data and reporting requirements with different stakeholders, ensured data was extracted from the correct sources with the right access, and arranged, transformed, and presented it to meet end users' expectations.
Technologies: Snowflake, ETL, Microsoft Dynamics 365, Salesforce, Azure SQL Databases, GitHub, SQL, Data Engineering, Data Analysis, Microsoft Power BI, Python, Amazon Web Services (AWS), Amazon S3 (AWS S3), Dashboards, DAX, Business Analysis, Data Warehousing, Data Warehouse Design, Geospatial Data, Reports, Microsoft Power Automate, API Integration, Power Query, Power Query M, Data Pipelines, Technical Architecture, Data Architecture, Monitoring, ETL Tools, Data Modeling, Analytical Dashboards
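
A minimal sketch of the metadata-driven idea against Snowflake with the Python connector; the table names, credentials, and one-statement-per-entry design are assumptions (the actual framework was built in Matillion).

    # Metadata-driven warehouse load: one INSERT ... SELECT per metadata entry.
    import snowflake.connector

    LOADS = [  # hypothetical metadata: one entry per warehouse table
        {"target": "EDW.DIM_CUSTOMER", "source": "STG.SALESFORCE_ACCOUNT"},
        {"target": "EDW.FACT_SALES", "source": "STG.D365_SALESLINE"},
    ]

    def run_loads(conn) -> None:
        """Execute each load statement generated from the metadata."""
        cur = conn.cursor()
        for load in LOADS:
            cur.execute(f"INSERT INTO {load['target']} SELECT * FROM {load['source']}")

    if __name__ == "__main__":
        conn = snowflake.connector.connect(
            user="<user>", password="<password>", account="<account>",
            warehouse="<wh>", database="<db>",  # placeholders
        )
        try:
            run_loads(conn)
        finally:
            conn.close()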

BI Consultant of Managed Services

2014 - 2021
Altis Consulting
  • Awarded and recognized as the best employee for contributions to excellent client relationships in 2017.
  • Led upgrade and migration projects with clients and contributed to their success by resolving unexpected issues within the given timeframes, including a SQL Server migration to Always On availability groups and both cloud-to-on-premises and on-premises-to-cloud migrations.
  • Automated numerous processes, including manual monitoring and reporting tasks, e.g., invested 1.5 days of effort to develop a script that eliminated manual steps costing a finance user two days per month (a sketch follows the technology list below).
  • Enhanced data pipelines and significantly improved batch ETL performance for clients' BI environments, e.g., reduced an overnight ETL run from 10+ hours to 5-6 hours to meet one client's data availability SLA.
  • Onboarded numerous new clients with various technologies and tools and supported their environments by self-training in new technologies within given timeframes.
  • Mentored other team members as a senior-level consultant, suggesting guidelines for issues they could not easily resolve.
  • Developed data pipelines to onboard newly required data, apply transformations, and present data for reporting following Kimball dimensional modeling in clients' BI environments.
  • Created and enhanced complex finance and HR reports for clients using different reporting tools, such as Power BI, SSRS, and Tableau.
Technologies: SQL, Microsoft BI Stack, Microsoft Power BI, Tableau, Azure, Azure Data Factory, Azure Data Lake, SQL Server BI, Business Intelligence (BI), Star Schema, IBM Cognos, MicroStrategy, Oracle, SAP, SAP Business Intelligence (BI), Data Engineering, Dashboards, DAX, SQL Server DBA, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), Microsoft Excel, Microsoft SQL Server, Business Analysis, Data Warehousing, Data Warehouse Design, Excel 365, Reports, Windows Server 2016, Communication, ODBC, Power Query, Power Query M, Monitoring, Data Auditing, ETL Tools, Data Modeling, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Analytical Dashboards
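
As a flavor of that monitoring automation, here is a hedged Python sketch that queries a hypothetical ETL log table and emails a summary via SendGrid; every table, column, address, and key is a placeholder.

    # Replace a manual morning check: query failures, email a summary.
    import pyodbc
    from sendgrid import SendGridAPIClient
    from sendgrid.helpers.mail import Mail

    def failed_jobs(conn_str: str) -> list:
        """Return today's failed ETL jobs from a hypothetical log table."""
        with pyodbc.connect(conn_str) as conn:
            cur = conn.cursor()
            cur.execute(
                "SELECT job_name FROM etl.job_log "
                "WHERE status = 'Failed' AND run_date = CAST(GETDATE() AS date)"
            )
            return [row.job_name for row in cur.fetchall()]

    def send_report(api_key: str, jobs: list) -> None:
        """Email the failure summary instead of checking by hand."""
        body = "<br>".join(jobs) or "All jobs succeeded."
        message = Mail(
            from_email="etl-monitor@example.com",  # placeholder addresses
            to_emails="bi-team@example.com",
            subject="Overnight ETL status",
            html_content=body,
        )
        SendGridAPIClient(api_key).send(message)

    # send_report("<sendgrid-api-key>", failed_jobs("<odbc-connection-string>"))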

Projects

EROAD Business Intelligence

With the global expansion of the business, the company needed to build a business intelligence system to create a single-source-of-truth data repository for enterprise reporting and to enable self-service reporting for users.

I developed the data pipeline to ingest data from different sources and loaded the data into star-schema tables in the data warehouse, following the Kimball dimensional modeling methodology.

I created the Power BI workspace environments, developed datasets with all the tables required for user self-service reporting by defining table relationships, and built dashboards for monitoring sales pipelines.

The technologies used were Snowflake, Power BI, SQL, Python, Matillion, Azure DB, and AWS.

Luigi Data Pipeline

A metadata-driven data pipeline practice using Luigi in Python. Python makes it easy to dynamically generate the update scripts that load source data into the data store tables. For this practice, the data sources are a MySQL database and a weblog file, loaded into Redshift. Scripts demonstrating the data warehouse presentation layer in dimensional modeling are included, but they have not been incorporated into the Luigi data pipeline due to time constraints.
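
To illustrate the pattern, here is a minimal Luigi sketch; the table list, file paths, and task bodies are hypothetical stand-ins for the MySQL extract and Redshift load.

    # Metadata-driven Luigi pipeline: one extract task per table, then a load.
    import luigi

    TABLES = ["customer", "orders"]  # hypothetical source tables

    class ExtractTable(luigi.Task):
        table = luigi.Parameter()

        def output(self):
            return luigi.LocalTarget(f"staging/{self.table}.csv")

        def run(self):
            with self.output().open("w") as f:  # stand-in for the MySQL extract
                f.write(f"-- extracted rows for {self.table}\n")

    class LoadWarehouse(luigi.Task):
        """Depends on every extract; a stand-in for the Redshift load step."""

        def requires(self):
            return [ExtractTable(table=t) for t in TABLES]

        def output(self):
            return luigi.LocalTarget("staging/_load_complete")

        def run(self):
            with self.output().open("w") as f:
                f.write("loaded\n")

    if __name__ == "__main__":
        luigi.build([LoadWarehouse()], local_scheduler=True)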

Twitter Tweet Streaming Using Kafka in Python

https://github.com/sidneypark22/apache-kafka-twitter-streaming/blob/main/README.md
A small personal project demonstrating how to stream Twitter tweets using Kafka in Python. The producer sends a Twitter API request to retrieve tweets matching a particular keyword. When a response is received, it records the last tweet ID and transforms the data into an easier-to-read format. The producer then publishes the data to a topic from which the consumer reads in real time. The consumer uploads the received data to S3 as files, including the last tweet ID, for further processing. The last tweet ID acts as a watermark so the producer retrieves only incremental tweets on the next run.
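
To illustrate the producer side, here is a hedged sketch using the kafka-python client; the broker address, topic name, and tweet-fetch stub are assumptions, not the repository's actual code.

    # Fetch tweets since the last watermark and publish them to a Kafka topic.
    import json

    from kafka import KafkaProducer

    def fetch_tweets(keyword: str, since_id: int) -> list:
        """Stand-in for the Twitter API search call described above."""
        return [{"id": since_id + 1, "text": f"demo tweet about {keyword}"}]

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",  # placeholder broker
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    last_id = 0  # watermark: highest tweet ID processed so far
    for tweet in fetch_tweets("data engineering", last_id):
        producer.send("tweets", value=tweet)
        last_id = max(last_id, tweet["id"])  # the consumer persists this to S3
    producer.flush()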

Creating a Simple Amazon S3 Data Lake Using Glue Crawler From PostgreSQL Sample Database

https://medium.com/@spa0220/creating-simple-aws-s3-data-lake-using-glue-crawler-from-postgresql-dvdrental-on-macos-e2ecc1490e27
In this project, I created a simple Amazon S3 data lake from CSV files.

The CSV files are created by extracting data from the publicly available PostgreSQL DVD rental sample database.

A Python script streamlines the process of extracting data from the PostgreSQL database, saving the outputs to CSV files, and uploading them to the Amazon S3 bucket in a specific folder structure.

Glue Crawler then scans the files and creates table metadata in the Glue Data Catalog. Once the tables exist in the Glue Data Catalog, they can be queried with EMR or Athena; Athena is used in this case.
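
A minimal sketch of the upload-then-crawl flow with boto3; the bucket name, folder layout, and crawler name are placeholders.

    # Upload per-table CSVs, then trigger the Glue crawler that catalogs them.
    import boto3

    s3 = boto3.client("s3")
    glue = boto3.client("glue")

    def upload_table_csv(bucket: str, table: str, path: str) -> None:
        """One prefix per table so the crawler registers one catalog table per folder."""
        s3.upload_file(path, bucket, f"dvdrental/{table}/{table}.csv")

    def crawl(crawler_name: str) -> None:
        """Start the crawler that writes table metadata to the Glue Data Catalog."""
        glue.start_crawler(Name=crawler_name)

    # upload_table_csv("my-data-lake", "film", "exports/film.csv")
    # crawl("dvdrental-crawler")  # afterwards, query the tables via Athena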

Loyalty NZ Single Source of Truth for Transactions

Loyalty NZ had multiple streams for ingesting transaction data, which left transactions scattered across different places. I managed the project as the sole resource: gathering requirements, understanding the business logic, analyzing data, identifying improvement gaps, developing the data pipelines, and training the users. I delivered a single source of truth for all consumer transactions related to the company's loyalty program. The principal data scientist called the project the biggest improvement ever made to the company's transaction data. Throughout the project, I also served as Scrum Master.

Altis Consulting Business Intelligence

Developed and supported business intelligence ETL pipelines, reports, and dashboards and performed data analysis for clients in Australia, New Zealand, and the UK. I focused on Microsoft and Azure products but also covered other tools, such as Tableau.

Pacific National Business Intelligence

Pacific National is one of the largest freight logistics companies in Australia. I developed and supported their business intelligence system end to end for over four years. My job included developing, monitoring, debugging, and supporting ETL pipelines, Tableau servers, and reports, as well as data analysis, consulting with business users, and providing recommendations.

TransGrid Business Intelligence

Designed the data model for the company's finance-related source data, developed ETL to bring it into a data warehouse, and created business reports following complex reporting requirements. I made a range of improvements to the ETL pipeline that enhanced the data modeling design and significantly reduced the pipeline's duration to meet the SLA. I also implemented a SQL Server upgrade from SQL Server 2008 R2 to SQL Server 2016 in a high-availability environment.

Partners Life Claim Business Intelligence

Enhanced and expanded the client's business intelligence system built around their claims data model. I also monitored the system to meet the company's SLA for data availability and improved its performance.

Analytical Technologies Group Finance and Operations Reporting

The client requested customized business reports using finance and operations data from their operational tools, QuickBooks Online and ServiceLedger. My job included source data analysis, developing an extract process using the QuickBooks Online API in Python and T-SQL to extract data from Microsoft SQL Server, building datasets with appropriate data models in Power BI, and creating customized finance and operations business reports using complex calculations.

Languages

SQL, Python, Snowflake, T-SQL (Transact-SQL), Power Query M

Tools

SQL Server BI, Microsoft Power BI, Tableau, AWS Glue, Amazon Athena, Microsoft Excel, IBM Cognos, GitHub, Apache Airflow, Amazon Elastic MapReduce (EMR), SSAS, WhereScape RED, Power Query

Paradigms

Business Intelligence (BI), ETL, ETL Implementation & Design, API Architecture, Dimensional Modeling, Agile, Azure DevOps, Kimball Methodology

Platforms

Microsoft BI Stack, Amazon Web Services (AWS), Azure, Oracle, Salesforce, Apache Kafka, Databricks, Windows Server 2016, Microsoft Power Automate, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW)

Storage

SQL Server 2012, Data Pipelines, Microsoft SQL Server, SQL Server Integration Services (SSIS), Data Lakes, Databases, Azure SQL Databases, Amazon S3 (AWS S3), JSON, SQL Server 2008, Apache Hive, Redshift, PostgreSQL, SQL Stored Procedures, SQL Server DBA, SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), Azure SQL, Azure Blobs, Oracle 11g, SQL Server 2016, SSAS Tabular, MySQL

Other

Star Schema, Data Analysis, Data Visualization, ELT, Data Engineering, Reporting, Data Analytics, ETL Tools, Data Modeling, ETL Development, Microsoft Data Transformation Services (now SSIS), Data, Analytical Dashboards, Azure Data Factory, Azure Data Lake, DAX, Excel 365, CSV Import, CSV Export, Logistics, AWS Certified Solution Architect, Reports, Communication, Business Processes, Financial Accounting, Management Accounting, Finance, Enterprise Systems, MicroStrategy, Microsoft Dynamics 365, Scrum Master, SAP, SAP Business Intelligence (BI), APIs, Data Warehousing, Data Warehouse Design, BI Reporting, Amazon RDS, Dashboards, ServiceLedger, Intuit QuickBooks, QuickBooks Online, Azure Databricks, Delta Lake, CI/CD Pipelines, Data Quality Management, Data Quality Analysis, Data Quality, Tableau Server, Web Scraping, Business Analysis, SSRS Reports, WhereScape, Data Build Tool (dbt), Geospatial Data, Microsoft 365, API Integration, Data Architecture, Technical Architecture, Monitoring, Data Auditing, Cloud, Solution Architecture

Libraries/APIs

PySpark, Luigi, Azure Blob Storage API, QuickBooks API, Pandas, ODBC, SendGrid API

Frameworks

Spark, Flask, Apache Spark

Education

2006 - 2014

Bachelor's Degree in Accounting and Information Systems

University of Auckland - Auckland, New Zealand

Certifications

JUNE 2023 - PRESENT

Databricks Certified Data Engineer Professional

Databricks

MARCH 2023 - PRESENT

Microsoft Certified: Azure Solutions Architect Expert

Microsoft

MARCH 2023 - PRESENT

Microsoft Certified: Azure Administrator Associate

Microsoft

JUNE 2022 - PRESENT

AWS Solutions Architect – Associate

Amazon Web Services

JUNE 2022 - PRESENT

Databricks Certified Associate Developer for Apache Spark 3.0

Databricks

MARCH 2022 - PRESENT

AWS Certified Data Analytics - Specialty

Amazon Web Services

NOVEMBER 2020 - PRESENT

Microsoft Certified: Power BI Data Analyst Associate

Microsoft

AUGUST 2020 - PRESENT

Microsoft Certified: Azure Data Engineer Associate

Microsoft

APRIL 2020 - PRESENT

Azure Fundamentals

Microsoft

APRIL 2016 - PRESENT

Microsoft Certified Professional: MCSA SQL Server 2012/2014

Microsoft

NOVEMBER 2015 - PRESENT

Stephen Few: Visual Business Intelligence Workshop

Perceptual Edge

MARCH 2015 - PRESENT

Dimensional Modelling: The Kimball Method Workshop

Kimball Group
