Verified Expert in Engineering
BI and Data Engineer and Developer
Sidney is an experienced business intelligence and data engineer. His expertise includes BI dimensional modeling, data pipelines, dashboards, reports, and handling API-related data. He has served as team Scrum Master, led migration and upgrade projects, mentored team members, and been recognized as the top employee for his ability to build excellent client relationships.
SQL, Python, Amazon Web Services (AWS), Azure, Microsoft Power BI, Tableau
The most amazing...
...thing I have done is to build an end-to-end data solution, from requirement gathering and data modelling to developing data pipelines and business reports.
- Developed stored procedures with CDC ETL processes based on the legacy data curation process.
- Analyzed the legacy process, identified gaps and improvements, and simplified and optimized the complicated stored procedures.
- Developed a template to generate scripts dynamically, saving development time.
Senior Data Engineer
Greater Western Water
- Developed a data quality check framework written in PySpark and reusable for different data pipelines.
- Built a prototype of a data quality check pipeline using Great Expectations to demonstrate a use case for the team.
- Designed a data flow architecture for a data quality framework and handled data storage and table modeling.
- Analyzed the company's source data for different business units, built a data model, designed data integration logic, and developed a data pipeline to consolidate data according to business rules.
Data and Reporting Engineer
Analytical Technologies Group LLC
- Analyzed the company's accounting and operational data and determined how the data could be used for custom reporting requirements not covered by the applications' reporting capabilities.
- Developed a data extract process for the company's accounting data from QuickBooks, via the QuickBooks Desktop database and the QuickBooks Online API, structuring raw data into formats that suit Power BI reporting.
- Developed Power BI reports using complex DAX as per the reporting specifications provided, continuously suggesting improvements to the report designs based on knowledge gained from data analysis.
BFB Pty Limited
- Developed a Python script and SQL Server stored procedures to scan and pull JSON blobs from Azure Blob Storage, then process JSON files into the SQL Server tables, including designing the table schemas and analyzing JSON data.
- Designed the data flow for frequent ingestion of webhook JSON payloads, leveraging existing technologies and the Azure Python SDK.
- Documented the solution design and instructions for further changes for the business users.
- Developed a data pipeline to handle complex consumer transactions and loyalty data from SFTP and API data feeds in structured and semi-structured formats using PySpark, Amazon Athena, SQL, Python, Airflow, Amazon S3, and AWS Glue.
- Built a data pipeline to ingest real-estate market data, apply transformations and address clean-up, then present data to be used for marketing campaigns using Amazon Athena, SQL, Python, Airflow, Amazon S3, and AWS Glue.
- Facilitated team Agile ceremonies as the team's Scrum Master, collected feedback regularly, and initiated a few changes to Agile practices to work better for the team.
- Assisted business users with source API changes by testing, identifying, and assessing the impact on the organization's data platform.
Business Intelligence and Data Engineer
- Developed a metadata-driven ETL framework and data pipeline to extract data from D365 BYOD, Salesforce, Amazon S3, and other databases, then transform and load it into the EDW following the Kimball dimensional modeling methodology using Matillion and Snowflake.
- Designed Power BI workspace architecture by considering access and security by business departments and different data use cases for reporting and developed Power BI datasets and reports as per agreed reporting requirements.
- Confirmed data and reporting requirements with different stakeholders so that data was extracted from the correct sources with the right access, then arranged, transformed, and presented the data to meet end users' expectations.
BI Consultant of Managed Services
- Awarded and recognized as the best employee for contributions to excellent client relationships in 2017.
- Led upgrade and migration projects with clients, such as SQL Server migration to Always On availability, cloud to on-premises, and on-premises to cloud migrations, and contributed to their success by resolving unexpected issues within the given timeframe.
- Automated numerous processes including, but not limited to, manual monitoring and reporting tasks, e.g., spent 1.5 days developing a script that eliminated manual steps costing a finance user the equivalent of two days a month.
- Enhanced data pipelines and significantly improved batch ETL performance for the clients' BI environments, e.g., overnight ETL run duration was reduced from 10+ hours to 5-6 hours to meet the data availability SLA for one of the clients.
- Onboarded numerous new clients with various technologies and tools and supported their environments by self-training in new technologies in a given timeframe.
- Mentored other team members as a senior-level consultant, suggesting guidelines for issues they could not resolve easily.
- Developed data pipelines to onboard newly required data, apply transformations, and present data for reporting using Kimball dimensional modeling in clients' BI environments.
- Created and enhanced complex finance and HR reports for clients using different reporting tools, such as Power BI, SSRS, and Tableau.
EROAD Business Intelligence
I developed the data pipeline to ingest data from different sources and placed the data into star-schema modeled tables in the data warehouse, following the Kimball dimensional modeling methodology.
I created the Power BI workspace environments and developed data sources with all the tables required for user self-service reporting by defining table relationships, and built dashboards for monitoring sales pipelines.
The technologies used were Snowflake, Power BI, SQL, Python, Matillion, Azure DB, and AWS.
Luigi Data Pipeline
Twitter Tweet Streaming Using Kafka in Python
https://github.com/sidneypark22/apache-kafka-twitter-streaming/blob/main/README.md
Creating a Simple Amazon S3 Data Lake Using Glue Crawler From PostgreSQL Sample Database
https://medium.com/@spa0220/creating-simple-aws-s3-data-lake-using-glue-crawler-from-postgresql-dvdrental-on-macos-e2ecc1490e27
The CSV files are created by extracting data from the publicly available PostgreSQL DVD rental sample database.
A Python script is then used to streamline the process of extracting data from the PostgreSQL database, saving the outputs to CSV files, and uploading them to the Amazon S3 bucket in a specific folder structure.
A Glue crawler is used to scan the files and create table metadata in the Glue Data Catalog. Once the tables are formed in the Glue Data Catalog, you can use EMR or Athena to query them; in this case, Athena is used.
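The extract-and-upload step described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script from the linked article: the function names, the one-folder-per-table key layout, and the example bucket and prefix are all hypothetical. The only AWS call assumed is the boto3 S3 client's put_object method.

```python
import csv
import io


def build_s3_key(prefix: str, table: str) -> str:
    """Return an object key that places each table in its own folder.

    The one-folder-per-table layout is an assumption made for this sketch,
    chosen so that a crawler can treat each folder as a separate table.
    """
    return f"{prefix.strip('/')}/{table}/{table}.csv"


def rows_to_csv(header, rows) -> str:
    """Serialize a header and row tuples (e.g. from a database cursor) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def upload_table(s3_client, bucket, prefix, table, header, rows) -> str:
    """Upload one table's CSV to S3 and return the key it was written to."""
    key = build_s3_key(prefix, table)
    body = rows_to_csv(header, rows).encode("utf-8")
    # put_object is the boto3 S3 client method; any object exposing a
    # compatible put_object(Bucket=..., Key=..., Body=...) works here.
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return key
```

In practice, s3_client would likely be boto3.client("s3"), and header and rows would come from a query against the PostgreSQL database. Passing the client in as a parameter keeps the sketch runnable without AWS credentials.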
Loyalty NZ Single Source of Truth for Transactions
Altis Consulting Business Intelligence
Pacific National Business Intelligence
TransGrid Business Intelligence
Partners Life Claim Business Intelligence
Analytical Technologies Group Finance and Operations Reporting
SQL, Python, Snowflake, T-SQL (Transact-SQL), Power Query M
SQL Server BI, Microsoft Power BI, Tableau, AWS Glue, Amazon Athena, Microsoft Excel, IBM Cognos, GitHub, Apache Airflow, Amazon Elastic MapReduce (EMR), SSAS, WhereScape RED, Power Query
Business Intelligence (BI), ETL, ETL Implementation & Design, API Architecture, Dimensional Modeling, Agile, Azure DevOps, Kimball Methodology
Microsoft BI Stack, Amazon Web Services (AWS), Azure, Oracle, Salesforce, Apache Kafka, Databricks, Windows Server 2016
SQL Server 2012, Data Pipelines, Microsoft SQL Server, SQL Server Integration Services (SSIS), Data Lakes, Databases, Azure SQL Databases, Amazon S3 (AWS S3), JSON, SQL Server 2008, Apache Hive, Redshift, PostgreSQL, SQL Stored Procedures, SQL Server DBA, SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), Azure SQL, Azure Blobs, Oracle 11g, SQL Server 2016, SSAS Tabular, MySQL
Star Schema, Data Analysis, Data Visualization, ELT, Data Engineering, Reporting, Data Analytics, ETL Tools, Data Modeling, ETL Development, Microsoft Data Transformation Services (now SSIS), Data, Analytical Dashboards, Azure Data Factory, Azure Data Lake, DAX, Excel 365, CSV Import, CSV Export, Logistics, AWS Certified Solution Architect, Reports, Communication, Business Processes, Financial Accounting, Management Accounting, Finance, Enterprise Systems, MicroStrategy, Microsoft Dynamics 365, Scrum Master, SAP, SAP Business Intelligence (BI), APIs, Data Warehousing, Data Warehouse Design, BI Reporting, Amazon RDS, Dashboards, ServiceLedger, Intuit QuickBooks, QuickBooks Online, Azure Databricks, Delta Lake, CI/CD Pipelines, Data Quality Management, Data Quality Analysis, Data Quality, Tableau Server, Web Scraping, Business Analysis, SSRS Reports, WhereScape, Data Build Tool (dbt), Geospatial Data, Microsoft 365, Microsoft Power Automate, API Integration, Data Architecture, Technical Architecture, Monitoring, Data Auditing, Cloud, Solution Architecture, Azure SQL Data Warehouse (SQL DW)
PySpark, Luigi, Azure Blob Storage API, QuickBooks API, Pandas, ODBC, SendGrid API
Spark, Flask, Apache Spark
Bachelor's Degree in Accounting and Information Systems
University of Auckland - Auckland, New Zealand
Databricks Certified Data Engineer Professional
Microsoft Certified: Azure Solutions Architect Expert
Microsoft Certified: Azure Administrator Associate
AWS Solutions Architect – Associate
Amazon Web Services
Databricks Certified Associate Developer for Apache Spark 3.0
AWS Certified Data Analytics - Specialty
Amazon Web Services
Microsoft Certified: Power BI Data Analyst Associate
Microsoft Certified: Azure Data Engineer Associate
Microsoft Certified Professional: MCSA SQL Server 2012/2014
Stephen Few: Visual Business Intelligence Workshop
Dimensional Modelling: The Kimball Method Workshop