Shawn Xiao, Azure Data Factory Developer in Auckland, New Zealand
Member since August 3, 2020
Shawn has worked in data management and data analytics across a range of industries for the last 15 years. He is also a Microsoft Certified Solutions Expert in data management and analytics and is familiar with technologies such as Azure, AWS, big data, Spark, SQL, Hadoop, BI, DW, and Tableau. Shawn has strong problem-solving and root cause analysis skills.

Experience

Location

Auckland, New Zealand

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Azure, Big Data, Python, SQL

The most amazing...

...thing I've optimized is an end-to-end (E2E) overnight data processing solution, reducing the run time from about 16 hours to 4.5 hours.

Employment

  • Senior Database Specialist

    2020 - PRESENT
    Woolworths New Zealand
    • Upgraded and migrated an on-premises SQL Server instance to Azure SQL Virtual Machines so the server could scale up and provide a better online shopping experience for about two million customers.
    • Tuned the performance of a SQL Server instance, including refactoring a query to cut its execution time from two minutes to 20 seconds.
    • Containerized the SQL database instance to increase the developers' productivity.
    Technologies: Docker, Windows PowerShell, SQL Azure, SQL
  • Senior Data Engineer

    2019 - 2020
    Plexure
    • Optimized the E2E data process and pipeline to reduce the overnight load from 16.5 hours to 4.5 hours, so that reports could be presented to key external business partners.
    • Designed and developed data pipelines on Azure using Azure Data Factory to process sales data for 200 million users to meet business needs.
    • Designed and developed data pipelines running on AWS using PySpark, Glue, EMR, Lambda, and Redshift to process mobile app data for key external customers.
    Technologies: Amazon Web Services (AWS), Azure Data Factory, Azure SQL Databases, Redshift, AWS Lambda, AWS EMR, AWS Glue, PySpark, SQL, Azure
  • Senior Managed Services BI Consultant

    2016 - 2019
    Altis Consulting
    • Developed data pipelines to extract, load, and transform 1,000,000 sales transaction records daily from CSV files into a Redshift database using AWS EMR and Lambda for a fast-food chain brand.
    • Developed data pipelines using Azure Data Factory to extract, load, and transform signal data, sent every five minutes via the signal tower API, into an Azure SQL database to track the performance of towers countrywide.
    • Developed and maintained business intelligence solutions running on SSIS, SAP Data Services, and IBM Cognos for companies in industries such as utilities, higher education, and transportation.
    Technologies: Azure Data Factory, ETL Tools, SQL Server Integration Services (SSIS), Python, AWS EMR, AWS Lambda, SQL, Azure
  • Database Developer

    2013 - 2016
    Fisher & Paykel Healthcare Corporation, Ltd.
    • Planned, designed, and implemented the SQL Server Enterprise platform (2012, 2014) to support $100 million in business growth over the following three years.
    • Designed and implemented a backup and restore strategy, data security strategy, data storage strategy, high availability (HA), and disaster recovery (DR) strategy.
    • Monitored, troubleshot, and optimized the database system running on MSSQL Server (2000, 2005, 2008, 2012, 2014) and Azure SQL.
    Technologies: SAP BusinessObjects Data Service (BODS), SQL Server 2012, SQL
  • BI Developer | Data Warehouse Developer

    2010 - 2013
    Microsoft
    • Built and implemented BI solutions for WoS reports, which helped the VP of the department track inventory and produce forecasts for products such as Xbox One and Surface.
    • Designed and implemented an E2E solution to help supply chain users track sales and delivery status for MS products (Surface). Designed and developed extraction, transformation, and load (ETL) processes to support data integration needs.
    • Developed and optimized database applications in SQL Server. Performed administrative tasks for the BI and ETL tools and assisted in the deployment of the application code.
    Technologies: Business Intelligence (BI), SQL Server 2012, SQL

Experience

  • Data Pipeline Development

    Designed and developed a data pipeline to extract, load, and transform data from Azure to AWS.

    Developed the data pipeline using Azure Data Factory to extract data from Azure Table storage and land it in Azure Blob storage. An Azure Function was then triggered to copy the data to AWS S3, where an AWS Glue job triggered a Lambda function to write it to the S3 sink bucket. An overnight job then picked up the data and loaded it into the Redshift database.

Skills

  • Languages

    SQL, Python
  • Paradigms

    ETL, Database Development, Business Intelligence (BI)
  • Storage

    SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), Redshift, Azure SQL Databases, AWS S3, Data Pipelines, Azure Table Storage, SQL Server 2012, SQL Server 2014
  • Other

    Data Management, Performance Tuning, Big Data, Azure Data Factory, ETL Tools, Data Warehousing, Data Engineering, APIs, Data Modeling, Reporting, Data Analytics, AWS, Azure Data Lake, Data Warehouse Design, SAP BusinessObjects Data Service (BODS), Computer Science
  • Frameworks

    Windows PowerShell, AWS EMR, Apache Spark
  • Libraries/APIs

    PySpark
  • Tools

    AWS Glue, Microsoft Power BI, Apache Airflow
  • Platforms

    Azure, Amazon Web Services (AWS), Docker, AWS Lambda

Education

  • Bachelor's degree in Computer Science
    2006 - 2009
    Shenzhen Open University - Shenzhen, China

Certifications

  • Apache Spark Big Data and Python
    MARCH 2020 - PRESENT
    Databricks, Inc.
  • MCSE - Data Management and Analytics
    SEPTEMBER 2017 - PRESENT
    Microsoft
