Nigel Chang

Verified Expert in Engineering

Data Warehousing Developer

San Francisco, CA, United States

Toptal member since April 1, 2019

Expertise

Software Development Data Warehouse Machine Learning SQL Python C Hadoop MySQL PostgreSQL JSON ETL Jira GitHub Amazon S3

Bio

Nigel is a senior software and data engineer experienced with cloud, Linux, AWS, GCP, Snowflake, Hadoop, and almost all computer and database platforms. He has led and contributed to eCommerce and self-driving startups, as well as the world's largest brokerage, retail, semiconductor, communication, network, and storage enterprises, on data analytics, ETL data pipeline, transaction processing, self-driving, and data science teams.

Portfolio

Levi's

Amazon Web Services (AWS), PostgreSQL, AWS Glue

Bank of California

Amazon Web Services (AWS)

Saks Off Fifth

Amazon Web Services (AWS), Python, SQL, Data Build Tool (dbt), Snowflake, Oracle

Experience

Data Warehousing - 12 years
Data Warehouse Design - 12 years
Hadoop - 10 years
Python - 10 years
MySQL - 8 years
Redshift - 6 years
Amazon S3 (AWS S3) - 6 years
Google Cloud Platform (GCP) - 2 years

Preferred Environment

Amazon Web Services (AWS), Python, Redshift, Hadoop, Snowflake, Linux

The most amazing...

...thing that I've worked on was running 25+ data pipelines and supporting all business teams at a $200 million eCommerce startup as the only available data engineer.

Work Experience

Senior Data Engineer

2021 - PRESENT

Levi's

Built the back end of the global eCommerce loyalty system for the North America and Europe regions, using AWS Aurora PostgreSQL, Glue, Python, SQL, S3, Redshift, Step Functions, Lambda, and Batch, as well as GCP BigQuery, as the platform and infrastructure.
Implemented, architected, and designed the entire database, data migration, and data flow from and to internal and external data sources and customers.
Designed and implemented a data engineering development ecosystem with Aurora database, GitHub, Jenkins, and Terraform.

Technologies: Amazon Web Services (AWS), PostgreSQL, AWS Glue

Senior Data Engineer

2022 - 2023

Bank of California

Built an AWS Redshift, PostgreSQL, and Glue-based reconciliation, monitoring, alerting, and analytics platform for a new payment system.
Designed and implemented all tables, indexes, views, pipelines, GitHub repository, branches, and GitHub actions for CI/CD.
Contributed to project planning with Jira, code review, code release, code deployment, data security, and data governance.

Technologies: Amazon Web Services (AWS)

Senior Data Engineer

2022 - 2023

Saks Off Fifth

Handled hands-on development of a brand-new data analytics platform and infrastructure using Snowflake, dbt, AWS, Airflow, and Fivetran.
Implemented, architected, and designed an entire ecosystem and the entire first phase of deliverables, including data extraction from multiple sources, ETL pipelines, Snowflake ingestion, and dbt transformation.
Handled data sources including Oracle-based POS Order, Product, eCommerce, Supply Chain, and Warehouse Management, Adobe Clickstream, Salesforce customer service, loyalty, fraud detection, audit data via Kafka, etc.

Technologies: Amazon Web Services (AWS), Python, SQL, Data Build Tool (dbt), Snowflake, Oracle

Senior Data Engineer

2020 - 2022

Amazon

Served as a member of the workforce staffing data engineering team. Developed ETL pipelines, data mapping, modeling, a data lake, and data flow to fill labor orders.
Developed Airflow DAGs with tasks, operators, and connections using Python and SQL.
Worked with business intelligence engineers and a data analyst to create dashboards.

Technologies: Amazon Web Services (AWS), Amazon EC2, Amazon S3 (AWS S3), Redshift, CI/CD Pipelines, Python 3, Apache Airflow, Data Build Tool (dbt), SQL

Senior Data Engineer

2020 - 2021

PepsiCo

Took part in the eCommerce ROI data engineering team. Developed ETL pipelines, data mapping, modeling, and data flow for 20+ advertising media sources, including Nielsen, Google, Amazon, Facebook, Twitter, OMD, and more.
Developed an Airflow DAG, with its tasks, operators, and a connection variable, that brings data from AWS S3 into Snowflake.
Developed a data vault schema and table in Snowflake. Supported the Snowflake database, role, warehouse, schema, and table.

Technologies: Apache Airflow, Snowflake, Amazon S3 (AWS S3), Docker Compose, Kubernetes Operations (kOps), Data Build Tool (dbt), Argo CD, Jira, GitHub, Python 3, SQL

Data Engineer

2019 - 2020

Cyngn

Created analytics, data pipelines, and ETL for a self-driving car fleet management system.
Used AWS Redshift, EC2, S3, Python, Database Migration Service, and MongoDB.
Developed data pipelines and data flow for vehicle heartbeats and weather API data. Fed Tableau analytics dashboards.

Technologies: Amazon Web Services (AWS), REST APIs, Jira, Tableau, Python, MySQL, MongoDB, Document Management Systems (DMS), Amazon S3 (AWS S3), Amazon EC2, Redshift, SQL

Data Engineer

2019 - 2020

Cisco

Developed a B2B customer contact hub dataset. Supported machine learning, AI, software renewal, NPS survey, and sales campaign automation.
Built a data pipeline framework, guidelines, production procedures, data architecture, and code review process. Led and educated junior Python developers.
Developed an internal Salesforce contact dataset and synced it with an external Salesforce object.
Migrated the data foundation from Hadoop to Snowflake, GCP BigQuery, GCE, GCS, Airflow, and Cloud Gateway Server.

Technologies: Jira, GitHub, BigQuery, Google Cloud Storage, Google Compute Engine (GCE), Google Cloud Platform (GCP), Snowflake, JSON, Apache Hive, Hadoop, Spark SQL, PySpark, Python, Apache Airflow, Salesforce, SQL

Big Data Engineer

2017 - 2018

Western Digital

Developed and supported big data engineering for enterprise data management, covering image and data ETL pipelines for head and drive wafer fab production worldwide.
Rebuilt, managed, and tuned large production enterprise data management AWS Redshift clusters to support large-volume pipelines and user queries.
Supported AWS Redshift, Redshift Spectrum, Elasticsearch, Kinesis, S3, EC2, RDS, MySQL, PostgreSQL, Aurora, and CloudWatch. Managed Control-M, Spotfire, and SnapLogic ETL.
Supported the wafer images defect model machine learning platform.
Worked with Slack, Hadoop, Hive, Impala, Python, NumPy, SciPy, SVM, SVD, GitHub, Bitbucket, Jenkins, Tidal, Java, Jira, Wiki, and Confluence.

Technologies: Amazon Web Services (AWS), Elasticsearch, Bash, PostgreSQL, Amazon EC2, Amazon S3 (AWS S3), Python, Redshift, SQL

Lead Data Engineer

2015 - 2017

ModCloth

Developed and maintained data engineering, data analytics, 25 ETL pipelines, and a data warehouse for an online shopping eCommerce business as the only available data engineer.
Constructed and managed Salesforce Commerce Cloud (aka Demandware), Square POS, eCommerce replication with Percona FlexCDC, Adobe Omniture Marketing Cloud, Oracle Responsys, ScientiaMobile WURFL, Qualtrics, Zodiac, ShopKeep, Acuity, and RetailNext.
Developed data pipelines with various vendors using GitHub, Python, C/C++, Java, REST API, JSON, XML, CSV, TSV, Jira, and Slack.
Designed a migration to Azure covering Azure SQL Data Warehouse, Blob Storage, and Linux VMs.

Technologies: Amazon Web Services (AWS), PostgreSQL, MySQL, Python, Bash, Amazon EC2, Amazon S3 (AWS S3), Redshift, SQL

Software System Engineer

2002 - 2015

Charles Schwab

Built a new portfolio accounting system on Linux as the very first engineer.
Led the sparks team and built a cost basis accounting system and a reporting repository data warehouse.
Built and supported Eagle Investment Systems STAR and PACE products.
Supported and migrated the mainframe-based system to Red Hat Linux/Solaris VMware servers and 100TB+ scale Oracle 9/10/11/12 RAC/TAF/EMC/HDS-based Data Guard/GoldenGate environments.
Developed and supported partitioning, parallel processing, ESP scheduling, high availability/failover, disaster recovery, Tivoli monitoring, Splunk, and Zenoss.
Implemented and supported both the development and production of OLTP, OLAP, ETL, distributed messaging (MQ), iPlanet/Apache, application server, Oracle 9/10/11/12 RAC databases, and Data Guard.
Built and supported multiple TB scale development and performance/volume/stress testing environments.
Developed systems and applications with Java, Perl, Shell, Python, SQL, PL/SQL, and XML languages.
Trained the team on SQL and RDBMS, MySQL/SQL Server, and a data-driven documents library.

Technologies: SQL, Bash, Perl, Red Hat Linux, Oracle, Linux

Experience

Enterprise Sales AI Data Engineering

The sales AI campaign data science effort required a data pipeline and machine learning models. The major sources were external REST APIs, webhooks, S3, Snowflake, GCP, Hadoop, and Hive, and the data was primarily email and phone contacts. I used Hadoop, Spark, PySpark, Hive, Google Cloud Platform, and Snowflake.

The pipeline includes eight tasks: data extraction and ingestion, data deduplication, data transformation, data incremental load, data filtering, offer data generation, offer motion data generation, and data enrichment. I also used GitHub, JSON, XML, Jira, Wiki, and Confluence. I completed the Hadoop-to-Snowflake migration on my own and mentored junior engineers.

eCommerce Data Pipeline Migration

I single-handedly developed and supported 25+ eCommerce data pipelines covering products, transactions, KPI, marketing, merchandising, planning, finance, fraud detection, LTV, Square, REST API, and fulfillment. I migrated in-house-built transaction systems and ETL pipelines (AWS Redshift, MySQL, MongoDB, Postgres, EC2, S3, and data migration services) to Salesforce Commerce Cloud and Azure, including Tableau Server and desktop dashboards. I built pipelines for 20+ marketing partners, including Google Analytics, Adobe Omniture, Oracle Responsys, and Salesforce Marketing Cloud. I also used Python, Bash, JSON, XML, Jira, and Slack.

Brokerage Portfolio Accounting System

I built the first Linux- and Oracle-based portfolio accounting system for the largest brokerage firm on the West Coast, which has 16 million customers. I migrated the system from the mainframe to Linux and Oracle. I also handled cost basis accounting.

Enterprise Data Management

I built wafer data ETL pipelines at wafer factories around the world for the world's largest storage company. I used AWS Redshift, EC2, S3, Elasticsearch, JSON, Python, Teradata, SQL Server, Oracle, and Control-M. I recreated all the tables in Redshift to improve performance.

Self-driving Analytics

I built data pipelines for a self-driving car company's fleet management system with real-time heartbeats, analytics dashboards, and products. I used AWS Redshift, EC2, MySQL, MongoDB, AWS DMS, Python, JSON, and Tableau.

Education

1986 - 1987

Master of Science Degree in Computer Science

Indiana University - Bloomington, Indiana

1978 - 1982

Bachelor of Science Degree in Engineering

National Taiwan University - Taipei, Taiwan

Skills

Libraries/APIs

PySpark, REST APIs, NumPy, SciPy, Spark Streaming

Tools

MongoDB Shell, Jira, GitHub, Google Compute Engine (GCE), Amazon Elastic MapReduce (EMR), Cisco Tidal Enterprise Scheduler, Spark SQL, BigQuery, Docker Compose, Tableau, Google Cloud Dataproc, Google Cloud Composer, Apache Airflow, Kafka Streams, AWS Glue

Languages

Python, C, Bash, SQL, Snowflake, Java, Perl, Python 3, C++

Frameworks

Hadoop

Paradigms

ETL

Platforms

Amazon EC2, Google Cloud Platform (GCP), Linux, HubSpot, Oracle, Red Hat Linux, Amazon Web Services (AWS), Salesforce

Storage

MongoDB, Amazon S3 (AWS S3), MySQL, PostgreSQL, Apache Hive, Elasticsearch, JSON, Redshift, Google Cloud Storage, Google Cloud SQL, Google Cloud, DataStage

Other

Machine Learning, Data Warehouse Design, Software Development, Google BigQuery, eCommerce, Data Warehousing, Tableau Server, Document Management Systems (DMS), Kubernetes Operations (kOps), Data Build Tool (dbt), Argo CD, CI/CD Pipelines
