Nigel Chang

Data Warehouse Design Developer in San Francisco, CA, United States

Member since April 1, 2019
Nigel is a senior software and data engineer experienced with cloud, Linux, AWS, GCP, Snowflake, Hadoop, and nearly every major computer and database platform. He has led and contributed to data analytics, ETL data pipeline, transaction processing, self-driving, and data science teams at eCommerce and self-driving startups as well as the world's largest brokerage, retail, semiconductor, communication, network, and storage enterprises.

Portfolio

  • Amazon
    Amazon Web Services (AWS), Amazon EC2, Amazon S3 (AWS S3), Redshift...
  • PepsiCo
    Apache Airflow, Snowflake, Amazon S3 (AWS S3), Docker Compose...
  • Cyngn
    Amazon Web Services (AWS), REST APIs, Jira, Tableau, Python, MySQL, MongoDB...

Location

San Francisco, CA, United States

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Python, Redshift, Hadoop, Snowflake, Linux

The most amazing...

...thing I've worked on was running 25+ data pipelines and supporting all business teams at a $200 million eCommerce startup as its only data engineer.

Employment

  • Senior Data Engineer

    2020 - PRESENT
    Amazon
    • Served on the Workforce Staffing data engineering team. Developed ETL pipelines, data mapping, data modeling, a data lake, and data flows to fill labor orders.
    • Developed Airflow DAGs, tasks, operators, and connections with Python and SQL (a sketch follows this entry).
    • Worked with business intelligence engineers and a data analyst to create dashboards.
    Technologies: Amazon Web Services (AWS), Amazon EC2, Amazon S3 (AWS S3), Redshift, CI/CD Pipelines, Python 3, Apache Airflow, Data Build Tool (dbt), SQL
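
    A minimal sketch of what one such DAG might look like, assuming Airflow 2.x with the Postgres provider; the DAG ID, SQL, and the redshift_default connection ID are hypothetical placeholders rather than the actual Amazon pipeline:

        from datetime import datetime

        from airflow import DAG
        from airflow.operators.python import PythonOperator
        from airflow.providers.postgres.hooks.postgres import PostgresHook

        def load_labor_orders():
            # Redshift speaks the Postgres wire protocol, so PostgresHook works;
            # the connection ID and table names are hypothetical.
            hook = PostgresHook(postgres_conn_id="redshift_default")
            hook.run("INSERT INTO reporting.labor_orders "
                     "SELECT * FROM staging.labor_orders_raw;")

        with DAG(
            dag_id="labor_order_etl",  # hypothetical DAG name
            start_date=datetime(2021, 1, 1),
            schedule_interval="@daily",
            catchup=False,
        ) as dag:
            PythonOperator(task_id="load_labor_orders",
                           python_callable=load_labor_orders)
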
  • Senior Data Engineer

    2020 - 2021
    PepsiCo
    • Served on the eCommerce ROI data engineering team. Developed ETL pipelines, data mapping, modeling, and data flows for 20+ advertising media sources, including Nielsen, Google, Amazon, Facebook, Twitter, OMD, and more.
    • Developed Airflow DAGs, tasks, operators, and connection variables that bring data from AWS S3 into Snowflake (a sketch follows this entry).
    • Developed data vault schemas and tables in Snowflake. Administered Snowflake databases, roles, warehouses, schemas, and tables.
    Technologies: Apache Airflow, Snowflake, Amazon S3 (AWS S3), Docker Compose, Kubernetes Operations (Kops), Data Build Tool (dbt), Argo CD, Jira, GitHub, Python 3, SQL
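
    A minimal sketch of the S3-to-Snowflake load described above, using the snowflake-connector-python package; the stage, table, warehouse, and environment variable names are hypothetical placeholders:

        import os

        import snowflake.connector

        # COPY INTO reads files from an external stage pointing at the S3 bucket;
        # the stage and table names are hypothetical.
        COPY_SQL = """
        COPY INTO raw.media_spend
        FROM @raw.media_stage/nielsen/
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
        ON_ERROR = 'ABORT_STATEMENT'
        """

        conn = snowflake.connector.connect(
            user=os.environ["SNOWFLAKE_USER"],
            password=os.environ["SNOWFLAKE_PASSWORD"],
            account=os.environ["SNOWFLAKE_ACCOUNT"],
            warehouse="ETL_WH",
            database="ECOMM",
            schema="RAW",
        )
        try:
            conn.cursor().execute(COPY_SQL)
        finally:
            conn.close()
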
  • Data Engineer

    2019 - 2020
    Cyngn
    • Created analytics, data pipelines, and ETL for a self-driving car fleet management system.
    • Used AWS Redshift, EC2, S3, Python, Database Migration Service, and MongoDB.
    • Developed data pipelines and data flows for vehicle heartbeats and weather API data that fed Tableau analytics dashboards (a sketch follows this entry).
    Technologies: Amazon Web Services (AWS), REST APIs, Jira, Tableau, Python, MySQL, MongoDB, Document Management Systems (DMS), Amazon S3 (AWS S3), Amazon EC2, Redshift, SQL
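
    A minimal sketch of the weather-API leg of those pipelines, landing raw JSON in S3 for a downstream Redshift load; the endpoint URL, bucket, and station parameter are hypothetical placeholders:

        import datetime
        import json

        import boto3
        import requests

        WEATHER_URL = "https://api.example.com/v1/observations"  # hypothetical endpoint
        BUCKET = "fleet-raw-data"                                # hypothetical bucket

        # Pull one batch of observations; fail loudly on HTTP errors.
        resp = requests.get(WEATHER_URL, params={"station": "KSFO"}, timeout=30)
        resp.raise_for_status()

        # Partition raw landings by date so the warehouse load can run incrementally.
        key = f"weather/{datetime.date.today():%Y/%m/%d}/observations.json"
        boto3.client("s3").put_object(Bucket=BUCKET, Key=key,
                                      Body=json.dumps(resp.json()))
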
  • Data Engineer

    2019 - 2020
    Cisco
    • Developed a B2B customer contact hub dataset supporting machine learning, AI, software renewals, NPS surveys, and sales campaign automation (a deduplication sketch follows this entry).
    • Built a data pipeline framework, guidelines, production procedures, data architecture, and code review process. Led and educated junior Python developers.
    • Developed an internal Salesforce contact dataset and synced it with an external Salesforce object.
    • Migrated data foundation from Hadoop to Snowflake, GCP BigQuery, GCE, GCS, Airflow, and Cloud Gateway Server.
    Technologies: Jira, GitHub, BigQuery, Google Cloud Storage, Google Compute Engine (GCE), Google Cloud Platform (GCP), Snowflake, JSON, Apache Hive, Hadoop, Spark SQL, PySpark, Python, Apache Airflow, Salesforce, SQL
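
    A minimal sketch of contact deduplication in PySpark, in the spirit of the contact hub dataset above; the Hive table and column names are hypothetical placeholders:

        from pyspark.sql import SparkSession, Window
        from pyspark.sql import functions as F

        spark = (SparkSession.builder
                 .appName("contact_dedup")
                 .enableHiveSupport()
                 .getOrCreate())

        # Keep only the most recently updated row per email address.
        contacts = spark.table("b2b.raw_contacts")  # hypothetical Hive table
        latest_first = Window.partitionBy("email").orderBy(F.col("updated_at").desc())

        deduped = (contacts
                   .withColumn("rn", F.row_number().over(latest_first))
                   .filter(F.col("rn") == 1)
                   .drop("rn"))
        deduped.write.mode("overwrite").saveAsTable("b2b.contacts_clean")
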
  • Big Data Engineer

    2017 - 2018
    Western Digital
    • Developed and supported enterprise data management big data engineering: ETL pipelines for production images and data from head and drive wafer fabs worldwide.
    • Rebuilt, managed, and tuned large production enterprise data management AWS Redshift clusters to support large-volume pipelines and user queries.
    • Supported AWS Redshift, Redshift Spectrum, Elasticsearch, Kinesis, S3, EC2, RDS, MySQL, PostgreSQL, Aurora, and CloudWatch. Managed Control-M, Spotfire, and SnapLogic ETL.
    • Supported the wafer images defect model machine learning platform.
    • Worked with Slack, Hadoop, Hive, Impala, Python, NumPy, SciPy, SVM, SVD, GitHub, Bitbucket, Jenkins, Tidal, Java, Jira, Wiki, and Confluence.
    Technologies: Amazon Web Services (AWS), Elasticsearch, Bash, PostgreSQL, Amazon EC2, Amazon S3 (AWS S3), Python, Redshift, SQL
  • Lead Data Engineer

    2015 - 2017
    ModCloth
    • Developed and maintained data engineering, data analytics, 25 ETL pipelines, and a data warehouse for an online shopping eCommerce company as its only data engineer.
    • Constructed and managed Salesforce Commerce Cloud (formerly Demandware), Square POS, eCommerce replication with Percona FlexCDC, Adobe Omniture Marketing Cloud, Oracle Responsys, ScientiaMobile WURFL, Qualtrics, Zodiac, ShopKeep, Acuity, and RetailNext.
    • Developed data pipelines with various vendors using GitHub, Python, C/C++, Java, REST API, JSON, XML, CSV, TSV, Jira, and Slack.
    • Designed an Azure migration covering Azure SQL Data Warehouse, Blob Storage, and Linux VMs.
    Technologies: Amazon Web Services (AWS), PostgreSQL, MySQL, Python, Bash, Amazon EC2, Amazon S3 (AWS S3), Redshift, SQL
  • Software System Engineer

    2002 - 2015
    Charles Schwab
    • Built a new portfolio accounting system on Linux as the very first engineer.
    • Led the Sparks team and built a cost basis accounting system and a reporting repository data warehouse.
    • Built and supported Eagle Investment Systems STAR and PACE products.
    • Supported and migrated the mainframe-based system to Red Hat Linux/Solaris VMware servers and 100TB+ Oracle 9/10/11/12 RAC/TAF environments on EMC/HDS storage with Data Guard/GoldenGate.
    • Developed and supported partitioning, parallel processing, ESP scheduling, high availability/failover, disaster recovery, Tivoli monitoring, Splunk, and Zenoss.
    • Implemented and supported development and production environments for OLTP, OLAP, ETL, distributed messaging (MQ), iPlanet/Apache, application servers, Oracle 9/10/11/12 RAC databases, and Data Guard.
    • Built and supported multiple TB scale development and performance/volume/stress testing environments.
    • Developed systems and applications with Java, Perl, Shell, Python, SQL, PL/SQL, and XML languages.
    • Educated the team on SQL and RDBMS, MySQL/SQL Server, and a data-driven documents library.
    Technologies: SQL, Bash, Perl, Red Hat Linux, Oracle, Linux

Experience

  • Enterprise Sales AI Data Engineering

    Sales AI campaign data science required data pipelines and machine learning models. Major sources, primarily email and phone contact data, arrived via external REST APIs, webhooks, and S3 and landed in Snowflake, GCP, Hadoop, and Hive. Stack: Hadoop, Spark, PySpark, Hive, Google Cloud Platform, and Snowflake.

    The pipeline includes eight tasks: data extraction and ingestion, data deduplication, data transformation, data incremental load, data filtering, offer data generation, offer motion data generation, and data enrichment. Tooling included GitHub, JSON, XML, Jira, Wiki, and Confluence. Single-handedly completed the Hadoop-to-Snowflake migration and mentored junior engineers.
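
    A minimal sketch of that eight-task shape as an Airflow DAG, assuming Airflow 2.3+; the task bodies are omitted and the DAG ID is a hypothetical placeholder:

        from datetime import datetime

        from airflow import DAG
        from airflow.operators.empty import EmptyOperator

        # The eight stages listed above, wired as a linear chain.
        STAGES = [
            "extract_and_ingest", "deduplicate", "transform",
            "incremental_load", "filter", "generate_offer_data",
            "generate_offer_motion_data", "enrich",
        ]

        with DAG(dag_id="sales_ai_contact_pipeline",  # hypothetical name
                 start_date=datetime(2020, 1, 1),
                 schedule_interval="@daily",
                 catchup=False) as dag:
            tasks = [EmptyOperator(task_id=name) for name in STAGES]
            for upstream, downstream in zip(tasks, tasks[1:]):
                upstream >> downstream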

  • eCommerce Data Pipeline Migration

    Single-handedly developed and supported 25+ eCommerce data pipelines covering products, transactions, KPIs, marketing, merchandising, planning, finance, fraud detection, LTV, Square, REST APIs, and fulfillment. Migrated in-house transaction and ETL pipelines built on AWS Redshift, MySQL, MongoDB, Postgres, EC2, S3, and Data Migration Services to Salesforce Commerce Cloud and Azure. Built Tableau Server and desktop dashboards. Pipelined 20+ marketing partners including Google Analytics, Adobe Omniture, Oracle Responsys, and Salesforce Marketing Cloud. Tooling: Python, Bash, JSON, XML, Jira, and Slack.

  • Brokerage Portfolio Accounting System

    Built the first Linux- and Oracle-based portfolio accounting system for the largest brokerage firm on the West Coast, serving 16 million customers.

  • Enterprise Data Management

    Built wafer data ETL pipelines from wafer factories worldwide for the world's largest storage company. Stack: AWS Redshift, EC2, S3, Elasticsearch, JSON, Python, Teradata, SQL Server, Oracle, and Control-M. Re-created all the Redshift tables to restore query performance (a sketch follows).
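
    A minimal sketch of that kind of rebuild: re-creating a table with an explicit distribution and sort key through psycopg2, then swapping names; the schema, column, and environment variable names are hypothetical placeholders:

        import os

        import psycopg2

        # CTAS with explicit DISTKEY/SORTKEY, then swap the tables by renaming.
        DDL = """
        CREATE TABLE edm.wafer_measurements_new
            DISTKEY (wafer_id)
            SORTKEY (measured_at)
        AS SELECT * FROM edm.wafer_measurements;

        ALTER TABLE edm.wafer_measurements RENAME TO wafer_measurements_old;
        ALTER TABLE edm.wafer_measurements_new RENAME TO wafer_measurements;
        """

        with psycopg2.connect(host=os.environ["REDSHIFT_HOST"], port=5439,
                              dbname="edm",
                              user=os.environ["REDSHIFT_USER"],
                              password=os.environ["REDSHIFT_PASSWORD"]) as conn:
            with conn.cursor() as cur:
                cur.execute(DDL)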

  • Self-Driving Analytics

    Built data pipelines for a self-driving car company's fleet management system with real-time heartbeats, analytics dashboards, and products. Stack: AWS Redshift, EC2, MySQL, MongoDB, AWS DMS, Python, JSON, and Tableau.

Skills

  • Languages

    Python, C, Bash, SQL, Snowflake, Java, Perl, Python 3, C++
  • Frameworks

    Hadoop, AWS EMR
  • Libraries/APIs

    PySpark, REST APIs, NumPy, SciPy, Spark Streaming
  • Tools

    MongoDB Shell, Jira, GitHub, Google Compute Engine (GCE), Cisco Tidal Enterprise Scheduler, Spark SQL, BigQuery, Docker Compose, Tableau, Google Cloud Dataproc, Google Cloud Composer, Apache Airflow, Kafka Streams
  • Paradigms

    ETL
  • Platforms

    Amazon EC2, Google Cloud Platform (GCP), Linux, Oracle, Red Hat Linux, Amazon Web Services (AWS), Salesforce
  • Storage

    MongoDB, Amazon S3 (AWS S3), MySQL, PostgreSQL, Apache Hive, Elasticsearch, JSON, Redshift, Google Cloud Storage, Google Cloud SQL, Datastage
  • Other

    Machine Learning, Data Warehouse Design, Software Development, Google BigQuery, eCommerce, Data Warehousing, Tableau Server, Document Management Systems (DMS), Kubernetes Operations (Kops), Data Build Tool (dbt), Argo CD, CI/CD Pipelines

Education

  • Master of Science Degree in Computer Science
    1986 - 1987
    Indiana University - Bloomington, Indiana
  • Bachelor of Science Degree in Engineering
    1978 - 1982
    National Taiwan University - Taipei, Taiwan
