Nigel Chang, Software Developer in San Francisco, CA, United States

Member since March 13, 2019
Nigel is a senior data engineer, software engineer, and DBA experienced with cloud platforms, Linux, AWS, GCP, Snowflake, Hadoop, and almost all database platforms. He has led and contributed to eCommerce and self-driving startups as well as the world's largest brokerage, retail, semiconductor, communication, network, and storage enterprises, working on data analytics, ETL data pipeline, transaction processing, self-driving, and data science teams.
Nigel is now available for hire

Portfolio

  • Cisco
    Python, PySpark, Spark SQL, Hadoop, Hive, JSON...
  • Western Digital
    AWS, Redshift, Python, S3, EC2, Postgres, Bash, Elasticsearch
  • ModCloth
    AWS, Redshift, S3, EC2, Bash, Python, MySQL, Postgres

Experience

  • SQL, 20 years
  • AWS EMR, 5 years
  • AWS S3, 5 years
  • Hadoop, 5 years
  • PySpark, 3 years
  • Apache Hive, 2 years

Location

San Francisco, CA, United States

Availability

Part-time

Preferred Environment

Linux, AWS, GCP, Hadoop, Redshift, Python

The most amazing...

...thing that I've worked on was running 25+ data pipelines and supporting all business teams at a $200 million eCommerce startup as the only available data engineer.

Employment

  • Data Engineer

    2019 - PRESENT
    Cisco
    • Supported machine learning, AI, and sales campaign automation.
    • Built the data pipeline framework, guidelines, production procedures, data architecture, and code review process.
    • Led junior Python and PySpark developers.
    Technologies: Python, PySpark, Spark SQL, Hadoop, Hive, JSON, Google Cloud Platform, GitHub, Tidal, JIRA
  • Big Data Engineer

    2017 - 2018
    Western Digital
    • Developed and supported Enterprise Data Management big data engineering, including worldwide head and drive wafer fab production image and data ETL pipelines.
    • Rebuilt, managed, and tuned large production Enterprise Data Management AWS Redshift clusters to support large-volume pipelines and user queries.
    • Supported AWS Redshift, Redshift Spectrum, Elasticsearch, Kinesis, S3, EC2, RDS, MySQL, PostgreSQL, Aurora, and CloudWatch. Managed Control-M, Spotfire, and SnapLogic ETL.
    • Supported the wafer image defect model machine learning platform.
    • Worked with Slack, Hadoop, Hive, Impala, Python, NumPy, SciPy, SVM, SVD, GitHub, Bitbucket, Jenkins, Tidal, Java, JIRA, wiki, and Confluence.
    Technologies: AWS, Redshift, Python, S3, EC2, Postgres, Bash, Elasticsearch
  • Lead Data Engineer

    2015 - 2017
    ModCloth
    • Developed and maintained data engineering, data analytics, 25 ETL pipelines, and the data warehouse for an online shopping eCommerce business as the only available data engineer.
    • Constructed and managed data integrations with Salesforce Commerce Cloud (aka Demandware), Square POS, eCommerce replication with Percona FlexCDC, Adobe Omniture Marketing Cloud, Oracle Responsys, ScientiaMobile WURFL, Qualtrics, Zodiac, ShopKeep, Acuity, and RetailNext.
    • Developed data pipelines with various vendors using GitHub, Python, C/C++, Java, REST API, JSON, XML, CSV, TSV, JIRA, and Slack.
    • Designed the migration to Azure, covering Azure SQL Data Warehouse, Blob Storage, and Linux VMs.
    Technologies: AWS, Redshift, S3, EC2, Bash, Python, MySQL, Postgres
  • Software System Engineer

    2002 - 2015
    Charles Schwab
    • Built a new portfolio accounting system on Linux as the very first engineer.
    • Led the SPARKS team and built the Cost Basis Accounting System and the Reporting Repository Data Warehouse.
    • Built and supported Eagle Investment Systems STAR and PACE products.
    • Supported and migrated mainframe-based systems to RedHat Linux/Solaris VMware servers and 100TB+ scale Oracle 9/10/11/12 RAC/TAF/EMC/HDS-based DataGuard/GoldenGate environments.
    • Developed and supported partitioning, parallel processing, ESP scheduling, high availability/failover, disaster recovery, Tivoli monitoring, Splunk, and Zenoss.
    • Implemented and supported both development and production OLTP, OLAP, ETL, distributed messaging (MQ), iPlanet/Apache, application server, Oracle 9/10/11/12 RAC databases, and DataGuard.
    • Built and supported multiple TB-scale development and performance/volume/stress testing environments.
    • Developed systems and applications with Java, Perl, Shell, Python, SQL, PL/SQL, and XML.
    • Trained the team on SQL and RDBMS, MySQL/SQL Server, and the Data-Driven Documents (D3) library.
    Technologies: Linux, Oracle, RedHat, Perl, Bash, SQL

Experience

  • Enterprise Sales AI Data Engineering (Development)

    Sales AI campaign data science required data pipelines and machine learning models. Major sources were external REST APIs, webhooks, and S3, along with Snowflake, GCP, Hadoop, and Hive. Contact was primarily by email and phone. Technologies: Hadoop, Spark, PySpark, Hive, Google Cloud Platform, and Snowflake.

    The pipeline includes eight tasks: data extraction and ingestion, data deduplication, data transformation, data incremental load, data filtering, offer data generation, offer motion data generation, and data enrichment. Tools included GitHub, JSON, XML, JIRA, wiki, and Confluence. Completed the full Hadoop-to-Snowflake migration single-handedly. Mentored junior engineers.

  • eCommerce Data Pipeline Migration (Development)

    Developed and supported 25+ eCommerce product, transaction, KPI, marketing, merchandising, planning, finance, fraud detection, LTV, Square, REST API, and fulfillment data pipelines single-handedly. Migrated in-house transaction and ETL pipelines (AWS Redshift, MySQL, MongoDB, Postgres, EC2, S3, AWS Database Migration Service) to Salesforce Commerce Cloud and Azure. Built Tableau Server and Tableau Desktop dashboards. Built pipelines for 20+ marketing partners including Google Analytics, Adobe Omniture, Oracle Responsys, and Salesforce Marketing Cloud. Tools included Python, Bash, JSON, XML, JIRA, and Slack.

  • Brokerage Portfolio Accounting System (Development)

    Built the first Linux- and Oracle-based portfolio accounting system for the largest brokerage firm on the West Coast, with 16 million customers.

  • Enterprise Data Management (Development)

    Built wafer data ETL pipelines from wafer factories all over the world for the world's largest storage company. Technologies: AWS Redshift, EC2, S3, Elasticsearch, JSON, Python, Teradata, SQL Server, Oracle, and Control-M. Re-created all the Redshift tables to improve performance.

  • Self-Driving Analytics (Development)

    Built data pipelines for a self-driving car company's fleet management system with real-time heartbeats, analytics dashboards, and products. Technologies: AWS Redshift, EC2, MySQL, MongoDB, AWS DMS, Python, JSON, and Tableau.

Skills

  • Languages

    SQL, XML, Python, Java, C, C++, Perl, Bash
  • Frameworks

    Hadoop, AWS EMR
  • Tools

    Cisco Tidal Enterprise Scheduler, Tableau, Jira, Slack, GitHub
  • Storage

    AWS S3, Apache Hive, Elasticsearch, MySQL, PostgreSQL, AWS RDS, JSON, Redshift
  • Other

    Software Development, Tableau Server, CSV
  • Libraries/APIs

    PySpark, REST APIs, NumPy, SciPy
  • Platforms

    Linux, AWS EC2, Oracle, Talend

Education

  • Master of Science degree in Computer Science
    1986 - 1987
    Indiana University - Bloomington, Indiana
  • Bachelor of Science degree in Engineering
    1978 - 1982
    National Taiwan University - Taipei, Taiwan
