Senior Data Engineer
2020 - PRESENT
Amazon
- Worked as a member of the Workforce Staffing data engineering team. Developed ETL pipelines, data mappings, data models, a data lake, and data flows to fill labor orders.
- Developed Airflow DAGs, including tasks, operators, and connections, with Python and SQL.
- Worked with business intelligence engineers and a data analyst to create dashboards.
Technologies: Amazon Web Services (AWS), Amazon EC2, Amazon S3 (AWS S3), Redshift, CI/CD Pipelines, Python 3, Apache Airflow, Data Build Tool (dbt), SQL

Senior Data Engineer
2020 - 2021
PepsiCo
- Worked on the eCommerce ROI data engineering team. Developed ETL pipelines, data mapping, modeling, and data flow for 20+ advertising media sources, including Nielsen, Google, Amazon, Facebook, Twitter, OMD, and more.
- Developed an Airflow DAG, with its tasks, operators, and connection variables, that loads data from AWS S3 into Snowflake.
- Developed Data Vault schemas and tables in Snowflake. Supported Snowflake databases, roles, warehouses, schemas, and tables.
Technologies: Apache Airflow, Snowflake, Amazon S3 (AWS S3), Docker Compose, Kubernetes Operations (Kops), Data Build Tool (dbt), Argo CD, Jira, GitHub, Python 3, SQL

Data Engineer
2019 - 2020
Cyngn
- Created analytics, data pipelines, and ETL for a self-driving car fleet management system.
- Used AWS Redshift, EC2, S3, Python, Database Migration Service, and MongoDB.
- Developed data pipelines and data flows for vehicle heartbeats and weather API data that fed Tableau analytics dashboards.
Technologies: Amazon Web Services (AWS), REST APIs, Jira, Tableau, Python, MySQL, MongoDB, AWS Database Migration Service (DMS), Amazon S3 (AWS S3), Amazon EC2, Redshift, SQL

Data Engineer
2019 - 2020
Cisco
- Developed a B2B customer contact hub dataset. Supported machine learning, AI, software renewal, NPS survey, and sales campaign automation.
- Built a data pipeline framework, guidelines, production procedures, data architecture, and code review process. Led and educated junior Python developers.
- Developed an internal Salesforce contact dataset and synced it with an external Salesforce object.
- Migrated the data foundation from Hadoop to Snowflake, GCP BigQuery, GCE, GCS, Airflow, and Cloud Gateway Server.
Technologies: Jira, GitHub, BigQuery, Google Cloud Storage, Google Compute Engine (GCE), Google Cloud Platform (GCP), Snowflake, JSON, Apache Hive, Hadoop, Spark SQL, PySpark, Python, Apache Airflow, Salesforce, SQL

Big Data Engineer
2017 - 2018
Western Digital
- Developed and supported big data engineering for enterprise data management, covering image and data ETL pipelines for worldwide head and drive wafer fab production.
- Rebuilt, managed, and tuned large production enterprise data management AWS Redshift clusters to support large-volume pipelines and user queries.
- Supported AWS Redshift, Redshift Spectrum, Elasticsearch, Kinesis, S3, EC2, RDS, MySQL, PostgreSQL, Aurora, and CloudWatch. Managed Control-M, Spotfire, and SnapLogic ETL.
- Supported the machine learning platform for the wafer image defect model.
- Worked with Slack, Hadoop, Hive, Impala, Python, NumPy, SciPy, SVM, SVD, GitHub, Bitbucket, Jenkins, Tidal, Java, Jira, Wiki, and Confluence.
Technologies: Amazon Web Services (AWS), Elasticsearch, Bash, PostgreSQL, Amazon EC2, Amazon S3 (AWS S3), Python, Redshift, SQL

Lead Data Engineer
2015 - 2017
ModCloth
- Developed and maintained data engineering, data analytics, 25 ETL pipelines, and a data warehouse for online shopping eCommerce as the only available data engineer.
- Constructed and managed Salesforce eCommerce Cloud (aka Demandware), Square POS, eCommerce replication Percona FlexCDC, Adobe Omniture Marketing Cloud, Oracle Responsys, ScientiaMobile WURFL, Qualtrics, Zodiac, ShopKeep, Acuity, and RetailNext.
- Developed data pipelines with various vendors using GitHub, Python, C/C++, Java, REST API, JSON, XML, CSV, TSV, Jira, and Slack.
- Designed an Azure migration covering Azure SQL Data Warehouse, Blob Storage, and Linux VMs.
Technologies: Amazon Web Services (AWS), PostgreSQL, MySQL, Python, Bash, Amazon EC2, Amazon S3 (AWS S3), Redshift, SQL

Software System Engineer
2002 - 2015
Charles Schwab
- Built a new portfolio accounting system on Linux as the very first engineer.
- Led the sparks team and built a cost basis accounting system and a reporting repository data warehouse.
- Built and supported Eagle Investment Systems STAR and PACE products.
- Supported and migrated the mainframe-based system to Red Hat Linux/Solaris VMware servers and 100TB+ scale Oracle 9/10/11/12 RAC/TAF/EMC/HDS-based Data Guard/GoldenGate environments.
- Developed and supported partitioning, parallel processing, ESP scheduling, high availability/failover, disaster recovery, Tivoli monitoring, Splunk, and Zenoss.
- Implemented and supported development and production environments for OLTP, OLAP, ETL, distributed messaging (MQ), iPlanet/Apache, application servers, Oracle 9/10/11/12 RAC databases, and Data Guard.
- Built and supported multiple TB-scale development and performance/volume/stress testing environments.
- Developed systems and applications in Java, Perl, shell, Python, SQL, PL/SQL, and XML.
- Trained the team on SQL and RDBMS, MySQL/SQL Server, and a data-driven documents library.
Technologies: SQL, Bash, Perl, Red Hat Linux, Oracle, Linux