Data Engineer, 2019 - Present, Cisco
Technologies: Python, PySpark, Spark SQL, Hadoop, Hive, JSON, Google Cloud Platform, GitHub, Tidal, JIRA
- Supported machine learning, AI, and sales campaign automation.
- Built data pipeline framework, guidelines, production procedures, data architecture, and code review process.
- Led junior Python and PySpark developers.
Big Data Engineer, 2017 - 2018, Western Digital
Technologies: AWS, Redshift, Python, S3, EC2, Postgres, Bash, Elasticsearch
- Developed and supported big data engineering for Enterprise Data Management, including ETL pipelines for worldwide head and drive wafer fab production images and data.
- Rebuilt, managed, and tuned large production Enterprise Data Management AWS Redshift clusters to support high-volume pipelines and user queries.
- Supported AWS Redshift, Redshift Spectrum, Elasticsearch, Kinesis, S3, EC2, RDS, MySQL, PostgreSQL, Aurora, and CloudWatch. Managed Control-M, Spotfire, and SnapLogic ETL.
- Supported the machine learning platform for wafer image defect models.
- Worked with Slack, Hadoop, Hive, Impala, Python, NumPy, SciPy, SVM, SVD, GitHub, Bitbucket, Jenkins, Tidal, Java, JIRA, wiki, and Confluence.
Lead Data Engineer, 2015 - 2017, ModCloth
Technologies: AWS, Redshift, S3, EC2, Bash, Python, MySQL, Postgres
- Developed and maintained data engineering, data analytics, 25 ETL pipelines, and the data warehouse for an online shopping e-commerce business as the sole data engineer.
- Constructed and managed Salesforce E-commerce Cloud (aka Demandware), Square POS, E-commerce replication with Percona FlexCDC, Adobe Omniture Marketing Cloud, Oracle Responsys, ScientiaMobile WURFL, Qualtrics, Zodiac, ShopKeep, Acuity, and RetailNext.
- Developed data pipelines with various vendors using GitHub, Python, C/C++, Java, REST API, JSON, XML, CSV, TSV, JIRA, and Slack.
- Designed an Azure migration covering Azure SQL Data Warehouse, Blob Storage, and Linux VMs.
Software System Engineer, 2002 - 2015, Charles Schwab
Technologies: Linux, Oracle, RedHat, Perl, Bash, SQL
- Built a new portfolio accounting system on Linux as its first engineer.
- Led the SPARKS team and built the Cost Basis Accounting System and the Reporting Repository Data Warehouse.
- Built and supported Eagle Investment Systems STAR and PACE products.
- Supported and migrated a mainframe-based system to RedHat Linux/Solaris VMware servers and 100TB+ scale Oracle 9/10/11/12 RAC/TAF/EMC/HDS-based DataGuard/GoldenGate environments.
- Developed and supported partitioning, parallel processing, ESP scheduling, high availability/failover, disaster recovery, Tivoli monitoring, Splunk, and Zenoss.
- Implemented and supported both development and production OLTP, OLAP, ETL, distributed messaging (MQ), iPlanet/Apache, application server, Oracle 9/10/11/12 RAC database, and DataGuard environments.
- Built and supported multiple terabyte-scale development and performance/volume/stress testing environments.
- Developed systems and applications in Java, Perl, shell, Python, SQL, PL/SQL, and XML.
- Trained the team on SQL and RDBMS concepts, MySQL/SQL Server, and the Data-Driven Documents (D3.js) library.