Senior Data Engineer, 2020 - 2021, Netlify
Technologies: Data Build Tool (dbt), Spark, Databricks, SQL, Python, Scala, Ansible, Terraform, Big Data, AWS, Redshift, BigQuery, Google Cloud Platform (GCP), DigitalOcean, ETL, ELT, Apache Airflow, Fivetran, PySpark, Business Intelligence (BI), Amazon Web Services (AWS), Data Migration, Salesforce API, Salesforce
- Maintained more than 60 business-critical data pipelines that together handled 2TB of data a day.
- Migrated the company's data process from ETL to ELT using dbt and Spark on Databricks. This reduced the number of programming languages in use from four to two, made debugging easier, and made the data pipelines idempotent.
- Introduced and productionized Apache Airflow for data workflow orchestration, using it to kick off extractions and loads that had previously been tedious and difficult to monitor.
- Replaced error-prone data extractions for some data sources with Fivetran. Once Fivetran was set up, introduced several new datasets and integrated them into the data warehouse for analytical use.
- Authored and maintained big data compute jobs using PySpark and Spark SQL on Databricks, building out a data lakehouse in the process.
- Secured data warehouse clusters by automating the creation of credentials through Ansible and Terraform.
- Built a cost-of-goods data model that consolidated source billing data from major cloud providers into one conformed model to answer questions about the company's cloud spending.
Data Engineer Curriculum Writer (Subject Matter Expert), 2020 - 2020, Thinkful
Technologies: Apache Airflow, Linux, Redshift, SQL, Bash, Python, Data Modeling, Amazon Web Services (AWS)
- Designed a 6-month curriculum plan for teaching data engineers.
- Wrote core modules and led meetings to brainstorm the best topics to cover.
- Created hands-on labs for students to go through to gain practical experience in the role of a data engineer.
Senior Data Engineer Contractor, 2020 - 2020, Offer Up
Technologies: Apache Airflow, BigQuery, Snowflake, AWS S3, Google Cloud Storage, Data Build Tool (dbt), Python, SQL, Fivetran, Amazon Web Services (AWS), Data Migration
- Migrated dozens of data pipelines from Apache Airflow to Google Cloud Composer in a very short time span.
- Assisted in a large-scale data warehouse migration from Snowflake to BigQuery, moving a total of 1PB (petabyte) of data from AWS to Google Cloud.
- Reconciled data records in old and new data warehouses to ensure that the data pipelines worked as expected.
- Rewrote Snowflake-specific analytical SQL queries in BigQuery's syntax, which involved lateral flattens and other complicated joins on nested and semi-structured data.
Data Engineer, 2019 - 2020, Brushfire
Technologies: Redshift, AWS, Azure, Kubernetes, SQL Server DBA, SQL, C#, .NET, .NET Core, Azure DevOps, Data Pipelines, AWS Data Pipeline Service, Stripe, ETL, CI/CD Pipelines, Azure Kubernetes Service (AKS), Tableau, Python, Azure DevOps Services, Amazon Web Services (AWS), Web Scraping, T-SQL, Data Migration
- Built out a new data warehouse in AWS for analytical and data mining workloads (the first data warehouse of the company).
- Implemented business intelligence and data visualizations for the company and client KPIs.
- Built repeatable, scalable ETL data pipelines in the cloud.
- Identified data performance issues in queries, indexes, and data models.
- Migrated from .NET to .NET Core, then transitioned the entire web stack from Azure VMs to Kubernetes (complete with cluster and pod autoscaling and CI/CD with Azure DevOps).
Principal Big Data Consultant, 2014 - 2019, zData, Inc
Technologies: Hadoop, Spark, Apache ZooKeeper, Apache Hive, HBase, Apache Ambari, Data Lakes, Big Data Architecture, Data Lake Design, Greenplum, Redshift, Apache Kafka, AWS EC2, Linux, Ansible, Bash, AWS CloudFormation, Terraform, AWS S3, Hortonworks Data Platform (HDP), Cloudera, EMR, Autoscaling, Couchbase, Amazon SQS, AWS DynamoDB, Kerberos, Java, Python, Elixir, MySQL, Amazon Web Services (AWS), Web Scraping, Data Migration, Salesforce API
- Architected big data and cloud solutions in a cloud-based enterprise Linux environment utilizing compute resources such as EC2, S3, AWS Auto Scaling, Lambda functions, DynamoDB, Couchbase, SQS, and Elastic MapReduce (EMR).
- Automated repeatable deployments of big data software using Ansible, AWS CloudFormation, and Terraform.
- Secured distributed clusters via security groups, firewalls, and authorization and authentication policies. Kerberized Hadoop and other distributed clusters for strong authentication.
- Led various software development projects in back-end web APIs and cloud-based web applications in Java, Python, and Elixir.
- Architected distributed, fault-tolerant, and highly available systems using on-premises or cloud-based hardware.
- Built a custom Apache Ambari stack containing installation and management capabilities for Pivotal Greenplum, Pivotal HAWQ, and Chorus in Python.
- Developed a parallel backup and restore solution for large compute clusters in the AWS cloud in Java.
DevOps Engineer, 2014 - 2014, Melaleuca
- Built an internal tool to dynamically discover WCF endpoints and make async requests to check whether they were online. The tool was used to warm up the services to trigger JIT compilation and to actively monitor which endpoints were down.
- Wrote a custom DevOps dashboard that the entire team used to monitor how deployments were going and the health of the cloud infrastructure.
- Automated several complicated workflows in SharePoint to reduce the number of repetitive tasks for the team.
Firmware Engineer | Web Developer | Network Operations Manager | Senior Support Technician, 2007 - 2013, Linora
- Wrote the back-end and front-end code for router control panels and for "MeshView," a cloud-based (before the cloud) management portal for WiFi networks.
- Managed a small team of network operators and support technicians to take inbound level I and level II calls; was responsible for hiring, firing, teaching, and scheduling the team.
- Wrote firmware for WiFiRanger and BlueMesh Networks products; worked with kernel modules, internal and external radios, USB modem connectivity, control of remote radios over Ethernet, and failover and failback logic.
- Assisted with the management and monitoring of more than 10,000 IoT devices and routers in the field, and with the software used to monitor them.