ML/Data Engineer, 2020 - PRESENT
Technologies: Spark, Google Cloud Platform (GCP), Databricks, Google BigQuery
- Helped customers migrate their data pipelines from on-premises infrastructure to Google Cloud Platform.
- Migrated ETL pipelines from AWS and Azure to Google Cloud.
- Collaborated with data scientists to develop machine learning operations (MLOps) workflows based on trained models.
Data/ML Engineer, 2019 - 2020, Databricks
Technologies: Azure, Scala, Python, AWS, SQL, Spark
- Developed an application that stores and tracks changes to the hyperparameters used in model training and to the data used to train the models. The application saves model metadata and exposes it through API calls.
- Built an optical character recognition pipeline that converted images into tabular data.
- Improved query performance on a 75 TB data lake table. Reports pulled from this table had an SLA of 30 seconds; by applying Spark performance tuning techniques, I reduced the query time to under five seconds.
Senior Data Engineer, 2017 - 2018, Copart
Technologies: Azure, Apache Kafka, Pentaho, Python, SQL
- Developed a real-time data pipeline that transformed application logs into a more consumable form for reporting.
- Built a global data warehouse to serve as a single source of truth for company-wide open operational metrics.
- Migrated the company's ETL architecture to the cloud.
Software Developer, 2015 - 2018, Brocks Solution
Technologies: Azure, DataWare, SQL, Python
- Developed a real-time data pipeline that streamed data from IoT devices (bag tag scanners) at airports to produce baggage handling reports for business executives.
- Led the implementation of analytics in the company's enterprise baggage handling system software.
- Created dashboards to report data on baggage handling operations.