Data Consultant | 2019 - Present | 5S Technology
Technologies: Python, AWS, SQL, dbt (Data Build Tool), GitLab CI/CD, Kubernetes, Docker, Amazon EKS, Snowflake, Ubuntu, Data Science, PostgreSQL, Data Analysis, Amazon S3 (AWS S3), Amazon EC2 (Amazon Elastic Compute Cloud), ETL, AWS RDS, APIs, Data Pipelines, Analytics, Data Engineering, Kimball Methodology, Data Warehouse Design, Bash, Dimensional Modeling, Amazon Web Services (AWS)
- Developed algorithms to identify contract violations for airline unions via historical scheduling data. Translated violation decision trees to SQL queries and prototyped reroute identification model using keyword search.
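A keyword-search reroute prototype like the one above can be sketched in a few lines of Python; the keyword list and the free-text remark fields here are hypothetical, not the actual airline scheduling schema:

```python
import re

# Hypothetical keywords that flag a reroute in free-text schedule remarks.
REROUTE_KEYWORDS = {"reroute", "rerouted", "reassign", "deadhead", "swap"}

def is_reroute(remark: str) -> bool:
    """Return True if any reroute keyword appears as a whole word."""
    tokens = set(re.findall(r"[a-z]+", remark.lower()))
    return not REROUTE_KEYWORDS.isdisjoint(tokens)

remarks = [
    "Crew rerouted via ORD due to weather",
    "On-time departure, no changes",
]
flags = [is_reroute(r) for r in remarks]  # [True, False]
```

Whole-word tokenization (rather than substring matching) keeps a keyword like "swap" from firing on unrelated words.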
- Deployed and maintained Argo workflow engine on EKS. Developed a database schema for analytics warehouse using DBT and deployed in Snowflake.
- Designed a CI/CD system for GitLab using Dockerized CLIs of pipeline tools and coached team members on usage.
Machine Learning Engineer | 2020 - 2021 | Twosense
Technologies: Python, Data Science, PostgreSQL, Data Analysis, Amazon S3 (AWS S3), Amazon EC2 (Amazon Elastic Compute Cloud), ETL, AWS RDS, Machine Learning, Data Pipelines, Analytics, Deep Learning, Data Engineering, Bash, Amazon Web Services (AWS)
- Adapted an open-source tracking library to run and collect metrics on user-level and overall model performance via simplified API. Deployed a tracking server and web application using Docker on AWS.
- Refined model deployment scripts in Python. Unified file loading in a separate module to improve code readability.
- Developed a system using Python to re-evaluate production models upon retraining, enabling the comparison of model scores using the same test data set. Conducted simulations to prove ROI on the project in terms of improved model scores.
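The re-evaluation step in the last bullet amounts to scoring the incumbent and the retrained model on the same held-out test set and promoting only on improvement. A minimal sketch of that champion/challenger comparison (the toy models and the accuracy metric are illustrative, not the production system):

```python
from typing import Callable, Sequence

def accuracy(model: Callable, X: Sequence, y: Sequence) -> float:
    """Fraction of test examples the model labels correctly."""
    preds = [model(x) for x in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def should_promote(incumbent, challenger, X_test, y_test, margin=0.0):
    """Promote the retrained model only if it beats the incumbent
    on the exact same test set, optionally by a required margin."""
    return accuracy(challenger, X_test, y_test) > accuracy(incumbent, X_test, y_test) + margin

# Toy example: threshold classifiers on 1-D inputs.
old_model = lambda x: int(x > 0.7)
new_model = lambda x: int(x > 0.5)
X, y = [0.2, 0.6, 0.9], [0, 1, 1]
promote = should_promote(old_model, new_model, X, y)  # True: 3/3 vs 2/3
```

Holding the test set fixed is what makes the two scores directly comparable, which is the point of the re-evaluation system.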
Machine Learning Engineer | 2019 | Simon Data
Technologies: Python, SQL, Data Science, Data Analysis, Amazon Athena, Amazon S3 (AWS S3), Amazon EC2 (Amazon Elastic Compute Cloud), ETL, AWS RDS, Machine Learning, Data Pipelines, Analytics, Data Engineering, Bash, Amazon Web Services (AWS)
- Built a prototype for a client to automatically generate email segments based on product inventory, replacing a manual process that took multiple people hours per week. Implemented the solution on the Django platform.
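Segment generation of this kind reduces to grouping in-stock inventory rows by an attribute and emitting one email segment per group; a stdlib-only sketch (the field names are invented for illustration, not the client's schema):

```python
from collections import defaultdict

def build_segments(inventory, min_stock=1):
    """Group in-stock SKUs by category; each category becomes an
    email segment listing the products to feature."""
    segments = defaultdict(list)
    for item in inventory:
        if item["stock"] >= min_stock:
            segments[item["category"]].append(item["sku"])
    return dict(segments)

inventory = [
    {"sku": "A1", "category": "shoes", "stock": 5},
    {"sku": "B2", "category": "shoes", "stock": 0},
    {"sku": "C3", "category": "hats", "stock": 2},
]
segments = build_segments(inventory)  # {'shoes': ['A1'], 'hats': ['C3']}
```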
- Served as team lead for four data scientists. Coached team on best practices around Python testing and deployment.
- Led an effort to simplify a client's manual reporting process, including making SQL queries more performant and automating report delivery.
Data Scientist | 2017 - 2019 | Optoro
Technologies: Python, SQL, Data Science, PostgreSQL, Data Analysis, Amazon S3 (AWS S3), Amazon EC2 (Amazon Elastic Compute Cloud), ETL, Machine Learning, APIs, Data Pipelines, Analytics, Apache Airflow, Data Engineering, Bash, Amazon Web Services (AWS)
- Embedded in the tech product team and built models to support the core dispositioning system, aiming to maximize recovery on returned and excess inventory. Deployed XGBoost models via Python APIs.
- Developed a system to monitor and retrain models using Python, SQL, and Airflow.
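The monitor-and-retrain loop in the bullet above boils down to comparing a rolling production metric against the score the model had at deployment and triggering retraining on drift. A framework-free sketch of that check (the tolerance and scores are illustrative; the real system orchestrated this with Airflow and SQL):

```python
def needs_retrain(deploy_score: float, recent_scores: list[float],
                  tolerance: float = 0.05) -> bool:
    """Trigger retraining when the rolling production score drops
    more than `tolerance` below the score at deployment."""
    if not recent_scores:
        return False  # no production data yet, nothing to compare
    rolling = sum(recent_scores) / len(recent_scores)
    return rolling < deploy_score - tolerance

# Deployed at 0.90 accuracy; last week's daily scores have drifted down.
trigger = needs_retrain(0.90, [0.86, 0.84, 0.83])  # True: mean 0.843 < 0.85
```

In an Airflow deployment this predicate would sit in one task, gating a downstream retraining task.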
- Led optimization of Airflow pipelines and education around best practices for the data science team.
Senior Data Analyst | 2016 - 2017 | Capital One Financial
Technologies: Python, SQL, Bash, Teradata, Dimensional Modeling, Luigi, Kimball Methodology, Hadoop, Data Warehouse Design, Amazon Web Services (AWS)
- Developed automated pipelines using shell scripting and Python's Luigi library to generate Excel reports, and worked with end users to redesign the reports so they could perform their tasks more efficiently.
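A Luigi-style pipeline stage is essentially "skip the step if its output already exists," which is what makes reruns cheap and idempotent. A stdlib-only sketch of that pattern, mirroring Luigi's Target.exists() convention (the CSV columns and paths are invented; the real reports were Excel):

```python
import csv
import os
import tempfile

def write_report(rows, path):
    """Idempotent report step: regenerate only if the output file is
    missing, the way Luigi skips a task whose target already exists."""
    if os.path.exists(path):
        return False  # already built; Luigi would not re-run this task
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["account", "balance"])
        writer.writerows(rows)
    return True

path = os.path.join(tempfile.mkdtemp(), "weekly_report.csv")
first = write_report([("A-100", 250.0)], path)   # True: file created
second = write_report([("A-100", 250.0)], path)  # False: skipped
```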
- Created a scraper to download hundreds of files weekly from a legacy web application, providing data without which my team would have failed an audit.
- Served as lead analyst for the AML operations team. Researched and developed queries to identify at-risk assets and worked with stakeholders to design dashboards to track progress. Mapped legacy data to new data sources such as Salesforce.