Verified Expert in Engineering
Data Engineer and Developer
Kyle has 10+ years of experience in data and machine learning engineering. He has worked at companies of various sizes, but primarily startups, collaborating with teams with almost no data infrastructure and helping them expand or update their architecture into something scalable. With a background in mathematics and engineering, Kyle is also uniquely set up to help data scientists get their projects into production in a scalable way while ensuring accuracy and effectiveness.
Amazon Web Services (AWS), Python
The most amazing thing I've accomplished was a 1,000x speedup of a machine learning model by using fast Fourier transforms in place of the built-in Pandas method.
Senior Data Engineer
- Automated data ingestion from various sources with Airflow, Amazon EMR, AWS Kinesis, AWS Lambda, Python, and Snowflake.
- Rearchitected historical data pipelines to utilize more modern methods and provide proper alerting, moving from Java, Scala, and Redshift to Python, Airflow, and Snowflake.
- Managed a team of consultants to complete the automation and redesign of our CCPA data pipeline.
- Monitored, maintained, and designed data infrastructure in AWS S3, EC2, EMR, and ECR.
- Assisted in data discovery and implementation of machine learning algorithms.
Senior Software Engineer, Database
- Automated the quality testing of newly trained models using Scala and Python.
- Created a framework to launch models into a production environment with Java, Kafka, and AWS.
- Architected and implemented feedback loops to relieve third-party dependencies with Python.
- Designed and programmed tooling to give visibility into the model output using Python, AWS, and Slack.
Creative Artists Agency
- Designed and implemented ETL processes using Python, Azure Data Factory, MongoDB, and MySQL.
- Converted bulk processing systems to a streaming model using Python.
- Created various views in MySQL for data scientists and business analysts.
- Launched data science models into production and assisted in identifying and debugging errors in R.
- Developed personalized recommendations with machine learning using Flask and Python.
- Collaborated with business analysts in researching KPIs for user retention using Redshift and Python.
- Managed and monitored releases to production with Rancher, New Relic, Scalr, and AWS.
- Architected and managed tables and ETL processes in Redshift, PostgreSQL, MySQL, and Airflow.
- Analyzed data sets for relevant trends and potential to increase profit.
- Sorted users into audiences based on application usage.
- Extrapolated application and audience association based on data collected from social media.
- Improved the runtime and click accuracy of a machine learning bidding system.
Senior Capstone Project
We accomplished this by using a Gaussian mixture model to identify the rock and statistically comparing the identified cluster to other similar clusters to provide better recommendations. All of the code for this project was written in Python.
I was in charge of setting up and architecting the back end of this service, which primarily utilized Kafka and Java to ensure everything ran quickly. Our machine-learning models were deployed using Amazon SageMaker.
Automation of CCPA Deletion and Access Pipeline
Apache Airflow, CircleCI, Kafka Streams, Amazon SageMaker
Amazon Web Services (AWS), Docker, AWS Lambda, Apache Kafka, Azure
PostgreSQL, MySQL, Redshift, Amazon S3 (AWS S3), Data Pipelines, MongoDB
California Consumer Privacy Act (CCPA), Data Analysis, Data Engineering, CI/CD Pipelines, Statistical Analysis, Statistical Modeling, Machine Learning, EMR, Data Build Tool (dbt)
Bachelor's Degree in Mathematics
Harvey Mudd College - Claremont, CA, USA