Kyle Chakos
Verified Expert in Engineering
Data Engineer and Developer
Madrid, Spain
Toptal member since February 28, 2023
Kyle has 10+ years of experience in data and machine learning engineering. He has worked at companies of various sizes, primarily startups, collaborating with teams that had almost no data infrastructure and helping them expand or update their architecture into something scalable. With a background in mathematics and engineering, Kyle is also well positioned to help data scientists get their projects into production in a scalable way while ensuring accuracy and effectiveness.
Experience
- SQL - 10 years
- Data Engineering - 10 years
- Python - 10 years
- Amazon Web Services (AWS) - 10 years
- Data Analysis - 10 years
- Data Pipelines - 10 years
- Apache Airflow - 5 years
- Snowflake - 4 years
Preferred Environment
Amazon Web Services (AWS), Python
The most amazing...
...thing I've accomplished was a 1,000x speedup of a machine learning model by replacing a built-in Pandas method with fast Fourier transforms.
Work Experience
Lead Data Engineer
RatedPower
- Wrote a comprehensive data roadmap for the team, including upgrading the current system to use more modern tools, designing more useful dashboards for the teams, and redesigning the event tracking system.
- Created a proper analytics database for the company to use.
- Managed an extension to an existing data product, extending the product reach outside of the United States and into Europe.
- Implemented data protection practices in line with GDPR.
- Processed large GIS datasets to calculate wind and solar power potential.
- Evaluated the existing data infrastructure and provided feedback about how to improve the architecture and follow best practices.
Snowflake Data Engineer
Appex Group, Inc.
- Rewrote code to handle errors more robustly while simultaneously reducing the complexity of the codebase.
- Upgraded the Airflow instances to integrate with AWS more seamlessly.
- Wrote new ingestion pipelines and worked with analysts to ensure the data suited their needs.
- Worked with their marketing team to ingest new data sources and help process new data into existing metrics.
Senior Data Engineer
Sweetgreen
- Automated data ingestion from various sources with Airflow, Amazon EMR, AWS Kinesis, AWS Lambda, Python, and Snowflake.
- Rearchitected historical data pipelines to utilize more modern methods and provide proper alerting, moving from Java, Scala, and Redshift to Python, Airflow, and Snowflake.
- Managed a team of consultants to complete the automation and redesign of our CCPA data pipeline.
- Monitored, maintained, and designed data infrastructure in AWS S3, EC2, EMR, and ECR.
- Assisted in data discovery and implementation of machine learning algorithms.
Senior Software Engineer, Database
Ticketmaster
- Automated the quality testing of newly trained models using Scala and Python.
- Created a framework to launch models into a production environment with Java, Kafka, and AWS.
- Architected and implemented feedback loops in Python to reduce reliance on third-party dependencies.
- Designed and programmed tooling to give visibility into the model output using Python, AWS, and Slack.
Data Engineer
Creative Artists Agency
- Designed and implemented ETL processes using Python, Azure Data Factory, MongoDB, and MySQL.
- Converted bulk processing systems to a streaming model using Python.
- Created various views in MySQL for data scientists and business analysts.
- Launched data science models into production and assisted in identifying and debugging errors in R.
Data Engineer
Glo
- Developed personalized recommendations with machine learning using Flask and Python.
- Collaborated with business analysts in researching KPIs for user retention using Redshift and Python.
- Managed and monitored releases to production with Rancher, New Relic, Scalr, and AWS.
- Architected and managed tables and ETL processes in Redshift, PostgreSQL, MySQL, and Airflow.
Data Engineer
UberMedia
- Analyzed data sets for relevant trends and potential to increase profit.
- Sorted users into audiences based on application usage.
- Extrapolated application and audience association based on data collected from social media.
- Improved a machine learning bidding system by enhancing runtime and click accuracy.
Experience
Senior Capstone Project
We accomplished this by using a Gaussian mixture model to identify the rock and statistically comparing the identified cluster to other similar clusters to provide better recommendations. All of the code for this project was written in Python.
Fraud Detection
I was in charge of setting up and architecting the back end of this service, which primarily used Kafka and Java to ensure everything ran quickly. Our machine learning models were deployed using Amazon SageMaker.
Automation of CCPA Deletion and Access Pipeline
Education
Bachelor's Degree in Mathematics
Harvey Mudd College - Claremont, CA, USA
Skills
Tools
Apache Airflow, CircleCI, Kafka Streams, Amazon SageMaker, Terraform, Amazon Elastic MapReduce (EMR), Microsoft Power BI, Prefect, Jira
Languages
Python, SQL, Snowflake, Java, JavaScript, Scala, R
Paradigms
ETL, Agile, Business Intelligence (BI)
Platforms
Amazon Web Services (AWS), Docker, AWS Lambda, Apache Kafka, Azure, Databricks
Storage
PostgreSQL, MySQL, Redshift, Amazon S3 (AWS S3), Data Pipelines, Database Migration, MongoDB, Amazon DynamoDB, Amazon Aurora
Frameworks
Spark
Other
California Consumer Privacy Act (CCPA), Data Analysis, Data Engineering, Amazon RDS, CI/CD Pipelines, Amazon Redshift, Statistical Analysis, Statistical Modeling, Machine Learning, EMR, Data Build Tool (dbt), Data Warehousing, Query Optimization, Data Warehouse Design, APIs, GDPR