Festus Asare Yeboah
Verified Expert in Engineering
Data Engineer and Developer
Plano, TX, United States
Toptal member since May 14, 2020
Festus is a data and machine learning engineer with in-depth, hands-on technical expertise in data pipeline architecture. He excels at the design and implementation of big data technologies (Spark, Kafka, data lakes) and has a proven track record in consulting on architecture design and implementation.
Experience
- SQL - 6 years
- Data Warehouse Design - 5 years
- Azure - 4 years
- Python 3 - 4 years
- Data Engineering - 3 years
- Databricks - 3 years
- Machine Learning - 3 years
- Azure Data Factory - 2 years
Preferred Environment
Data Lakes, Data Warehouse Design, Data Warehousing, Machine Learning, Spark
The most amazing...
...thing I've built is a data engineering pipeline that streams data from IoT devices, such as airport bag scanners, to a data lake.
Work Experience
ML/Data Engineer
- Helped customers migrate their data pipelines from on-prem to the Google Cloud Platform.
- Migrated ETL pipelines from AWS and Azure to Google Cloud.
- Collaborated with data scientists to develop machine learning operations (MLOps) based on trained models.
Data/ML Engineer
Databricks
- Developed an app that stores and tracks changes to the hyperparameters and the data used to train models. The application saves model metadata and provides access to it through API calls.
- Built an optical character recognition pipeline that converted images to a table.
- Increased querying performance of a 75TB data lake table. The reports pulled from this table had an SLA of 30 seconds. By applying Spark performance tuning techniques, I decreased the query time to less than five seconds.
Senior Data Engineer
Copart
- Developed a real-time data pipeline to move application logs to a more consumable form for reporting.
- Built a global data warehouse to serve as a single source of truth for company-wide open operational metrics.
- Migrated the company's ETL architecture to the cloud.
Software Developer
Brocks Solution
- Developed a real-time data pipeline to stream data from IoT devices (bag tag scanners) at airports to create baggage handling reports for business executives.
- Led the implementation of analytics into the company's enterprise baggage handling system software.
- Created dashboards to report data on baggage handling operations.
Experience
Pipeline Medical Records into a Scalable Data Store
Using AWS Kinesis, Lambda, Airflow, and Databricks, I rearchitected the existing pipeline into a simpler, more scalable one. The pipeline's run time dropped from 30 minutes to two minutes.
Optimize Data Reads from a 75TB Data Lake
Meta Store for ML Model Training
I developed a library that saved all the model metadata to a data store and made it accessible through an API endpoint.
Education
Master's Degree in Machine Learning
Southern Methodist University - Dallas, TX, USA
Bachelor's Degree in Aerospace Engineering
Kwame Nkrumah University of Science and Technology - Kumasi, Ghana
Certifications
Spark Certification
Databricks
Skills
Tools
Amazon Elastic MapReduce (EMR), Apache Airflow, BigQuery
Languages
SQL, Python 3, Python, Scala
Frameworks
Spark
Platforms
Databricks, Apache Kafka, Azure, Azure Event Hubs, Pentaho, Amazon Web Services (AWS), Google Cloud Platform (GCP)
Paradigms
ETL
Storage
Data Lakes, DataWare
Other
Data Warehouse Design, Machine Learning, Data Engineering, Azure Data Factory, Lambda Functions, Data Warehousing, Google BigQuery