Adel Abu Hashim, Developer in Riyadh, Riyadh Province, Saudi Arabia

Adel Abu Hashim

Bio

Adel is a senior data engineer specializing in scalable data infrastructure that drives business impact. Over the past five years, he has architected solutions for major enterprises, freeing over 250 TB of storage, reducing costs by 20%, and cutting latency by 80%. He's an expert in big data (Hadoop, Spark, Hive, Trino), cloud platforms (AWS, GCP, Azure), and ETL pipelines (Airflow, Sqoop). Adel has a proven track record: 120+ projects, a 99% completion rate, and a top 3% ranking on an online freelance agency.

Portfolio

STC
Teradata, Apache Hive, SQL Stored Procedures, Apache Airflow, MinIO...
Verde Valley Resources, LLC
Airtable, Python, Relational Databases, Relational Database Design, ETL, Pandas...
Paymob
Python 3, SQL, Apache Airflow, AWS Database Migration Service (DMS), Redshift...

Experience

  • Windows - 10 years
  • Python 3 - 5 years
  • SQL - 5 years
  • Redshift - 3 years
  • ETL - 3 years
  • Pandas - 3 years
  • DataHub - 2 years
  • Apache Airflow - 2 years

Preferred Environment

SQL, Python, Apache Airflow, Pandas, Spark, Apache Hive, Cloudera, ETL, Big Data

The most amazing...

...migration I architected for Jawwy freed 6 times the storage using Sqoop with zero data loss and full automation.

Work Experience

Senior Data Engineer

2024 - PRESENT
STC
  • Offloaded Jawwy ODS to Hadoop using Sqoop, implementing retry mechanisms and validation to free 6X storage while ensuring data integrity.
  • Designed a modular Teradata deletion framework, integrating historical and incremental offloading pipelines, freeing 50+ TB of storage.
  • Automated the Deep Analysis phase for 100+ tables, raising completion from 20% to 80% through vertical profiling.
  • Integrated Trino with Teradata using TIBCO DV views, achieving an 80% reduction in query latency and preventing data duplication.
  • Analyzed C5 cluster access logs, archiving 200+ TB of cold data to MinIO to optimize storage performance.
  • Coordinated with STC on access management, efficiently handling 10+ critical requests to minimize downtime.
Technologies: Teradata, Apache Hive, SQL Stored Procedures, Apache Airflow, MinIO, Data Analysis, Big Data, Data Warehousing, Data Lakes, TIBCO, Trino, Python, Data Engineering, Data Science, Microsoft Excel, API Integration, Data Pipelines, Apache Kafka, JSON, Data & Backup Management, Azure Databricks, Data Migration, Git, Star Schema, Apache Superset, Reports, Databricks, GitHub, Amazon Redshift, Redshift, Data Mesh, Operational Data Store (ODS), Oracle, Talend, Apache Sqoop, Bash Script, Google BigQuery, Relational Databases, Amazon S3 (AWS S3), Apache Spark, Architecture, Relational Database Design, Data Marts, Data, ELT, Docker, Kubernetes, Key Performance Indicators (KPIs), CI/CD Pipelines, Database Management, Data Integration, Migration, Data Analytics, Stitch Data, Performance Optimization, PySpark, Snowflake, ETL Pipelines, CSV File Processing, Data Validation, Data Modeling, Large Data Sets, Query Optimization, Database Schema Design, Analytics
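
The retry-and-validate pattern mentioned in the first bullet above can be illustrated with a minimal sketch. This is not the framework used at STC: `offload_with_retry`, `run_offload`, and `expected_rows` are hypothetical names, and the validation is assumed to be a simple row-count comparison between source and target.

```python
import time
from typing import Callable


def offload_with_retry(
    run_offload: Callable[[], int],
    expected_rows: int,
    max_attempts: int = 3,
    base_delay: float = 1.0,
) -> int:
    """Run a (hypothetical) offload job until the loaded row count matches the source.

    run_offload is assumed to return the number of rows that landed in the target.
    Returns the attempt number that succeeded; raises RuntimeError if none did.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            loaded = run_offload()
            if loaded == expected_rows:
                return attempt  # validation passed
            last_error = ValueError(
                f"row count mismatch: expected {expected_rows}, got {loaded}"
            )
        except Exception as exc:  # job itself failed; remember why and retry
            last_error = exc
        if attempt < max_attempts:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"offload failed after {max_attempts} attempts") from last_error
```

In a real pipeline, `run_offload` would wrap the actual transfer job and `expected_rows` would come from a count query against the source table; both are left abstract here.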

Airtable Developer

2026 - 2026
Verde Valley Resources, LLC
  • Architected an end-to-end data pipeline replacing a legacy CSV-to-Airtable workflow with a production-grade PostgreSQL system, processing 39,000+ mineral rights records with full ownership history tracking.
  • Designed a normalized schema with five core tables and six semantic views paired with a 3-stage Python ETL and hash-based deduplication, eliminating duplicate records across large-scale mineral rights datasets.
  • Built a 24-assertion test suite and adaptive Python-based ID resolution, reducing pipeline failures from real-world data inconsistencies to zero while delivering a clean Airtable sync layer for business users.
Technologies: Airtable, Python, Relational Databases, Relational Database Design, ETL, Pandas, Airtable Automations, Database Management, Data Integration, Migration, Role-based Access Control (RBAC), ETL Pipelines, CSV File Processing, Data Validation, Data Modeling, Database Schema Design
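
As a rough illustration of the hash-based deduplication described above, the sketch below fingerprints each record by hashing a normalized set of key fields and keeps the first record per fingerprint. The field names and the normalization rules (trim and lowercase) are assumptions made for the example, not the project's actual schema or logic.

```python
import hashlib
from typing import Sequence


def record_hash(record: dict, key_fields: Sequence[str]) -> str:
    """Return a SHA-256 fingerprint of the record's normalized key fields."""
    # Assumed normalization: trim whitespace and lowercase each key value.
    parts = [str(record.get(field, "")).strip().lower() for field in key_fields]
    # Join with a separator unlikely to appear in the data, so that
    # ("ab", "c") and ("a", "bc") do not produce the same fingerprint.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def deduplicate(records: Sequence[dict], key_fields: Sequence[str]) -> list:
    """Keep the first record seen for each distinct fingerprint."""
    seen = set()
    unique = []
    for record in records:
        fingerprint = record_hash(record, key_fields)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


# Hypothetical sample rows; the first two differ only in case and whitespace.
rows = [
    {"owner": "Jane Doe ", "tract": "T-100", "source": "file_a.csv"},
    {"owner": "jane doe", "tract": "t-100", "source": "file_b.csv"},
    {"owner": "John Roe", "tract": "T-200", "source": "file_a.csv"},
]
unique_rows = deduplicate(rows, key_fields=("owner", "tract"))
```

Hashing only the key fields means records that differ in non-key columns (here, `source`) still collapse into one, which is usually the intent when the same entity arrives from several files.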

Data Engineer

2023 - 2025
Paymob
  • Migrated 5M+ records from Paxstore and Gonisight, enabling in-house geolocation and eliminating Google Maps dependency.
  • Architected ETL/Reverse ETL pipelines across 10+ projects, streamlining cross-system data flow.
  • Optimized Redshift clusters (Spark, Hive, S3, Presto), reducing infrastructure costs by 20% and doubling query speed.
  • Launched CleverTap campaigns with automated alert triggers, reaching 1,500+ POS users and decreasing support complaints by 25%.
Technologies: Python 3, SQL, Apache Airflow, AWS Database Migration Service (DMS), Redshift, ETL, Spark, Apache Hive, Amazon S3 (AWS S3), Presto, Amazon Web Services (AWS), PySpark, MDM, Cloud, Big Data, IT Service Management (ITSM), Python, Data Engineering, API Integration, Data Pipelines, Large Language Models (LLMs), JSON, Data Build Tool (dbt), Data & Backup Management, SQLite, Azure Data Factory (ADF), Azure, Microsoft Power BI, Data Migration, Business Intelligence (BI), Git, Star Schema, Dashboards, Apache Superset, Reports, Tableau, GitHub, Amazon Redshift, Data Mesh, AWS Transfer Family, Relational Databases, SOX, Product Lifecycle Management (PLM), Apache Spark, Architecture, Airtable, Relational Database Design, Data Marts, Stripe, Data, ELT, BigQuery, Docker, Kubernetes, Key Performance Indicators (KPIs), CI/CD Pipelines, Database Management, Migration, Data Analytics, Role-based Access Control (RBAC), ETL Pipelines, Microsoft Fabric, Microsoft Entra ID, CSV File Processing, Data Validation, Large Data Sets, Query Optimization, Database Schema Design, Analytics

Big Data Engineer

2022 - 2023
Etisalat Egypt
  • Centralized diverse metadata in DataHub, enhancing discoverability and integration for 100+ engineers.
  • Built Data Quality Engine with KPIs in Python, reducing data errors by 50%.
  • Automated metadata extraction from Informatica/DataStage, shortening migration time by 85% and enabling transition to Airflow/NiFi.
  • Developed scalable Python/SQL data pipelines, saving 100+ hours monthly in data preparation.
Technologies: Apache Airflow, Apache Hive, Impala, NiFi, Informatica, IBM InfoSphere (DataStage), DataHub, Python, SQL, Linux, HDFS, Spark, Pandas, NumPy, Data Engineering, Data Science, API Integration, Data Pipelines, Apache Kafka, JSON, Data & Backup Management, SQLite, Microsoft Power BI, Data Migration, Business Intelligence (BI), Git, Star Schema, Reports, Tableau, GitHub, Data Mesh, Relational Databases, Amazon S3 (AWS S3), Apache Spark, Relational Database Design, Data Marts, Data, ELT, Docker, Kubernetes, Key Performance Indicators (KPIs), Database Management, Data Integration, Migration, Big Data, Data Analytics, PySpark, ETL Pipelines, CSV File Processing, Large Data Sets, Query Optimization, Database Schema Design, Analytics

Data Engineer and Analyst

2021 - 2022
Worldie
  • Engineered distributed data systems and pipelines, reducing processing time by 90%.
  • Applied NLP and text analytics on large-scale social data, uncovering 100+ trends that informed research and strategic decisions.
  • Led data science team, implementing version control practices that enhanced collaboration and code documentation standards.
Technologies: Python, IBM Watson, Natural Language Processing (NLP), Sentiment Analysis, Text Analytics, Reporting, Research, Jupyter Notebook, Statistics, Data Cleaning, Data Processing, APIs, X (formerly Twitter) API, Data Center Infrastructure, Relational Databases, Business Analysis, Amazon S3 (AWS S3), Relational Database Design, Data Marts, Data, ELT, Key Performance Indicators (KPIs), Database Management, Data Analytics, Artificial Intelligence (AI), ETL Pipelines, Data Validation, Data Visualization

Data Engineer & Analyst

2020 - 2022
Freelancing Agency
  • Completed 120+ freelance data projects with a 100% success rate and 5-star average rating across 95+ verified client reviews on an online freelance agency, serving clients across the US, UK, Australia, Belgium, and Nigeria.
  • Designed and delivered end-to-end ETL and Reverse-ETL pipelines using Python, SQL, Apache Spark, and Hive — reducing data processing time by 2× through optimized data modeling and query tuning.
  • Built a custom data quality engine that improved data reliability by 50%, cutting pipeline failures and eliminating manual validation overhead for a high-volume analytics client.
  • Delivered 15+ complex interactive dashboards across Power BI, Tableau, and Plotly Dash—covering financial KPIs, marketing performance, sports analytics, and social media reporting for international clients.
  • Built multi-page Power BI dashboards with DAX measures, drill-through pages, cross-filtering, and custom visuals—enabling non-technical stakeholders to self-serve insights across 5+ KPI categories.
  • Developed advanced Tableau workbooks using LOD expressions, blended data sources, dynamic parameters, and calculated fields for segment-level and time-series business analysis.
  • Created production-grade Plotly Dash and Matplotlib visualization tools for EDA, infographic generation, and executive reporting — directly replacing costly 3rd-party subscriptions and saving one client $500+ a month.
  • Applied NLP and text analytics (sentiment analysis, topic modeling, entity extraction) on millions of records from YouTube, Instagram, Reddit, and Twitter to generate structured business intelligence reports.
  • Delivered deep learning solutions (CNNs, RNNs) for classification and prediction tasks, with hands-on AWS SageMaker deployment—bridging data engineering pipelines directly into machine learning (ML) model serving.
  • Maintained a 95% on-time, 95% on-budget delivery rate across all engagements, earning the Preferred Freelancer badge and Verified status—reflecting the same reliability and standards upheld within Toptal's Top 3% network.
Technologies: Python, SQL, Apache Spark, Apache Hive, ETL, Reverse-ETL, Data Pipelines, Data Architecture, Data Modeling, Data Quality, Microsoft Power BI, Tableau, Matplotlib, Plotly, Dash, Looker Studio, Machine Learning, Deep Learning, Natural Language Processing (NLP), PostgreSQL, MongoDB, Oracle, Apache Sqoop, Artificial Intelligence (AI), Data Visualization, Pytest

Physics and Mathematics Teacher

2015 - 2019
Town School
  • Helped students achieve high grades through structured lessons and problem-solving techniques.
  • Developed personalized learning strategies that significantly improved students' understanding and performance.
  • Taught over 100 students in mathematics and physics, covering topics from basic to advanced levels.
Technologies: Mathematics, Physics

Experience

Data Warehouse for Sparkify

https://github.com/adelabuhashim/DataWarehouse
A music streaming startup, Sparkify, had grown its user base and song database and wanted to move its processes and data onto the cloud. Its data resided in Amazon S3, in a directory of JSON logs on user activity within the app and a directory of JSON metadata on the songs in the app.

This project involved building an ETL pipeline that extracts data from Amazon S3, stages it in Redshift, and transforms it into a set of dimensional tables so the analytics team can continue finding insights into what songs their users are listening to. The project also involved testing the database and ETL pipeline by running SQL queries.
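
The stage-then-transform flow described above can be sketched in a few lines. This is an illustration only, not the code in the linked repository: it uses Python's built-in `sqlite3` as a local stand-in for Redshift, and the table names, column names, and sample log lines are all hypothetical. In Redshift, the staging load would typically be a `COPY` from S3 rather than row inserts.

```python
import json
import sqlite3

# Hypothetical sample of JSON activity logs (one JSON object per line).
LOG_LINES = [
    '{"userId": 1, "firstName": "Ada", "level": "free", "song": "Song A", "ts": 1000}',
    '{"userId": 1, "firstName": "Ada", "level": "free", "song": "Song B", "ts": 2000}',
    '{"userId": 2, "firstName": "Lin", "level": "paid", "song": "Song A", "ts": 3000}',
]


def run_etl(conn: sqlite3.Connection, log_lines: list) -> None:
    """Stage raw events, then populate a user dimension and a songplay fact table."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE staging_events "
        "(user_id INTEGER, first_name TEXT, level TEXT, song TEXT, ts INTEGER)"
    )
    cur.execute(
        "CREATE TABLE dim_users (user_id INTEGER PRIMARY KEY, first_name TEXT, level TEXT)"
    )
    cur.execute(
        "CREATE TABLE fact_songplays "
        "(songplay_id INTEGER PRIMARY KEY AUTOINCREMENT, user_id INTEGER, song TEXT, ts INTEGER)"
    )

    # Extract + stage: parse each JSON line and load it unchanged into staging.
    events = [json.loads(line) for line in log_lines]
    cur.executemany(
        "INSERT INTO staging_events VALUES (?, ?, ?, ?, ?)",
        [(e["userId"], e["firstName"], e["level"], e["song"], e["ts"]) for e in events],
    )

    # Transform: one row per distinct user in the dimension table.
    # (This sample assumes a user's attributes do not change between events.)
    cur.execute(
        "INSERT INTO dim_users SELECT DISTINCT user_id, first_name, level FROM staging_events"
    )
    # Transform: one fact row per play event.
    cur.execute(
        "INSERT INTO fact_songplays (user_id, song, ts) "
        "SELECT user_id, song, ts FROM staging_events"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
run_etl(conn, LOG_LINES)

# Example analytics query: the most-played song.
top_song = conn.execute(
    "SELECT song, COUNT(*) AS plays FROM fact_songplays "
    "GROUP BY song ORDER BY plays DESC, song LIMIT 1"
).fetchone()
```

The final query is the kind of check the project describes: running SQL against the dimensional tables to confirm the pipeline produced usable results.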

Education

2015 - 2020

Bachelor's Degree in Computer Engineering

Zagazig University - Zagazig, Egypt

Certifications

JANUARY 2026 - PRESENT

Certified Data Management Professional (CDMP)

DAMA International®

NOVEMBER 2025 - NOVEMBER 2028

AWS Certified Data Engineer

Amazon Web Services

JANUARY 2025 - PRESENT

Astronomer Certification for Apache Airflow Fundamentals

Astronomer

JANUARY 2025 - PRESENT

Astronomer Certification DAG Authoring for Apache Airflow

Astronomer

NOVEMBER 2023 - PRESENT

Data Architect Nanodegree

Udacity

MARCH 2023 - PRESENT

Data Engineering Nanodegree

Udacity

MAY 2020 - PRESENT

Data Analyst Nanodegree

Udacity

Skills

Libraries/APIs

Pandas, PySpark, NumPy, REST APIs, Matplotlib, X (formerly Twitter) API, Stripe

Tools

Apache Airflow, DataHub, PyCharm, Terminal, AWS Glue, Microsoft Excel, Tableau, GitHub, AWS Transfer Family, Amazon Elastic MapReduce (EMR), Impala, Cloudera, Microsoft Power BI, Git, Microsoft Access, Looker, BigQuery, Stitch Data, Pytest, AWS Command Line Interface (CLI), AWS IAM, IBM InfoSphere (DataStage), Plotly, Apache Sqoop, IBM Watson, Amazon Kinesis Data Firehose, Amazon CloudWatch, Amazon EKS, Amazon Redshift Spectrum, AWS Step Functions, Amazon Athena, Amazon QuickSight, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), AWS DataSync

Languages

Python 3, SQL, Python, Snowflake, HTML, Bash Script

Paradigms

ETL, Business Intelligence (BI), Role-based Access Control (RBAC), Dimensional Modeling, OLAP

Platforms

Windows, Visual Studio Code (VS Code), Amazon Web Services (AWS), Azure, Databricks, Talend, Linux, Apache Kafka, Docker, MacOS, AWS IoT, Jupyter Notebook, Oracle, Kubernetes, Microsoft Fabric

Storage

Redshift, PostgreSQL, Amazon S3 (AWS S3), Data Pipelines, JSON, Relational Databases, Database Management, Data Integration, Databases, Apache Hive, HDFS, Amazon DynamoDB, SQLite, Distributed Databases, Data Validation, OLTP, Data Lakes, Teradata, SQL Stored Procedures, Amazon Aurora, NoSQL, Operational Data Store (ODS), Elasticsearch, Microsoft SQL Server, Master Data Management (MDM), MongoDB, Microsoft Entra ID

Frameworks

Spark, Hadoop, Apache Spark, Presto, Trino

Other

Data Modeling, Data Warehousing, Big Data, Cloud, Data Engineering, API Integration, Data Build Tool (dbt), Dashboards, Amazon Redshift, Data Mesh, Google BigQuery, Airtable, Relational Database Design, Amazon Managed Workflows for Apache Airflow (MWAA), Data, ELT, Migration, Data Analytics, Artificial Intelligence (AI), Performance Optimization, ETL Pipelines, CSV File Processing, Large Data Sets, Database Schema Design, Analytics, AWS Database Migration Service (DMS), Data Governance, Data Transformation, MDM, System Administration, Data Migration, Data & Backup Management, Data Visualization, Azure Data Factory (ADF), Azure Databricks, Star Schema, Apache Superset, Reports, Data Cleaning, Architecture, Data Marts, Key Performance Indicators (KPIs), Airtable Automations, Medallion Architecture, Query Optimization, Software Engineering, Computer Networking, Computer Engineering, Designing Data Systems, Apache Cassandra, Normalization, Data Lineage, Data Architecture, MinIO, Data Analysis, Mathematics, Physics, IT Service Management (ITSM), TIBCO, NiFi, Informatica, Data Migration Testing, Data Science, Large Language Models (LLMs), DAG Authoring, Scheduling, Orchestration, Directed Acyclic Graphs (DAG), Data Management, Statistics, Data Wrangling, Natural Language Processing (NLP), Sentiment Analysis, Text Analytics, Reporting, Research, Data Processing, APIs, Amazon Kinesis, Lambda Functions, Amazon RDS, Amazon MSK, Amazon EventBridge, Amazon AppFlow, Amazon MemoryDB for Redis, Data Center Infrastructure, SOX, Business Analysis, Product Lifecycle Management (PLM), Data Quality, Metadata, Data Strategy, Big Data Architecture, CI/CD Pipelines, Reverse-ETL, Dash, Looker Studio, Machine Learning, Deep Learning
