Amr Saleh, Developer in Auckland, New Zealand

Amr Saleh

Verified Expert in Engineering

Data Engineer and Developer

Location
Auckland, New Zealand
Toptal Member Since
October 7, 2021

Amr is an expert in data architecture and engineering with over 12 years of global experience in cloud-centric data solutions. He has collaborated with top-tier organizations worldwide in the fintech, healthcare, banking, and telecom industries. He is skilled in the AWS stack, Snowflake, NoSQL databases, SQL, Python, PySpark, Power BI, QuickSight, Hadoop, NiFi, PowerApps, MongoDB, DynamoDB, and more. He holds an MSc in data science and offers tailored data engineering courses to enterprises worldwide.

Portfolio

GrayHawk Health Inc.
SQL, Python, APIs, Microsoft Power Apps, Microsoft Power BI, CRM APIs, SAP CRM...
Modus Operadi
Amazon DynamoDB, Data Warehousing, Business Intelligence (BI), ETL...
Ricoh Corporation - Intelligent Business Platform
SQL, NoSQL, Data Engineering, Data Architecture, Amazon Web Services (AWS)...

Experience

Availability

Part-time

Preferred Environment

SQL, Amazon Web Services (AWS), Snowflake, Microsoft Power BI, Big Data, Databases, Python, Azure, Amazon DynamoDB, MongoDB, Monte Carlo

The most amazing...

...thing I've done was joining Ricoh in the US as a senior data architect and improving their data pipelines, cutting the resources and time they needed by 92%.

Work Experience

Senior Database & CRM Developer

2023 - 2023
GrayHawk Health Inc.
  • Built a CRM for the company on PowerApps to automate manual processes and give sales and operations a robust platform to run on.
  • Provided 24/7 support to staff on CRM and data-related issues and continuously improved the platform by building new forms and automations.
  • Advised the business on building its first data warehouse.
  • Helped choose the vendors to build the company's CRM and DWH based on cost and SOW, and assisted in writing the SOW for each vendor.
  • Managed the DWH and CRM vendors during the planning and implementation phases.
Technologies: SQL, Python, APIs, Microsoft Power Apps, Microsoft Power BI, CRM APIs, SAP CRM, Healthcare IT, Power Query, Azure, BI Reporting, Snowflake, Data Engineering, Data Analysis, ETL, Data Architecture, SQL Server DBA, Cloud Infrastructure, Spark SQL, Hortonworks Data Platform (HDP), MySQL DBA, Big Data Architecture, Database Architecture, Azure SQL, CI/CD Pipelines, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, ADF, MacPractice, OpenAPI, Reporting, Analytics, Infrastructure, Cloud Storage, Directed Acyclic Graphs (DAG), Data Quality Analysis, Complex Data Analysis, PostgreSQL, Data Cleansing, Data Reporting, Data Governance, Data Visualization, Revenue & Expense Projections, Pipelines, Microsoft Excel, Google Cloud Storage, Object-oriented Programming (OOP), Database Design, Software, NoSQL, Web Scraping, Azure Data Lake, Cloud Platforms, Distributed Systems, Systems Monitoring, Excel VBA, Jira, Entity Relationships, Microsoft Azure, SQL Stored Procedures, Microsoft Flow, Data Feeds, Data Extraction, Delta Lake, Apache Hive, Cloud, Key Performance Indicators (KPIs), Data Management, SSH, GitHub, Virtual Machines, Azure Virtual Machines, Web Dashboards, Auth0, Business Requirements, C#, Data Transformation, Distributed Databases, Real-time Data, Data Quality, English

Senior DW & BI Developer

2023 - 2023
Modus Operadi
  • Automated all CEO and executive team reports and scheduled them to run at set times each day to match the executive team's requirements, removing the reliance on manual report runs and eliminating human error.
  • Built the data warehouse from scratch, starting from the architecture.
  • Architected pipelines to flatten unstructured data in MongoDB into structured data in the new Redshift data warehouse.
Technologies: Amazon DynamoDB, Data Warehousing, Business Intelligence (BI), ETL, Amazon Web Services (AWS), Amazon S3 (AWS S3), Amazon Athena, AWS Glue, Amazon QuickSight, Microsoft Power BI, Amazon CloudWatch, Amazon EC2, MongoDB, AWS Lambda, Data Architecture, IT Automation, Testing, BI Reporting, Data Engineering, Data Analysis, SQL Server DBA, Cloud Infrastructure, Spark SQL, Big Data Architecture, Azure SQL, CI/CD Pipelines, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, ADF, OpenAPI, Reporting, Analytics, Infrastructure, Cloud Storage, Financial Data Analytics, Complex Data Analysis, PostgreSQL, Data Cleansing, Data Reporting, Data Visualization, Financial Modeling, Pipelines, Microsoft Excel, Database Design, Software, NoSQL, Azure Data Lake, Cloud Platforms, Distributed Systems, Systems Monitoring, ELK (Elastic Stack), Visual Basic for Applications (VBA), Excel VBA, Jira, Entity Relationships, Microsoft Azure, SQL Stored Procedures, Data Feeds, Data Extraction, Delta Lake, Apache Hive, Cloud, Key Performance Indicators (KPIs), Data Management, SSH, GitHub, AWS CLI, Microsoft Access, Virtual Machines, Web Dashboards, Auth0, Data Transformation, Distributed Databases, Real-time Data, Data Quality, English

Senior Data Architect and Engineer

2023 - 2023
Ricoh Corporation - Intelligent Business Platform
  • Automated Centene reports and data pipelines, saving over 90% of runtime and resources and eliminating human errors.
  • Worked on building the microservices platform and completed over 2.5x as many tasks as the other developers assigned to the same project.
  • Proposed critical database schema modifications, which were implemented to cover all UX scenarios.
  • Advised Ricoh IBP on scaling their platform efficiently and assisted with the architecture upgrade from a monolith to microservices.
  • Automated their manual reports to run through AWS pipelines and email Excel exports to clients automatically, removing the time spent on manual reporting.
  • Optimized the most used SQL queries and improved the overall performance of the database.
  • Built efficient Lambda functions for data ingestion, transformation, reporting, and migration following industry standards.
  • Developed back-end code for microservices architecture by building Lambda functions, APIs, and stored procedures on MySQL and MS SQL databases.
Technologies: SQL, NoSQL, Data Engineering, Data Architecture, Amazon Web Services (AWS), MySQL, Amazon Athena, Amazon S3 (AWS S3), AWS Lambda, Lambda Architecture, Lambda Functions, Redshift, Redshift Spectrum, MuleSoft, Databases, Data Lakes, Data Warehouse Design, Architecture, Data Integration, Microservices, Amazon DynamoDB, Pandas, Excel Macros, AWS IAM, API Integration, IT Automation, Technical Documentation, REST APIs, Writing & Editing, Documentation, Database Performance, Microsoft Power Apps, Microsoft Power Automate, Docker, Linux, Database Replication, Data Recovery, Exploratory Data Analysis, System Administration, AIX, Windows, High Availability Disaster Recovery (HADR), Java, PostgREST, Azure Databricks, Azure Synapse, Back-end Development, AWS CloudFormation, Amazon Virtual Private Cloud (VPC), Azure Data Factory, CRM APIs, RESTful Microservices, Visualization, NumPy, BI Reporting, Snowflake, Microsoft Power BI, Data Analysis, ETL, SQL Server DBA, Cloud Infrastructure, DevOps, Big Data Architecture, SAP CRM, OpenAPI, .NET, Reporting, Analytics, Infrastructure, Cloud Storage, Data Quality Analysis, Complex Data Analysis, PostgreSQL, Data Cleansing, Data Reporting, Data Visualization, Pipelines, Microsoft Excel, Database Design, Artificial Intelligence (AI), Cloud Platforms, Distributed Systems, Visual Basic for Applications (VBA), Excel VBA, Jira, Entity Relationships, Microsoft Azure, SQL Stored Procedures, Azure Cosmos DB, Microsoft Flow, Data Feeds, Data Extraction, Delta Lake, Apache Hive, Cloud, Key Performance Indicators (KPIs), Data Management, SSH, GitHub, AWS CLI, Virtual Machines, Azure Virtual Machines, Business Requirements, Data Transformation, Distributed Databases, Data Scraping, Data Quality, English

Data Expert

2022 - 2023
Broad Solutions
  • Took over the system with no documentation or handover, then worked out and documented the whole architecture, including all running processes and their schedules.
  • Architected system backup for processes, schedules, and data for disaster recovery.
  • Offered 24/7 swift support for data and pipeline issues in Snowflake and APIs.
Technologies: SQL, Azure, Azure Data Factory, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Snowflake, Data Pipelines, CI/CD Pipelines, Java, Python, ADF, Reporting, Analytics, Infrastructure, Cloud Storage, Data Quality Analysis, Complex Data Analysis, PostgreSQL, Data Cleansing, Data Reporting, Data Governance, Forecasting, Pipelines, Google Sheets, Object-oriented Programming (OOP), Database Design, NoSQL, Web Scraping, Azure Data Lake, Artificial Intelligence (AI), Cloud Platforms, Distributed Systems, Systems Monitoring, Visual Basic for Applications (VBA), Excel VBA, Jira, Entity Relationships, Microsoft Azure, SQL Stored Procedures, Data Feeds, Delta Lake, Apache Hive, Cloud, Data Management, SSH, GitHub, AWS CLI, Microsoft Access, Virtual Machines, Azure Virtual Machines, Apache Flink, Web Dashboards, Go, Business Requirements, Data Transformation, Distributed Databases, Real-time Data, Data Quality, Data Mining, English

DT Technology

2022 - 2023
Lyticshub
  • Put together and led a team of software developers and data engineers and trained them with the latest cloud technology skills.
  • Built the data architecture for multiple clients and data pipelines on AWS and Azure.
  • Suggested architectural changes that saved 40% of the cloud cost for one of our clients.
Technologies: Data Engineering, Data Architecture, Leadership, Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP), SQL, Python, Amazon Athena, MongoDB, MongoDB Atlas, Relational Databases, Snowflake, AWS Glue, Data Migration, Amazon RDS, Database Optimization, Database Migration, Data, T-SQL (Transact-SQL), Performance Tuning, MongoDB Compass, Web Scraping, Database Administration (DBA), Excel 365, Technical Writing, Data Warehouse Design, Google BigQuery, Looker, Data Build Tool (dbt), Architecture, Data Integration, Apache Airflow, Amazon QuickSight, Amazon DynamoDB, Pandas, eCommerce, AWS IAM, Amazon CloudWatch, AWS Data Pipeline Service, API Integration, SQL Server Integration Services (SSIS), IT Automation, Technical Documentation, REST APIs, Writing & Editing, Documentation, Database Performance, Microsoft Power Apps, Microsoft Power Automate, DAX, Power Query, MariaDB, Docker, Linux, Database Replication, IBM Cognos, Exploratory Data Analysis, IBM Db2, IBM Tivoli Storage Manager, System Administration, AIX, Windows, Apache Spark, Apache Maven, Scala, Java, Amazon Elastic MapReduce (EMR), MapReduce, Azure Databricks, Azure Synapse, Geospatial Data, Back-end Development, AWS CloudFormation, BigQuery, HubSpot, Wix, CRM APIs, Data Science, Bloomberg, Visualization, Domo, NumPy, Data Scientist, OCR, BI Reporting, Microsoft Power BI, Data Analysis, Predictive Analytics, Deep Learning, Machine Learning, ETL, SQL Server DBA, Cloud Infrastructure, Teradata, Spark SQL, Hortonworks Data Platform (HDP), DevOps, MySQL DBA, Big Data Architecture, Database Architecture, Azure SQL, CI/CD Pipelines, ADF, Healthcare IT, SAP CRM, WordPress, R, Reporting, Analytics, Infrastructure, Cloud Storage, AWS IoT, GAAP, Data Quality Analysis, Complex Data Analysis, PostgreSQL, PySpark, Data Cleansing, Data Reporting, Data Governance, Data Visualization, Revenue & Expense Projections, Financial Modeling, Forecasting, Pipelines, Microsoft Excel, Google Sheets, Oracle DBA, 
Database Design, Software, NoSQL, Microsoft Fabric, Azure Data Lake, Google Bigtable, Artificial Intelligence (AI), Cloud Platforms, Distributed Systems, Systems Monitoring, Kibana, Entity Relationships, Microsoft Azure, Kubernetes, SQL Stored Procedures, Azure Cosmos DB, Microsoft Flow, Delta Lake, Cloud, Data Management, GitHub, Virtual Machines, Azure Virtual Machines, Apache Flink, Web3, Metabase, Web Dashboards, Business Requirements, Data Transformation, Distributed Databases, Data Mining, English

Senior Data Engineer

2022 - 2023
PropertyRadar
  • Developed a process in Python to ingest, transform, and clean people data for precise marketing targeting. Used Amazon Redshift, Amazon S3, AWS Glue, AWS Lambda, and Apache Airflow.
  • Developed a process in Python to ingest, transform, and clean property tax information for better market analysis and visualization. Used Amazon Redshift, Amazon S3, AWS Glue, AWS Lambda, and Apache Airflow.
  • Migrated the people process to the cloud, which reduced cost and improved speed; it now runs on AWS instead of MySQL.
Technologies: ETL, Data Lakes, Amazon Web Services (AWS), Apache Kafka, Python, Python 3, Spark, Amazon S3 (AWS S3), Data Engineering, Streaming, Databricks, Amazon Athena, Relational Databases, AWS Glue, Data Migration, Database Optimization, Database Migration, Data, SQL DML, Performance Tuning, SQL, Database Administration (DBA), Amazon Aurora, Data-driven Dashboards, Data Architecture, Data Warehouse Design, Architecture, Data Integration, Amazon QuickSight, Pandas, AWS IAM, API Integration, Real Estate, IT Automation, Technical Documentation, Documentation, Database Performance, Power Query, Linux, Exploratory Data Analysis, System Administration, High Availability Disaster Recovery (HADR), AWS CloudFormation, Amazon Virtual Private Cloud (VPC), Azure Data Factory, Visualization, NumPy, Snowflake, Microsoft Power BI, Data Analysis, Cloud Infrastructure, Big Data Architecture, Azure SQL, CI/CD Pipelines, Reporting, Analytics, Infrastructure, Cloud Storage, PostgreSQL, Data Cleansing, Data Reporting, Data Governance, Data Visualization, Pipelines, Object-oriented Programming (OOP), Database Design, NoSQL, Cloud Platforms, Microsoft Azure, Kubernetes, SQL Stored Procedures, Data Feeds, Data Extraction, Delta Lake, Cloud, Data Management, SSH, AWS CLI, Microsoft Access, Business Requirements, Data Transformation, Data Scraping, Data Quality, Property Management, Property Management System Integrations, English

Lead Trainer

2019 - 2023
Sprints
  • Trained over ten cohorts of professionals to enter the data engineering market.
  • Helped the Telecom Egypt technical team increase their data-related capabilities.
  • Led the team to design and deliver the curriculum for data engineering.
Technologies: Amazon Web Services (AWS), Hadoop, SQL, Snowflake, Data Warehousing, Hortonworks Data Platform (HDP), Redshift, Google Analytics, Azure SQL, Data Analytics, AWS Lambda, Cron, Database Architecture, Business Intelligence (BI), Operations, Amazon Athena, Relational Databases, AWS Glue, Data Migration, Database Optimization, Database Migration, Data, T-SQL (Transact-SQL), Microsoft SQL Server, SQL DML, Amazon Neptune, Database Administration (DBA), Amazon Aurora, Data-driven Dashboards, Data Architecture, Excel 365, Data Warehouse Design, Google BigQuery, Looker, Architecture, Data Integration, Microservices, Apache Airflow, Pandas, Tableau, AWS IAM, SQL Server Integration Services (SSIS), IT Automation, REST APIs, Documentation, DAX, Linux, Exploratory Data Analysis, IBM Db2, System Administration, Windows, Apache Spark, Scala, Java, Amazon Elastic MapReduce (EMR), MapReduce, Azure Data Factory, Google Cloud Platform (GCP), Data Science, Visualization, Predictive Analytics, Deep Learning, SQL Server DBA, Cloud Infrastructure, Informatica, DevOps, Big Data Architecture, Reporting, Analytics, Infrastructure, Cloud Storage, Excel 2016, Financial Data Analytics, PostgreSQL, Data Cleansing, Data Reporting, Data Governance, Financial Modeling, Pipelines, Google Sheets, Google Cloud Storage, Database Design, NoSQL, SQL Stored Procedures, Data Extraction, Cloud, Data Management, SSH, Data Mining, English

Data Lead

2022 - 2022
Accident Compensation Corporation
  • Developed a data model for migrating an old CRM to Salesforce.
  • Built and validated data pipelines for data migration and built a data dictionary.
  • Built and automated data validation on the new CRM.
Technologies: Data Modeling, Snowflake, Amazon Web Services (AWS), SQL, Python, Testing, Relational Databases, AWS Glue, Data Migration, Salesforce, Database Optimization, Database Migration, Data, Performance Tuning, Database Administration (DBA), Data-driven Dashboards, Data Architecture, Technical Writing, Data Warehouse Design, Architecture, Data Integration, Pandas, AWS IAM, IT Automation, Technical Documentation, Documentation, Database Performance, Salesforce API, Linux, Database Replication, Data Recovery, Exploratory Data Analysis, System Administration, High Availability Disaster Recovery (HADR), Amazon Virtual Private Cloud (VPC), Azure Data Factory, Google Cloud Platform (GCP), Visualization, NumPy, Microsoft Power BI, Data Engineering, Data Analysis, Predictive Analytics, ETL, Cloud Infrastructure, MySQL DBA, Big Data Architecture, Database Architecture, Azure SQL, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW), MuleSoft, Reporting, Analytics, Infrastructure, Cloud Storage, PySpark, Data Cleansing, Data Reporting, Data Governance, Pipelines, Database Design, Software, NoSQL, Systems Monitoring, ELK (Elastic Stack), Data Feeds, Data Extraction, Cloud, Data Management, Data Scraping, Scraping, English
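
The automated validation mentioned in this role can be pictured with a small sketch. This is an illustration only, not the project's code: it assumes both the old CRM and the new one can be exported as lists of dictionaries sharing a key column, and the names `row_digest` and `compare_tables` are hypothetical.

```python
import hashlib
from typing import Dict, Hashable, List

Row = Dict[str, object]


def row_digest(row: Row, columns: List[str]) -> str:
    """Hash the chosen columns of a row so rows can be compared cheaply.

    None is rendered as an empty string, so this sketch does not
    distinguish a NULL from an empty text value.
    """
    payload = "|".join("" if row.get(c) is None else str(row.get(c)) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compare_tables(source: List[Row], target: List[Row],
                   key: str, columns: List[str]) -> Dict[str, List[Hashable]]:
    """Report keys that are missing, unexpected, or different after a migration."""
    src = {r[key]: row_digest(r, columns) for r in source}
    tgt = {r[key]: row_digest(r, columns) for r in target}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Hypothetical sample rows standing in for the old and new CRM exports.
old_crm = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}]
new_crm = [{"id": 1, "name": "A"}, {"id": 2, "name": "b"}, {"id": 4, "name": "D"}]
report = compare_tables(old_crm, new_crm, key="id", columns=["name"])
```

Comparing digests rather than full rows keeps the check cheap when tables are wide; a real migration would likely also normalize types and encodings before hashing.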

Data Engineer

2022 - 2022
Essentially AI Pvt. Ltd.
  • Designed and built a data architecture to ingest and clean 162 TB of stock market data for analysis.
  • Built the right data models to cater to stocks changing their names and stocks performing stock splits.
  • Automated the ingestion process through API calls to the vendor and used Amazon S3, Amazon Athena, Amazon Redshift, Apache Airflow, and AWS Glue.
Technologies: SQL, Amazon Web Services (AWS), Data Engineering, Python, Amazon Athena, Amazon SageMaker, Amazon EC2, AWS Lambda, Amazon S3 (AWS S3), Data Analysis, Relational Databases, AWS Glue, Data Migration, Database Optimization, Database Migration, Data, Performance Tuning, Database Administration (DBA), Data Architecture, Data Warehouse Design, Architecture, Data Integration, Pandas, AWS IAM, API Integration, IT Automation, Technical Documentation, Documentation, Database Performance, Linux, Data Recovery, Exploratory Data Analysis, System Administration, Windows, High Availability Disaster Recovery (HADR), Apache Spark, AWS CloudFormation, Amazon Virtual Private Cloud (VPC), Visualization, NumPy, Data Scientist, BI Reporting, Snowflake, ETL, Cloud Infrastructure, Spark SQL, Informatica, DevOps, MySQL DBA, Big Data Architecture, Database Architecture, Machine Learning Operations (MLOps), Reporting, Analytics, Infrastructure, Cloud Storage, PySpark, Data Cleansing, Data Reporting, Data Governance, Data Visualization, Pipelines, Database Design, Software, Data Management, English
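
Handling ticker renames and stock splits, as described in this role, usually comes down to two ideas: a stable surrogate key per security, and a split-adjustment factor applied to pre-split prices. The sketch below is an assumption-laden illustration, not the model that was built; `Split`, `adjust_close`, `resolve_security_id`, and the tickers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Split:
    effective: date  # first trading day at the post-split price
    ratio: float     # 2.0 means a 2-for-1 split


def adjust_close(prices: List[Tuple[date, float]],
                 splits: List[Split]) -> List[Tuple[date, float]]:
    """Divide each close by the product of the ratios of all later splits."""
    adjusted = []
    for day, close in prices:
        factor = 1.0
        for split in splits:
            if day < split.effective:
                factor *= split.ratio
        adjusted.append((day, close / factor))
    return adjusted


def resolve_security_id(ticker: str, on: date,
                        history: List[Tuple[str, date, date, int]]) -> int:
    """Map a ticker as of a given date to a surrogate key that survives renames."""
    for symbol, valid_from, valid_to, security_id in history:
        if symbol == ticker and valid_from <= on < valid_to:
            return security_id
    raise KeyError(f"no security for {ticker} on {on}")


# Hypothetical data: one 2-for-1 split and one rename for the same security.
prices = [(date(2022, 1, 3), 100.0), (date(2022, 1, 4), 50.0)]
splits = [Split(effective=date(2022, 1, 4), ratio=2.0)]
adjusted = adjust_close(prices, splits)

history = [
    ("OLDCO", date(2015, 1, 1), date(2022, 6, 1), 42),
    ("NEWCO", date(2022, 6, 1), date.max, 42),
]
```

Keying fact tables on the surrogate ID rather than the ticker means a rename only adds a row to the history table instead of rewriting price data.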

MongoDB Atlas Data Lake Developer

2022 - 2022
Penny Inc
  • Worked on a cloud-based expense management system. Transformed MongoDB unstructured data into a structured form and pushed a data stream to AWS S3 along with external datasets.
  • Built the data warehouse on AWS Redshift and a presentation dashboard in QuickSight utilizing the AWS stack, including S3, Lambda, Redshift, QuickSight, and AWS Transfer Family.
  • Implemented code scheduling and verification on Airflow, as well as data quality checks and code documentation in GitHub.
Technologies: MongoDB, MongoDB Atlas, Amazon S3 (AWS S3), Data Lakes, Python, Node.js, Relational Database Design, AWS Cloud Architecture, Dashboards, Data Analytics, Amazon Web Services (AWS), AWS Lambda, Cron, Database Architecture, Business Intelligence (BI), Operations, Relational Databases, AWS Glue, Data Migration, Amazon RDS, Database Migration, Data, Microsoft SQL Server, SQL DML, Performance Tuning, SQL, MongoDB Compass, Database Administration (DBA), Architecture, Data Integration, Apache Airflow, Amazon QuickSight, Amazon DynamoDB, API Integration, Writing & Editing, Database Performance, Exploratory Data Analysis, System Administration, Java, Visualization, ETL, Data Architecture, Cloud Infrastructure, DevOps, MySQL DBA, Analytics, Infrastructure, Financial Data Analytics, Data Cleansing, Data Reporting, Data Visualization, Pipelines, Database Design, Software, Data Feeds, Auth0, Scraping, English

Data Engineer

2018 - 2021
Two Degrees Mobile Limited
  • Built a data lake in AWS Cloud and Snowflake to replace an on-premises Hadoop cluster and integrated it with Tableau and a Netezza data warehouse.
  • Designed and rolled out new data pipelines for big data and an enterprise data warehouse and maintained the existing Hadoop and Hortonworks big data environment and ETL pipelines.
  • Supported enterprise data warehouse processes and operations and delivered ad hoc SQL reports.
  • Integrated with different sources, including Amazon S3, Oracle, IBM Netezza, Microsoft SharePoint, and Microsoft Active Directory (AD).
  • Explored opportunities for new data avenues, such as Snowflake and Anaplan.
Technologies: Amazon Athena, AWS Glue, Snowflake, SQL, Netezza, Oracle, Data Warehousing, Data Lakes, Redshift, Data Pipelines, APIs, REST, Amazon RDS, Query Optimization, Partitioning, Databases, Data Analytics, Amazon Web Services (AWS), AWS Lambda, Cron, Database Architecture, Business Intelligence (BI), Operations, Relational Databases, Data Migration, Database Optimization, Database Migration, Data, T-SQL (Transact-SQL), SQL DML, Performance Tuning, Database Administration (DBA), Amazon Aurora, Data-driven Dashboards, Technical Writing, Data Integration, Microsoft Power Automate, IBM Db2, IBM Tivoli Storage Manager, Apache Spark, Predictive Analytics, ETL, Data Architecture, Cloud Infrastructure, Hadoop, Hortonworks Data Platform (HDP), Informatica, MySQL DBA, WordPress, Machine Learning Operations (MLOps), Directed Acyclic Graphs (DAG), Excel 2016, Data Cleansing, Data Reporting, Pipelines, Google Cloud Storage, Oracle DBA, Database Design, Fivetran, Scraping, English

Data Consultant

2017 - 2018
Teradata
  • Designed and implemented ETL jobs and data management processes across different platforms.
  • Extracted insights from data and delivered reports to high-level decision-makers.
  • Automated data warehouse processes using Unified Data Integrator (a DevOps product) as part of a bank's digital transformation.
Technologies: Data Engineering, Data Analysis, Big Data, SQL, Data Warehousing, Data Pipelines, Google BigQuery, Google Data Studio, Database Architecture, PostgreSQL, MySQL, Relational Databases, Database Schema Design, SaaS, B2B, Dashboard Design, Data Analytics, Amazon Web Services (AWS), AWS Lambda, Cron, Business Intelligence (BI), Operations, Teradata, Data Migration, Database Migration, Data, Microsoft SQL Server, SQL DML, Performance Tuning, Data Integration, SQL Server Integration Services (SSIS), IBM Cognos, Netezza, Deep Learning, Machine Learning, Data Architecture, Cloud Infrastructure, Hadoop, Hortonworks Data Platform (HDP), Informatica, Machine Learning Operations (MLOps), Excel 2016, Data Reporting, Pipelines, Teradata DBA, Database Design, Auth0, English

Business Intelligence Analyst

2014 - 2017
Vodafone Group
  • Introduced IBM Infosphere Streams to perform real-time analytics on big data streams.
  • Designed, built, and tested ETL/ELT solutions using dimensional modeling and sound design, performance tuning, and optimization.
  • Implemented and managed small- to large-scale projects involving multiple systems, with a focus on performance tuning, optimization, and availability to keep the environment efficient.
Technologies: SQL, Amazon Web Services (AWS), ETL, Big Data, Data Pipelines, Data Visualization, Data Analytics, Business Intelligence (BI), Operations, Relational Databases, Data, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), Microsoft SQL Server, SQL DML, Performance Tuning, CDC, Oracle DBA

Data Lake in AWS and Snowflake

I built a data lake in AWS Cloud and Snowflake to replace an on-premises Hadoop cluster and integrated it with Tableau and a Netezza data warehouse. I started the project from scratch, assessed providers (AWS, GCP, and Azure), and led a POC to compare processing and pricing. In the end, I implemented the pipelines in AWS Glue and Snowflake while using SAP Data Services to ingest data from Netezza.

National Data Warehouse

Served on a large team of consultants from IBM, Teradata, and Microsoft to build Egypt's first data warehouse. I was actively involved in the following activities:
• Designing and implementing a large number of ETL jobs and data management processes across different platforms.
• Sourcing and integrating 50+ different data sources from across the country to build a unified data warehouse.
• Extracting insights from data and delivering reports to high-level decision-makers.

Intesa Sanpaolo Bank Data Platform Revamp

Automated data warehouse processes using Unified Data Integrator as part of the bank's digital transformation. I also developed and upgraded several ETL solutions for the bank. This work was part of a Teradata consulting engagement.

Djezzy Postpaid Stream

Built a new postpaid stream from scratch. This involved modeling and mapping existing data into models and tables, followed by ETL development and implementation. The work ran in parallel with another big data stream on the Hortonworks platform.

Data Pipeline on AWS

https://www.propertyradar.com
I developed a process in Python to ingest, transform, and clean people data for precise marketing targeting, and a second process to do the same for property tax information to support better market analysis and visualization. Both used Amazon Redshift, S3, Glue, Lambda, and Airflow. I also migrated the people process to the cloud, which reduced cost and improved speed; it now runs on AWS instead of MySQL.

Data Architect

http://thepennyinc.com
I worked on a cloud-based expense management system. I transformed MongoDB unstructured data into a structured form and pushed a data stream to AWS S3 along with external datasets. I also built the data warehouse on Amazon Redshift and a presentation dashboard in QuickSight utilizing the AWS stack, including S3, Lambda, Redshift, QuickSight, and AWS Transfer Family.
I also implemented code scheduling and verification on Airflow, as well as data quality checks and code documentation in GitHub.
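
As a rough picture of what turning MongoDB documents into a structured form can involve, the sketch below flattens nested fields into column names and emits one row per array element. It is a minimal illustration under stated assumptions, not the project's code: the document shape, the `_` separator, and the function names `flatten` and `explode` are all hypothetical.

```python
from typing import Any, Dict, List


def flatten(doc: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Collapse nested dicts into a single-level dict with joined key names."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def explode(doc: Dict[str, Any], array_field: str) -> List[Dict[str, Any]]:
    """Return one flat row per element of `array_field`, repeating parent columns."""
    parent = {k: v for k, v in doc.items() if k != array_field}
    items = doc.get(array_field) or [{}]  # keep the parent row even with no items
    rows = []
    for item in items:
        row = flatten(parent)
        if isinstance(item, dict):
            row.update(flatten(item, array_field))
        else:
            row[array_field] = item  # array of scalars
        rows.append(row)
    return rows


# Hypothetical expense document with a nested object and an array of line items.
expense = {
    "_id": "e1",
    "employee": {"id": 7, "dept": "ops"},
    "items": [
        {"sku": "taxi", "amount": 18.5},
        {"sku": "hotel", "amount": 120.0},
    ],
}
rows = explode(expense, "items")
```

Each resulting row has the same flat set of columns, which is the shape a warehouse table load expects.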

Languages

SQL, Snowflake, Python, T-SQL (Transact-SQL), SQL DML, Scala, Java, Visual Basic for Applications (VBA), Excel VBA, C#, R, Python 3, Go

Frameworks

Apache Spark, Spark, Hadoop, ADF, .NET

Libraries/APIs

REST APIs, PostgREST, NumPy, Pandas, PySpark, OpenAPI, Salesforce API, Node.js

Tools

Microsoft Power BI, Google Sheets, Microsoft Excel, IBM Cognos, Apache Maven, Amazon Elastic MapReduce (EMR), AWS CloudFormation, Amazon Virtual Private Cloud (VPC), ELK (Elastic Stack), Jira, GitHub, AWS CLI, Microsoft Access, AWS Glue, Amazon Athena, Tableau, Spark SQL, Apache Airflow, BigQuery, Excel 2016, MongoDB Atlas, Cron, Amazon QuickSight, Redshift Spectrum, Looker, AWS IAM, Amazon CloudWatch, Microsoft Power Apps, Power Query, Bloomberg, Domo, Kibana, Microsoft Flow, Auth0, Google Analytics, Amazon SageMaker, Terraform, Wix

Paradigms

ETL, Database Design, Business Intelligence (BI), Data Science, MapReduce, DevOps, REST, B2B, Microservices, Object-oriented Programming (OOP), Testing, Lambda Architecture

Platforms

Amazon Web Services (AWS), Azure, Linux, AIX, Windows, Kubernetes, Apache Flink, Oracle, Hortonworks Data Platform (HDP), Google Cloud Platform (GCP), Apache Kafka, WordPress, Databricks, AWS Lambda, MuleSoft, Microsoft Power Automate, Docker, Azure Synapse, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW), AWS IoT, Amazon EC2, Salesforce, Blockchain

Storage

Teradata, Databases, Oracle DBA, Database Architecture, Data Pipelines, PostgreSQL, NoSQL, MongoDB, MySQL, Relational Databases, Database Migration, SQL Server Integration Services (SSIS), Microsoft SQL Server, Database Administration (DBA), Data Integration, Amazon DynamoDB, Database Performance, MariaDB, Database Replication, IBM Db2, SQL Stored Procedures, Apache Hive, Distributed Databases, Netezza, Amazon S3 (AWS S3), SQL Server DBA, Redshift, Amazon Aurora, Google Bigtable, Azure Cosmos DB, Data Lakes, Azure SQL, Google Cloud Storage, SQL Server Analysis Services (SSAS), AWS Data Pipeline Service

Other

Data Engineering, Data Analysis, Data Warehousing, Big Data, MySQL DBA, Data Architecture, Cloud Infrastructure, Data Modeling, Pipelines, Data Analytics, Data Cleansing, Data Warehouse Design, Complex Data Analysis, BI Reporting, Relational Database Design, AWS Cloud Architecture, Query Optimization, Database Schema Design, Dashboards, Operations, Data Migration, Database Optimization, Data, Performance Tuning, MongoDB Compass, Web Scraping, Data-driven Dashboards, Lambda Functions, Technical Writing, Architecture, API Integration, Real Estate, IT Automation, Technical Documentation, Writing & Editing, Documentation, Data Recovery, Exploratory Data Analysis, System Administration, High Availability Disaster Recovery (HADR), Back-end Development, Azure Data Factory, Visualization, Data Scientist, Azure Data Lake, Cloud Platforms, Distributed Systems, Systems Monitoring, Entity Relationships, Microsoft Azure, Data Feeds, Data Extraction, Delta Lake, Cloud, Key Performance Indicators (KPIs), Data Management, SSH, Virtual Machines, Azure Virtual Machines, Web Dashboards, Fivetran, Business Requirements, Data Transformation, Real-time Data, Data Scraping, Scraping, Data Quality, Property Management, Property Management System Integrations, Data Mining, English, Predictive Analytics, Machine Learning, Deep Learning, Informatica, Teradata DBA, Big Data Architecture, Forecasting, Financial Modeling, Data Visualization, Data Governance, Data Reporting, Financial Data Analytics, Data Quality Analysis, Cloud Storage, Infrastructure, Analytics, Reporting, Machine Learning Operations (MLOps), Google BigQuery, Google Data Studio, APIs, Amazon RDS, Partitioning, SaaS, Dashboard Design, CDC, Excel 365, Data Build Tool (dbt), eCommerce, Excel Macros, DAX, IBM Tivoli Storage Manager, Consumer Packaged Goods (CPG), Azure Databricks, Geospatial Data, CRM APIs, RESTful Microservices, OCR, MacPractice, SAP CRM, Healthcare IT, CI/CD Pipelines, Artificial Intelligence (AI), Web3, 
Metabase, Entity-relationships Model (ERM), Software, Computer Science, Revenue & Expense Projections, GAAP, Directed Acyclic Graphs (DAG), Google Search Console, Streaming, Leadership, Amazon Neptune, HubSpot, NetSuite, Microsoft Fabric, Monte Carlo

2018 - 2021

Master of Science Degree in Computer Engineering

Cairo University - Cairo, Egypt

2009 - 2013

Bachelor's Degree in Computer Engineering

Cairo University - Cairo, Egypt

AUGUST 2022 - PRESENT

AWS Well-Architected Framework

AWS

APRIL 2022 - PRESENT

Data Engineering Nanodegree

Udacity

MAY 2021 - PRESENT

Data Analysis Professional Nanodegree

Udacity
