Amr Saleh, Developer in Auckland, New Zealand

Amr Saleh

Data Engineer and Developer

Location
Auckland, New Zealand
Toptal Member Since
October 7, 2021

Amr is a data architect/engineer with 10+ years of international experience, particularly in the cloud space. He's been a consultant at Teradata and worked with telecom operators, banks, and government organizations in Europe and the Middle East. Amr's skills include AWS Glue, Athena, CloudFormation, S3, and Snowflake; SQL, HQL, Python, and PySpark; Power BI, Tableau, and QuickSight; and Hadoop and NiFi. Amr has an MSc in data science and teaches data engineering classes to enterprises.

Portfolio

Ricoh Corporation - Intelligent Business Platform
SQL, NoSQL, Data Engineering, Data Architecture, Amazon Web Services (AWS)...
Sprints
Amazon Web Services (AWS), Hadoop, SQL, Snowflake, Data Warehousing...
PropertyRadar
ETL, Data Lakes, Amazon Web Services (AWS), Apache Kafka, Python, Python 3...

Availability

Part-time

Preferred Environment

SQL, Amazon Web Services (AWS), Snowflake, Microsoft Power BI, Big Data, Databases, Python, Redshift, Google Cloud Platform (GCP), Azure, OpenAPI

The most amazing...

...experience was leading the data architecture, design, and implementation of Lyticshub—from the initial startup until Vodafone became the first customer.

Work Experience

2023 - PRESENT

Senior Data Architect and Engineer

Ricoh Corporation - Intelligent Business Platform
  • Advised Ricoh IBP on scaling their platform efficiently and assisted with the architecture upgrade.
  • Optimized the most used SQL queries and improved the overall performance of the database.
  • Built efficient AWS Lambda functions for data ingestion, transformation, reporting, and migration following industry standards.
Technologies: SQL, NoSQL, Data Engineering, Data Architecture, Amazon Web Services (AWS), MySQL, Amazon Athena, Amazon S3 (AWS S3), AWS Lambda, Lambda Architecture, Lambda Functions, Redshift, Redshift Spectrum, MuleSoft, Databases, Data Lakes, Data Warehouse Design, Architecture, Data Integration, Microservices, Amazon DynamoDB, Pandas, Excel Macros, AWS IAM, API Integration, IT Automation, Technical Documentation, REST APIs, Writing & Editing, Documentation, Database Performance, Microsoft Power Apps, Microsoft Power Automate
2020 - PRESENT

Lead Trainer

Sprints
  • Trained nine cohorts of professionals to enter the data engineering market.
  • Helped the Telecom Egypt technical team increase their data-related capabilities.
  • Led the team to design and deliver the curriculum for data engineering.
Technologies: Amazon Web Services (AWS), Hadoop, SQL, Snowflake, Data Warehousing, Hortonworks Data Platform (HDP), Redshift, Google Analytics, Azure SQL, Data Analytics, AWS Lambda, Cron, Database Architecture, Business Intelligence (BI), Operations, Amazon Athena, Relational Databases, AWS Glue, Data Migration, Database Optimization, Database Migration, Data, T-SQL (Transact-SQL), Microsoft SQL Server, SQL DML, Amazon Neptune, Database Administration (DBA), Amazon Aurora, Data-driven Dashboards, Data Architecture, Excel 365, Data Warehouse Design, Google BigQuery, Looker, Architecture, Data Integration, Microservices, Apache Airflow, Pandas, Tableau, AWS IAM, SQL Server Integration Services (SSIS), IT Automation, REST APIs, Documentation, DAX
2022 - 2023

Senior Data Engineer

PropertyRadar
  • Developed a process in Python to ingest, transform, and clean people information for concise marketing targeting. Used Amazon Redshift, Amazon S3, AWS Glue, AWS Lambda, and Apache Airflow.
  • Developed a process in Python to ingest, transform, and clean property tax information for better market analysis and visualization. Used Amazon Redshift, Amazon S3, AWS Glue, AWS Lambda, and Apache Airflow.
  • Migrated the people process to the cloud, which reduced cost and improved speed; it now runs on AWS instead of MySQL.
Technologies: ETL, Data Lakes, Amazon Web Services (AWS), Apache Kafka, Python, Python 3, Spark, Amazon S3 (AWS S3), Data Engineering, Streaming, Databricks, Amazon Athena, Relational Databases, AWS Glue, Data Migration, Database Optimization, Database Migration, Data, SQL DML, Performance Tuning, SQL, Database Administration (DBA), Amazon Aurora, Data-driven Dashboards, Data Architecture, Data Warehouse Design, Architecture, Data Integration, Amazon QuickSight, Pandas, AWS IAM, API Integration, Real Estate, IT Automation, Technical Documentation, Documentation, Database Performance, Power Query
2022 - 2022

Data Lead

Accident Compensation Corporation
  • Developed a data model for migrating an old CRM to Salesforce.
  • Built and validated data pipelines for data migration and built a data dictionary.
  • Built and automated data validation on the new CRM.
Technologies: Data Modeling, Snowflake, Amazon Web Services (AWS), SQL, Python, Testing, Relational Databases, AWS Glue, Data Migration, Salesforce, Database Optimization, Database Migration, Data, Performance Tuning, Database Administration (DBA), Data-driven Dashboards, Data Architecture, Technical Writing, Data Warehouse Design, Architecture, Data Integration, Pandas, AWS IAM, IT Automation, Technical Documentation, Documentation, Database Performance, Salesforce API
2022 - 2022

Data Engineer

Essentially AI Pvt. Ltd.
  • Designed and built a data architecture to ingest and clean 162 TB of stock market data for analysis.
  • Built data models that handle stocks changing their names and stocks undergoing splits.
  • Automated the ingestion process through API calls to the vendor and used Amazon S3, Amazon Athena, Amazon Redshift, Apache Airflow, and AWS Glue.
Technologies: SQL, Amazon Web Services (AWS), Data Engineering, Python, Amazon Athena, Amazon SageMaker, Amazon EC2, AWS Lambda, Amazon S3 (AWS S3), Data Analysis, Relational Databases, AWS Glue, Data Migration, Database Optimization, Database Migration, Data, Performance Tuning, Database Administration (DBA), Data Architecture, Data Warehouse Design, Architecture, Data Integration, Pandas, AWS IAM, API Integration, IT Automation, Technical Documentation, Documentation, Database Performance
2022 - 2022

Data Principal

Kiwilytics Ltd.
  • Built and led a team of software developers and data engineers and trained them in the latest cloud technologies.
  • Built the data architecture for multiple clients and data pipelines on AWS and Azure.
  • Suggested architectural changes that saved 40% of the cloud cost for one of our clients.
Technologies: Data Engineering, Data Architecture, Leadership, Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP), SQL, Python, Amazon Athena, MongoDB, MongoDB Atlas, Relational Databases, Snowflake, AWS Glue, Data Migration, Amazon RDS, Database Optimization, Database Migration, Data, T-SQL (Transact-SQL), Performance Tuning, MongoDB Compass, Web Scraping, Database Administration (DBA), Excel 365, Technical Writing, Data Warehouse Design, Google BigQuery, Looker, Data Build Tool (dbt), Architecture, Data Integration, Apache Airflow, Amazon QuickSight, Amazon DynamoDB, Pandas, eCommerce, Tableau, AWS IAM, Amazon CloudWatch, AWS Data Pipeline Service, API Integration, SQL Server Integration Services (SSIS), IT Automation, Technical Documentation, REST APIs, Writing & Editing, Documentation, Database Performance, Microsoft Power Apps, Microsoft Power Automate, DAX, Power Query
2022 - 2022

MongoDB Atlas Data Lake Developer

Penny Inc
  • Worked on a cloud-based expense management system. Transformed MongoDB unstructured data into a structured form and pushed a data stream to AWS S3 along with external datasets.
  • Built the data warehouse on AWS Redshift and a presentation dashboard in QuickSight utilizing the AWS stack, including S3, Lambda, Redshift, QuickSight, and AWS Transfer Family.
  • Implemented code scheduling and verification on Airflow, as well as data quality checks and code documentation in GitHub.
Technologies: MongoDB, MongoDB Atlas, Amazon S3 (AWS S3), Data Lakes, Python, Node.js, Relational Database Design, AWS Cloud Architecture, Dashboards, Data Analytics, Amazon Web Services (AWS), AWS Lambda, Cron, Database Architecture, Business Intelligence (BI), Operations, Relational Databases, AWS Glue, Data Migration, Amazon RDS, Database Migration, Data, Microsoft SQL Server, SQL DML, Performance Tuning, SQL, MongoDB Compass, Database Administration (DBA), Architecture, Data Integration, Apache Airflow, Amazon QuickSight, Amazon DynamoDB, API Integration, Writing & Editing, Database Performance
2018 - 2021

Data Engineer

Two Degrees Mobile Limited
  • Built a data lake in AWS Cloud and Snowflake to replace an on-premises Hadoop cluster and integrated it with Tableau and a Netezza data warehouse.
  • Designed and rolled out new data pipelines for big data and an enterprise data warehouse and maintained the existing Hadoop and Hortonworks big data environment and ETL pipelines.
  • Supported enterprise data warehouse processes and operations and delivered ad hoc SQL reports.
  • Integrated with different sources, including Amazon S3, Oracle, IBM Netezza, Microsoft SharePoint, and Microsoft Active Directory (AD).
  • Explored opportunities for new data avenues, such as Snowflake and Anaplan.
Technologies: Amazon Athena, AWS Glue, Snowflake, SQL, Netezza, Oracle, Data Warehousing, Data Lakes, Redshift, Data Pipelines, APIs, REST, Amazon RDS, Query Optimization, Partitioning, Databases, Data Analytics, Amazon Web Services (AWS), AWS Lambda, Cron, Database Architecture, Business Intelligence (BI), Operations, Relational Databases, Data Migration, Database Optimization, Database Migration, Data, T-SQL (Transact-SQL), SQL DML, Performance Tuning, Database Administration (DBA), Amazon Aurora, Data-driven Dashboards, Technical Writing, Data Integration, Microsoft Power Automate
2017 - 2018

Data Consultant

Teradata
  • Designed and implemented ETL jobs and data management processes across different platforms.
  • Extracted insights from data and delivered reports to high-level decision-makers.
  • Automated data warehouse processes using Unified Data Integrator (a DevOps product) as part of a bank's digital transformation.
Technologies: Data Engineering, Data Analysis, Big Data, SQL, Data Warehousing, Data Pipelines, Google BigQuery, Google Data Studio, Database Architecture, PostgreSQL, MySQL, Relational Databases, Database Schema Design, SaaS, B2B, Dashboard Design, Data Analytics, Amazon Web Services (AWS), AWS Lambda, Cron, Business Intelligence (BI), Operations, Teradata, Data Migration, Database Migration, Data, Microsoft SQL Server, SQL DML, Performance Tuning, Data Integration, SQL Server Integration Services (SSIS)
2014 - 2017

Business Intelligence Analyst

Vodafone Group
  • Introduced IBM Infosphere Streams to perform real-time analytics on big data streams.
  • Designed, built, and tested ETL/ELT solutions using dimensional modeling and sound design, performance tuning, and optimization.
  • Implemented and managed small to large-scale projects involving multiple systems, with a focus on performance tuning, optimization, and availability to ensure efficiency in the environment.
Technologies: SQL, Amazon Web Services (AWS), ETL, Big Data, Data Pipelines, Data Visualization, Data Analytics, Business Intelligence (BI), Operations, Relational Databases, Data, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), Microsoft SQL Server, SQL DML, Performance Tuning, CDC

Experience

Data Lake in AWS and Snowflake

Built a data lake in AWS Cloud and Snowflake to replace an on-premises Hadoop cluster and integrate with Tableau and a Netezza data warehouse. I started the project from scratch, assessed providers (AWS, GCP, and Azure), and led a POC to compare processing and pricing. In the end, I implemented the pipelines in AWS Glue and Snowflake while using SAP Data Services to ingest data from Netezza.

National Data Warehouse

Served on a large team of consultants from IBM, Teradata, and Microsoft to build Egypt's first data warehouse. I was actively involved in the following activities:
• Designing and implementing numerous ETL jobs and data management processes across different platforms.
• Sourcing and integrating 50+ different data sources from across the country to build a unified data warehouse.
• Extracting insights from data and delivering reports to high-level decision-makers.

Intesa Sanpaolo Bank Data Platform Revamp

Automated data warehouse processes using Unified Data Integrator as part of the bank's digital transformation. I also developed and upgraded several ETL solutions for the bank. This work was part of a Teradata consulting engagement.

Djezzy Postpaid Stream

Built a new postpaid stream from scratch. This involved modeling and mapping existing data into models and tables, as well as ETL development and implementation, done in parallel with another big data stream on the Hortonworks platform.

Data Pipeline on AWS

https://www.propertyradar.com
I developed a process in Python to ingest, transform, and clean people's information for concise marketing targeting, and another to ingest, transform, and clean property tax information for better market analysis and visualization. Both used Amazon Redshift, S3, Glue, Lambda, and Airflow. I also migrated the people process to the cloud, which reduced cost and improved speed; it now runs on AWS instead of MySQL.

Data Architect

http://thepennyinc.com
I worked on a cloud-based expense management system. I transformed MongoDB unstructured data into a structured form and pushed a data stream to AWS S3 along with external datasets. I also built the data warehouse on Amazon Redshift and a presentation dashboard in QuickSight utilizing the AWS stack, including S3, Lambda, Redshift, QuickSight, and AWS Transfer Family.
I also implemented code scheduling and verification on Airflow, as well as data quality checks and code documentation in GitHub.

Skills

Languages

SQL, Snowflake, Python, T-SQL (Transact-SQL), SQL DML, R, Python 3

Libraries/APIs

REST APIs, Pandas, PySpark, OpenAPI, Salesforce API, Node.js

Tools

Microsoft Power BI, Google Sheets, Microsoft Excel, AWS Glue, Amazon Athena, Tableau, Apache Airflow, Excel 2016, MongoDB Atlas, Cron, Amazon QuickSight, Looker, AWS IAM, Amazon CloudWatch, Microsoft Power Apps, Power Query, Spark SQL, BigQuery, Google Analytics, Amazon SageMaker, Redshift Spectrum

Paradigms

ETL, Database Design, Business Intelligence (BI), Data Science, REST, B2B, Microservices, DevOps, Object-oriented Programming (OOP), Testing, Lambda Architecture

Platforms

Amazon Web Services (AWS), Oracle, Hortonworks Data Platform (HDP), Azure, WordPress, AWS Lambda, Google Cloud Platform (GCP), Apache Kafka, AWS IoT, Databricks, Amazon EC2, Salesforce, MuleSoft

Storage

Databases, Database Architecture, Data Pipelines, PostgreSQL, MongoDB, MySQL, Relational Databases, Database Migration, SQL Server Integration Services (SSIS), Microsoft SQL Server, Database Administration (DBA), Data Integration, Amazon DynamoDB, Database Performance, Netezza, Teradata, Amazon S3 (AWS S3), Oracle DBA, SQL Server DBA, Redshift, Amazon Aurora, Data Lakes, Azure SQL, Google Cloud Storage, NoSQL, SQL Server Analysis Services (SSAS), AWS Data Pipeline Service

Other

Data Engineering, Data Analysis, Data Warehousing, Data Architecture, Data Modeling, Pipelines, Data Analytics, Data Cleansing, Data Warehouse Design, Complex Data Analysis, BI Reporting, Relational Database Design, AWS Cloud Architecture, Query Optimization, Database Schema Design, Dashboards, Operations, Data Migration, Database Optimization, Data, Performance Tuning, MongoDB Compass, Data-driven Dashboards, Technical Writing, Architecture, API Integration, Real Estate, IT Automation, Technical Documentation, Writing & Editing, Documentation, Big Data, MySQL DBA, Teradata DBA, Big Data Architecture, Cloud Infrastructure, Forecasting, Financial Modeling, Data Visualization, Data Governance, Data Reporting, Financial Data Analytics, Data Quality Analysis, Cloud Storage, Infrastructure, Analytics, Reporting, Google BigQuery, APIs, Amazon RDS, Partitioning, SaaS, Dashboard Design, CDC, Excel 365, Data Build Tool (dbt), eCommerce, Excel Macros, Microsoft Power Automate, DAX, Predictive Analytics, Machine Learning, Deep Learning, Informatica, Entity-relationships Model (ERM), Software, Computer Science, Revenue & Expense Projections, GAAP, Directed Acyclic Graphs (DAG), Machine Learning Operations (MLOps), Google Search Console, Google Data Studio, Streaming, Leadership, Amazon Neptune, Web Scraping, Lambda Functions

Frameworks

Apache Spark, Spark, Hadoop, .NET

Education

2018 - 2021

Master of Science Degree in Computer Engineering

Cairo University - Cairo, Egypt

2009 - 2013

Bachelor's Degree in Computer Engineering

Cairo University - Cairo, Egypt

Certifications

AUGUST 2022 - PRESENT

AWS Well-Architected Framework

AWS

MAY 2021 - PRESENT

Data Analysis Professional Nanodegree

Udacity