
Amr Saleh
Data Engineer and Developer
Amr is a data architect/engineer with 10+ years of international experience, particularly in the cloud space. He's been a consultant at Teradata and worked with telecom operators, banks, and government organizations in Europe and the Middle East. Amr's skills include AWS Glue, Athena, CloudFormation, S3, and Snowflake; SQL, HQL, Python, and PySpark; Power BI, Tableau, and QuickSight; and Hadoop and NiFi. Amr has an MSc in data science and teaches data engineering classes to enterprises.
Preferred Environment
SQL, Amazon Web Services (AWS), Snowflake, Microsoft Power BI, Big Data, Databases, Python, Redshift, Google Cloud Platform (GCP), Azure, OpenAPI
The most amazing...
...experience was leading the data architecture, design, and implementation of Lyticshub—from the initial startup until Vodafone became the first customer.
Work Experience
Senior Data Architect and Engineer
Ricoh Corporation - Intelligent Business Platform
- Advised Ricoh IBP on scaling their platform efficiently and assisted with the architecture upgrade.
- Optimized the most used SQL queries and improved the overall performance of the database.
- Built efficient AWS Lambda functions for data ingestion, transformation, reporting, and migration, following industry standards.
Lead Trainer
Sprints
- Trained nine cohorts of professionals to enter the data engineering market.
- Helped the Telecom Egypt technical team increase their data-related capabilities.
- Led the team to design and deliver the curriculum for data engineering.
Senior Data Engineer
PropertyRadar
- Developed a process in Python to ingest, transform, and clean people information for precise marketing targeting. Used Amazon Redshift, Amazon S3, AWS Glue, AWS Lambda, and Apache Airflow.
- Developed a process in Python to ingest, transform, and clean property tax information for better market analysis and visualization. Used Amazon Redshift, Amazon S3, AWS Glue, AWS Lambda, and Apache Airflow.
- Migrated the people process from MySQL to AWS, which reduced cost and improved speed.
Data Lead
Accident Compensation Corporation
- Developed a data model for migrating an old CRM to Salesforce.
- Built and validated data pipelines for data migration and built a data dictionary.
- Built and automated data validation on the new CRM.
Data Engineer
Essentially AI Pvt. Ltd.
- Designed and built a data architecture to ingest and clean 162 TB of stock market data for analysis.
- Built the right data models to cater to stocks changing their names and stocks performing stock splits.
- Automated the ingestion process through API calls to the vendor and used Amazon S3, Amazon Athena, Amazon Redshift, Apache Airflow, and AWS Glue.
Data Principal
Kiwilytics Ltd.
- Built and led a team of software developers and data engineers and trained them with the latest cloud technology skills.
- Built the data architecture for multiple clients and data pipelines on AWS and Azure.
- Suggested architectural changes that saved 40% of the cloud cost for one of our clients.
MongoDB Atlas Data Lake Developer
Penny Inc
- Worked on a cloud-based expense management system. Transformed MongoDB unstructured data into a structured form and pushed a data stream to AWS S3 along with external datasets.
- Built the data warehouse on AWS Redshift and a presentation dashboard in QuickSight utilizing the AWS stack, including S3, Lambda, Redshift, QuickSight, and AWS Transfer Family.
- Implemented code scheduling and verification on Airflow, as well as data quality checks and code documentation in GitHub.
Data Engineer
Two Degrees Mobile Limited
- Built a data lake in AWS Cloud and Snowflake to replace an on-premises Hadoop cluster and integrated it with Tableau and a Netezza data warehouse.
- Designed and rolled out new data pipelines for big data and an enterprise data warehouse and maintained the existing Hadoop and Hortonworks big data environment and ETL pipelines.
- Supported enterprise data warehouse processes and operations and delivered ad hoc SQL reports.
- Integrated with different sources, including Amazon S3, Oracle, IBM Netezza, Microsoft SharePoint, and Microsoft Active Directory (AD).
- Explored opportunities for new data avenues, such as Snowflake and Anaplan.
Data Consultant
Teradata
- Designed and implemented ETL jobs and data management processes across different platforms.
- Extracted insights from data and delivered reports to high-level decision-makers.
- Automated data warehouse processes using Unified Data Integrator (a DevOps product) as part of a bank's digital transformation.
Business Intelligence Analyst
Vodafone Group
- Introduced IBM Infosphere Streams to perform real-time analytics on big data streams.
- Designed, built, and tested ETL/ELT solutions using dimensional modeling and sound design, performance tuning, and optimization.
- Implemented and managed small- to large-scale projects involving multiple systems, with a focus on performance tuning, optimization, and availability to keep the environment efficient.
Experience
Data Lake in AWS and Snowflake
National Data Warehouse
• Designing and implementing a huge number of ETL jobs and data management processes across different platforms.
• Sourcing and integrating 50+ different data sources from across the country to build a unified data warehouse.
• Extracting insights from data and delivering reports to high-level decision-makers.
Intesa Sanpaolo Bank Data Platform Revamp
Djezzy Postpaid Stream
Data Pipeline on AWS
https://www.propertyradar.com
Data Architect
http://thepennyinc.com
I also implemented code scheduling and verification on Airflow, as well as data quality checks and code documentation in GitHub.
Skills
Languages
SQL, Snowflake, Python, T-SQL (Transact-SQL), SQL DML, R, Python 3
Libraries/APIs
REST APIs, Pandas, PySpark, OpenAPI, Salesforce API, Node.js
Tools
Microsoft Power BI, Google Sheets, Microsoft Excel, AWS Glue, Amazon Athena, Tableau, Apache Airflow, Excel 2016, MongoDB Atlas, Cron, Amazon QuickSight, Looker, AWS IAM, Amazon CloudWatch, Microsoft Power Apps, Power Query, Spark SQL, BigQuery, Google Analytics, Amazon SageMaker, Redshift Spectrum
Paradigms
ETL, Database Design, Business Intelligence (BI), Data Science, REST, B2B, Microservices, DevOps, Object-oriented Programming (OOP), Testing, Lambda Architecture
Platforms
Amazon Web Services (AWS), Oracle, Hortonworks Data Platform (HDP), Azure, WordPress, AWS Lambda, Google Cloud Platform (GCP), Apache Kafka, AWS IoT, Databricks, Amazon EC2, Salesforce, MuleSoft
Storage
Databases, Database Architecture, Data Pipelines, PostgreSQL, MongoDB, MySQL, Relational Databases, Database Migration, SQL Server Integration Services (SSIS), Microsoft SQL Server, Database Administration (DBA), Data Integration, Amazon DynamoDB, Database Performance, Netezza, Teradata, Amazon S3 (AWS S3), Oracle DBA, SQL Server DBA, Redshift, Amazon Aurora, Data Lakes, Azure SQL, Google Cloud Storage, NoSQL, SQL Server Analysis Services (SSAS), AWS Data Pipeline Service
Other
Data Engineering, Data Analysis, Data Warehousing, Data Architecture, Data Modeling, Pipelines, Data Analytics, Data Cleansing, Data Warehouse Design, Complex Data Analysis, BI Reporting, Relational Database Design, AWS Cloud Architecture, Query Optimization, Database Schema Design, Dashboards, Operations, Data Migration, Database Optimization, Data, Performance Tuning, MongoDB Compass, Data-driven Dashboards, Technical Writing, Architecture, API Integration, Real Estate, IT Automation, Technical Documentation, Writing & Editing, Documentation, Big Data, MySQL DBA, Teradata DBA, Big Data Architecture, Cloud Infrastructure, Forecasting, Financial Modeling, Data Visualization, Data Governance, Data Reporting, Financial Data Analytics, Data Quality Analysis, Cloud Storage, Infrastructure, Analytics, Reporting, Google BigQuery, APIs, Amazon RDS, Partitioning, SaaS, Dashboard Design, CDC, Excel 365, Data Build Tool (dbt), eCommerce, Excel Macros, Microsoft Power Automate, DAX, Predictive Analytics, Machine Learning, Deep Learning, Informatica, Entity-relationship Model (ERM), Software, Computer Science, Revenue & Expense Projections, GAAP, Directed Acyclic Graphs (DAG), Machine Learning Operations (MLOps), Google Search Console, Google Data Studio, Streaming, Leadership, Amazon Neptune, Web Scraping, Lambda Functions
Frameworks
Apache Spark, Spark, Hadoop, .NET
Education
Master of Science Degree in Computer Engineering
Cairo University - Cairo, Egypt
Bachelor's Degree in Computer Engineering
Cairo University - Cairo, Egypt
Certifications
AWS Well-Architected Framework
AWS
Data Analysis Professional Nanodegree
Udacity