Amr Saleh
Verified Expert in Engineering
Data Engineer and Developer
Amr is an expert in data architecture and engineering with over 12 years of global experience in cloud-centric data solutions. He has collaborated with top-tier entities worldwide in the fintech, healthcare, banking, and telecom industries. He is skilled in the AWS stack, Snowflake, NoSQL databases, SQL, Python, PySpark, Power BI, QuickSight, Hadoop, NiFi, PowerApps, MongoDB, DynamoDB, and more. He holds an MSc in data science and offers tailored data engineering courses, empowering global enterprises.
Preferred Environment
SQL, Amazon Web Services (AWS), Snowflake, Microsoft Power BI, Big Data, Databases, Python, Azure, Amazon DynamoDB, MongoDB, Monte Carlo, Denodo
The most amazing...
...thing I've done was join Ricoh in the US as a senior data architect and improve their data pipelines, cutting the resources and time they needed by 92%.
Work Experience
Senior Database & CRM Developer
GrayHawk Health Inc.
- Built a CRM for the company on PowerApps to automate manual processes and enable the sales and operations teams to run smoothly on a robust CRM implementation.
- Provided 24/7 assistance to staff on CRM and data-related issues and continuously improved the platform by building new forms and automations.
- Advised the business on building its first data warehouse.
- Assisted in choosing the right vendors to build the company's CRM and DWH, based on cost and SOW, and assisted in writing the SOW for the CRM and DWH vendors.
- Managed the DWH and CRM vendors during the planning and implementation phases.
Senior DW & BI Developer
Modus Operadi
- Automated all CEO and executive team reports to minimize reliance on manual report running and eliminate human error, scheduling the reports to run at specific times each day to fit the executive team's requirements.
- Built the data warehouse from scratch, starting from the architecture.
- Architected pipelines to flatten unstructured MongoDB data into structured data in the Redshift data warehouse as part of their new data warehouse.
Senior Data Architect and Engineer
Ricoh Corporation - Intelligent Business Platform
- Automated Centene reports and data pipelines, saving more than 90% of run time and resources and eliminating human errors.
- Worked on building the microservices platform and completed over 2.5x more tasks than the other developers assigned to the same project.
- Proposed critical database schema modifications that were implemented to cover all UX scenarios.
- Advised Ricoh IBP on scaling their platform efficiently and assisted with the architecture upgrade from a monolith to microservices.
- Automated their manual reports to run through AWS pipelines and automatically send Excel exports to clients' email addresses, saving considerable time on manual reporting.
- Optimized the most used SQL queries and improved the overall performance of the database.
- Built efficient Lambda functions for data ingestion, transformation, reporting, and migration following industry standards.
- Developed back-end code for microservices architecture by building Lambda functions, APIs, and stored procedures on MySQL and MS SQL databases.
Data Expert
Broad Solutions
- Took over the system with no documentation or handover, then worked out and documented the whole architecture, including all running processes and their schedules.
- Architected system backup for processes, schedules, and data for disaster recovery.
- Offered 24/7 swift support for data and pipeline issues in Snowflake and APIs.
DT Technology
Lyticshub
- Put together and led a team of software developers and data engineers and trained them in the latest cloud technologies.
- Built the data architecture for multiple clients and data pipelines on AWS and Azure.
- Suggested architectural changes that saved 40% of the cloud cost for one of our clients.
Senior Data Engineer
PropertyRadar
- Developed a process in Python to ingest, transform, and clean people information for precise marketing targeting. Used Amazon Redshift, Amazon S3, AWS Glue, AWS Lambda, and Apache Airflow.
- Developed a process in Python to ingest, transform, and clean property tax information for better market analysis and visualization. Used Amazon Redshift, Amazon S3, AWS Glue, AWS Lambda, and Apache Airflow.
- Migrated the people process to the cloud, which reduced cost and improved speed; it now runs on AWS instead of MySQL.
Lead Trainer
Sprints
- Trained over ten cohorts of professionals to enter the data engineering market.
- Helped the Telecom Egypt technical team increase their data-related capabilities.
- Led the team to design and deliver the curriculum for data engineering.
Data Lead
Accident Compensation Corporation
- Developed a data model for migrating an old CRM to Salesforce.
- Built and validated data pipelines for data migration and built a data dictionary.
- Built and automated data validation on the new CRM.
Data Engineer
Essentially AI Pvt. Ltd.
- Designed and built a data architecture to ingest and clean 162 TB of stock market data for analysis.
- Built the right data models to cater to stocks changing their names and stocks performing stock splits.
- Automated the ingestion process through API calls to the vendor and used Amazon S3, Amazon Athena, Amazon Redshift, Apache Airflow, and AWS Glue.
MongoDB Atlas Data Lake Developer
Penny Inc
- Worked on a cloud-based expense management system. Transformed MongoDB unstructured data into a structured form and pushed a data stream to AWS S3 along with external datasets.
- Built the data warehouse on AWS Redshift and a presentation dashboard in QuickSight utilizing the AWS stack, including S3, Lambda, Redshift, QuickSight, and AWS Transfer Family.
- Implemented code scheduling and verification on Airflow, as well as data quality checks and code documentation in GitHub.
Data Engineer
Two Degrees Mobile Limited
- Built a data lake in AWS Cloud and Snowflake to replace an on-premises Hadoop cluster and integrated it with Tableau and a Netezza data warehouse.
- Designed and rolled out new data pipelines for big data and an enterprise data warehouse and maintained the existing Hadoop and Hortonworks big data environment and ETL pipelines.
- Supported enterprise data warehouse processes and operations and delivered ad hoc SQL reports.
- Integrated with different sources, including Amazon S3, Oracle, IBM Netezza, Microsoft SharePoint, and Microsoft Active Directory (AD).
- Explored opportunities for new data avenues, such as Snowflake and Anaplan.
Data Consultant
Teradata
- Designed and implemented ETL jobs and data management processes across different platforms.
- Extracted insights from data and delivered reports to high-level decision-makers.
- Automated data warehouse processes using Unified Data Integrator (a DevOps product) as part of a bank's digital transformation.
Business Intelligence Analyst
Vodafone Group
- Introduced IBM Infosphere Streams to perform real-time analytics on big data streams.
- Designed, built, and tested ETL/ELT solutions using dimensional modeling and sound design, performance tuning, and optimization.
- Implemented and managed small- to large-scale projects involving multiple systems, with a focus on performance tuning, optimization, and availability to ensure efficiency in the environment.
Experience
Data Lake in AWS and Snowflake
National Data Warehouse
- Designed and implemented a large number of ETL jobs and data management processes across different platforms.
- Sourced and integrated 50+ different data sources from across the country to build a unified data warehouse.
- Extracted insights from data and delivered reports to high-level decision-makers.
Intesa Sanpaolo Bank Data Platform Revamp
Djezzy Postpaid Stream
Data Pipeline on AWS
https://www.propertyradar.com
Data Architect
http://thepennyinc.com
I also implemented code scheduling and verification on Airflow, as well as data quality checks and code documentation in GitHub.
Skillset
Languages
SQL, Snowflake, Python, T-SQL (Transact-SQL), SQL DML, Scala, Java, JavaScript, Visual Basic for Applications (VBA), Excel VBA, C#, R, Python 3, Go
Frameworks
Spark, Apache Spark, Hadoop, ADF, .NET
Libraries/APIs
Node.js, REST APIs, PostgREST, NumPy, Pandas, PySpark, OpenAPI, Salesforce API
Tools
AWS Glue, Microsoft Power BI, Google Sheets, Microsoft Excel, Google Analytics, Amazon QuickSight, IBM Cognos, Apache Maven, Amazon Elastic MapReduce (EMR), AWS CloudFormation, Amazon Virtual Private Cloud (VPC), ELK (Elastic Stack), Jira, GitHub, AWS CLI, Microsoft Access, AWS Step Functions, Apache Beam, Talend ETL, Microsoft Report Builder, Informatica ETL, Informatica PowerCenter, Amazon Athena, Tableau, Spark SQL, Apache Airflow, BigQuery, Excel 2016, MongoDB Atlas, Cron, Amazon Redshift Spectrum, Looker, AWS IAM, Amazon CloudWatch, Microsoft Power Apps, Power Query, Bloomberg, Domo, Kibana, Microsoft Flow, Auth0, Superset, Amazon SageMaker, Terraform, Wix
Paradigms
ETL, Database Design, Business Intelligence (BI), Data Science, MapReduce, Agile, DevOps, REST, B2B, Microservices, Object-oriented Programming (OOP), Testing, Lambda Architecture
Platforms
Amazon Web Services (AWS), Azure, Databricks, AWS Lambda, Linux, AIX, Windows, Kubernetes, Apache Flink, Oracle, Hortonworks Data Platform (HDP), Google Cloud Platform (GCP), Apache Kafka, WordPress, MuleSoft, Microsoft Power Automate, Docker, Azure Synapse, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW), Denodo, AWS IoT, Amazon EC2, Salesforce, Blockchain, HubSpot, Microsoft Fabric
Storage
Teradata, Databases, Oracle DBA, Database Architecture, Data Pipelines, PostgreSQL, NoSQL, MongoDB, MySQL, Relational Databases, Database Migration, SQL Server Integration Services (SSIS), Microsoft SQL Server, Database Administration (DBA), Data Integration, Amazon DynamoDB, Database Performance, MariaDB, Database Replication, IBM Db2, SQL Stored Procedures, Apache Hive, Distributed Databases, SQL Server 2012, Azure Active Directory, Database Caching, Netezza, Amazon S3 (AWS S3), SQL Server DBA, Redshift, Amazon Aurora, Google Bigtable, Azure Cosmos DB, Oracle PL/SQL, Google Cloud, Neo4j, Graph Databases, Data Lakes, Azure SQL, Google Cloud Storage, SQL Server Analysis Services (SSAS), AWS Data Pipeline Service
Industry Expertise
Retail & Wholesale
Other
Data Engineering, Data Analysis, Data Warehousing, Big Data, MySQL DBA, Data Architecture, Cloud Infrastructure, Data Modeling, Pipelines, Data Analytics, Data Cleansing, Data Warehouse Design, Complex Data Analysis, BI Reporting, Relational Database Design, AWS Cloud Architecture, Query Optimization, Database Schema Design, Dashboards, Operations, Data Migration, Database Optimization, Data, Performance Tuning, MongoDB Compass, Web Scraping, Data-driven Dashboards, Lambda Functions, Technical Writing, Architecture, API Integration, Real Estate, IT Automation, Technical Documentation, Writing & Editing, Documentation, Data Recovery, Exploratory Data Analysis, System Administration, High Availability Disaster Recovery (HADR), Back-end Development, Azure Data Factory, Visualization, Data Scientist, Azure Data Lake, Cloud Platforms, Distributed Systems, Systems Monitoring, Entity Relationships, Microsoft Azure, Data Feeds, Data Extraction, Delta Lake, Cloud, Key Performance Indicators (KPIs), Data Management, SSH, Virtual Machines, Azure Virtual Machines, Web Dashboards, Fivetran, Business Requirements, Data Transformation, Real-time Data, Data Scraping, Scraping, Data Quality, Property Management, Property Management System Integrations, Data Mining, English, Sharding, SAP, CSV Export, Oracle ERP Cloud, Data Structures, Cloud Migration, Data Strategy, ETL Tools, Consulting, Advisory, Data Processing, Large-scale Projects, Financial Services, Technical Leadership, Teamwork, ELT, Full-stack, Data-level Security, Amazon QuickSight, Predictive Analytics, Machine Learning, Deep Learning, Informatica, Teradata DBA, Big Data Architecture, Forecasting, Financial Modeling, Data Visualization, Data Governance, Data Reporting, Financial Data Analytics, Data Quality Analysis, Cloud Storage, Infrastructure, Analytics, Reporting, Machine Learning Operations (MLOps), Google BigQuery, Google Data Studio, APIs, Amazon RDS, Partitioning, SaaS, Dashboard Design, CDC, Excel 365, Data Build Tool (dbt), eCommerce, Excel Macros, DAX, IBM Tivoli Storage Manager, Consumer Packaged Goods (CPG), Azure Databricks, Geospatial Data, CRM APIs, RESTful Microservices, OCR, MacPractice, SAP CRM, Healthcare IT, CI/CD Pipelines, Artificial Intelligence (AI), Web3, Metabase, Back-end, Oracle EBS, Oracle Financials Cloud, Oracle Fusion Applications, Web Analytics, Clickstream, Social Media Web Traffic, Advertising Technology (Adtech), Digital Marketing, Entity-relationships Model (ERM), Software, Computer Science, Revenue & Expense Projections, GAAP, Directed Acyclic Graphs (DAG), Google Search Console, Streaming, Leadership, Amazon Neptune, NetSuite, Monte Carlo
Education
Master of Science Degree in Computer Engineering
Cairo University - Cairo, Egypt
Bachelor's Degree in Computer Engineering
Cairo University - Cairo, Egypt
Certifications
AWS Well-Architected Framework
AWS
Data Engineering Nanodegree
Udacity
Data Analysis Professional Nanodegree
Udacity