Mohamed Abulazm, Developer in Linz, Austria

Mohamed Abulazm

Verified Expert in Engineering

Data Engineer and ETL Developer

Location
Linz, Austria
Toptal Member Since
August 23, 2021

Mohamed has a master's degree in data science and has worked as a data engineer and full-stack developer for over seven years. Building cloud-native big data applications is his strongest suit. Mohamed has served as the tech lead on multiple projects in the eCommerce industry, including delivering a Google partner SaaS app used by more than 400,000 monthly users. He also demonstrated technical and organizational excellence as a tech lead building military software. He enjoys mentoring developers and presenting technology talks.

Portfolio

Tookitaki Holding Pte Ltd
Scala, Spark, REST APIs, Relational Databases, HBase, HDFS, MariaDB, MySQL...
KM.ON
Apache Kafka, Helm, Docker, Bitbucket, Continuous Integration (CI)...
Smarter Ecommerce GmbH
Akka Streams, Apache Kafka, BigQuery, Kafka Streams, Kubernetes...

Availability

Part-time

Preferred Environment

JVM, Node.js, Scala, Kubernetes, Google Cloud Platform (GCP), Apache Kafka, ETL, Identity & Access Management (IAM), Software Architecture, Domain-driven Design (DDD)

The most amazing...

...work I've done is designing and implementing a multi-TB data warehouse on GCP, leveraging a data mesh architecture with end-to-end security to isolate the 450+ tenants.

Work Experience

Scala Back-end Developer

2022 - 2023
Tookitaki Holding Pte Ltd
  • Led the design and implementation of a real-time financial transaction screening system and scaled it to handle 200 requests per second.
  • Implemented back-end features, APIs, and data pipelines for the company's anti-financial crime solution, such as ETL pipelines, real-time notifications, and CRUD operations for other financial entities.
  • Led the introduction and improvement of non-functional practices, such as unit testing, code refactoring, functional programming, immutability, and event-driven architecture, as well as Scala training.
Technologies: Scala, Spark, REST APIs, Relational Databases, HBase, HDFS, MariaDB, MySQL, API Design, API Development, Real-time Systems, Fintech, Apache Kafka, Amazon Web Services (AWS), Docker, Amazon EC2, Elasticsearch, Spark ML

Kafka System Engineer

2021 - 2022
KM.ON
  • Industrialized the installation and DevOps of the Kafka cluster on AWS EKS and Alibaba Cloud ACK. Benchmarked and improved throughput, reliability, and fault tolerance in preparation for the production rollout.
  • Learned to deploy and operate a multi-cloud architecture set up on AWS and Alibaba Cloud and to deliver and deploy software in China to serve Chinese clients.
  • Aligned with the engineering lead and team on the best practices for managing Kafka clusters at scale.
  • Analyzed and implemented GitOps POCs using different technologies, such as ArgoCD or Flux.
Technologies: Apache Kafka, Helm, Docker, Bitbucket, Continuous Integration (CI), Alibaba Cloud, Kubernetes, Git, JVM, Java, Databases, Networking, Microservices Architecture, API Design, Grafana, Prometheus, Event-driven Architecture, Identity & Access Management (IAM), Kustomize, Software Architecture, Confluence, Cloud Architecture, Continuous Deployment, System Design, Cloud Native, Big Data, Data Modeling, SaaS, Amazon EKS, Amazon Web Services (AWS)

Data Engineer

2020 - 2022
Smarter Ecommerce GmbH
  • Spearheaded the integration of Kafka Streams into different data processing pipelines. Reduced running costs by 70% and improved throughput by 350%.
  • Worked on a generic cloud-native big data platform and data pipeline framework on top of Kafka, Akka Streams, Kubernetes, and Scala to ingest and transform hundreds of datasets from various sources.
  • Led the design, implementation, and integration of the data warehouse on Google Cloud BigQuery. Defined and used idiomatic cloud patterns for reading and writing data and for building ETL and data pipelines in the data platform.
  • Managed the design and adoption of data mesh architecture and data governance. Oversaw the delivery of four data products and their data pipelines.
  • Oversaw the design and implementation of securing and isolating the 450+ tenants on the compute, storage, and billing levels in Google Cloud.
  • Deployed, integrated, and operated Kafka to handle the communication between the different data pipelines. Designed all the operations processes, disaster recovery, and automation for the cluster deployed on Kubernetes.
  • Spearheaded the definition and implementation of the domain data model. Aligned the requirements and constraints of at least five teams that use the storage-agnostic model in many data pipelines.
  • Led the Google Cloud infrastructure automation setup handling cloud projects, Kubernetes clusters, cloud functions, storage (GCS, BigQuery, Kafka), and different security topics like Identity and Access Management (IAM) and workload identity.
Technologies: Akka Streams, Apache Kafka, BigQuery, Kafka Streams, Kubernetes, Google Cloud Functions, Serverless, Event-driven Architecture, Reactive Streams, Data Warehouse Design, Data Warehousing, Scala, Identity & Access Management (IAM), Terraform, Kustomize, Google Cloud Storage, Prometheus, Grafana, Cloudflow, Google Cloud Platform (GCP), Argo CD, TeamCity, Domain-driven Design (DDD), Software Architecture, Data Pipelines, Continuous Integration (CI), Git, REST APIs, JVM, IntelliJ IDEA, Google Cloud, Docker, Data Science, Google Data Studio, Java, Databases, Networking, Microservices Architecture, API Design, API Development, Helm, Confluence, Cloud Architecture, Google BigQuery, GSM, Data Mesh, Data Processing, ETL, ScalaTest, Continuous Deployment, Google Kubernetes Engine (GKE), System Design, Cloud Native, Big Data, Data Modeling, Test-driven Development (TDD), SaaS, SQL, Python, Relational Databases

Software Engineer

2018 - 2020
Smarter Ecommerce GmbH
  • Worked on a SaaS comparison shopping service that is part of the Google CSS Program and is used by more than 400,000 monthly users.
  • Led the research, design, and implementation of a clustering algorithm and data pipeline that processes more than 100 million products daily from different data sources like Google Merchant Center and Amazon Marketplace.
  • Optimized client performance for a smooth, frictionless user experience, achieving a score of 92/100 on Google PageSpeed Insights.
  • Implemented server-side rendering for the Angular client, generating and serving the sitemap. Enabled Googlebot to crawl 10+ million pages.
  • Implemented a search engine and indexing data pipeline using Elasticsearch and reactive programming. Improved search accuracy and latency by 20%.
Technologies: Scala, Spring, Kubernetes, BigQuery, Google Data Studio, Google Cloud, Angular, Elasticsearch, Node.js, GraphQL, RabbitMQ, HTML, CSS, Grafana, Prometheus, TypeScript, API Design, API Development, Docker, MySQL, Google Cloud SQL, Google Cloud Storage, REST, RxJS, RxJava, Data Pipelines, Google Cloud Platform (GCP), Continuous Integration (CI), Git, JavaScript, REST APIs, JVM, IntelliJ IDEA, Data Science, Java, Databases, Networking, Spring Boot, Microservices Architecture, Serverless, Event-driven Architecture, Reactive Streams, Identity & Access Management (IAM), Terraform, Kustomize, Argo CD, TeamCity, Domain-driven Design (DDD), Software Architecture, Helm, Confluence, Cloud Architecture, Google BigQuery, Data Processing, ETL, Continuous Deployment, Google Kubernetes Engine (GKE), System Design, Cloud Native, Big Data, Data Modeling, Test-driven Development (TDD), SaaS, SQL, Python, Relational Databases

Co-founder

2017 - 2017
Dirty Paws NGO
  • Created a web app (front and back end) where users can report different animals (stray, lost, or missing) to get help from rescuers.
  • Implemented a notification feature for the nearby rescuers using geolocation.
  • Implemented a feature-complete authentication system.
  • Implemented table and map views for listing the reported animals.
Technologies: Node.js, Angular, REST APIs, Git, HTML, CSS, JavaScript, Docker, Databases, Networking, Microservices Architecture, API Design, API Development, Domain-driven Design (DDD), Software Architecture, Bitbucket, System Design, Data Modeling, SQL

Tech Lead

2016 - 2017
Egyptian Ministry Of Defence
  • Led the design and implementation of a CMS with real-time document administration features and a notification system.
  • Oversaw the integration with the existing legacy system. Minimized operating cost and processing time.
  • Deployed and operated different virtual machines, web applications, databases, file servers, and data back-ups.
Technologies: Ruby on Rails 5, REST APIs, Git, HTML, CSS, JavaScript, JVM, Node.js, Docker, Java, Databases, Networking, Microservices Architecture, API Design, API Development, Domain-driven Design (DDD), Software Architecture, System Design, Data Modeling, SQL, Relational Databases

Cloud Native Multitenant Data Warehouse

Served as the lead cloud architect, lead security expert, and lead back-end developer working on the system design, architecture, and integration of Google BigQuery to build all the company's market offerings in the eCommerce industry.

• Aligned with top management and product owners on the requirements and use cases for the data warehouse and the high-level architecture. Led the research on the adoption of data mesh architecture.
• Prepared all the ADRs, design, and architecture sketches to choose the best fit solution given the different trade-offs.
• Implemented the back-end services and data pipelines that continuously load the data into the warehouse using BigQuery APIs, Kubernetes, Scala, and Cloudflow.
• Created and implemented the solution for tenant isolation, securing the data and computation of the different tenants along with costs and blast-radius isolation.
• Designed the organization and structure of the Google Cloud projects and selected the cloud resources and products to use. Reviewed the solution with Google engineers.
• Implemented the automation, CI, and CD for provisioning all the needed cloud resources and components supporting the overall architecture using Kubernetes, operators, and ArgoCD.

Real-time Data Processing Pipeline

Served as the lead architect, lead back-end developer, and lead DevOps specialist on a Kafka Streams eCommerce application for joining and aggregating data to deliver rich data in real-time to be used by production systems.

• Started with a deep dive into Kafka Streams, reading a 250-page book and many other resources to understand the ecosystem and build expertise in data streaming.
• Implemented three POCs to analyze and guide the adoption in production.
• Designed the final architecture, data flow, required components, and blast-radius isolation based on the earlier research.
• Implemented the Kafka Streams pipeline using the Scala DSL and deployed it to Kubernetes with CI/CD. The pipeline processes 20 GB of data daily from multiple datasets, performing different aggregations and joins, and delivers the results to Google Cloud Storage and BigQuery.

The pipeline reduced costs by 77% and improved throughput by 350% compared to the previous approach. As a result of the project's success, I was promoted to senior software engineer.

Comparison Shopping Service

http://smec.shopping
Worked as a full-stack developer involved in all features of a SaaS application, using Spring, Spring Boot, and Scala for the back end and Angular for the front end.

The main clustering algorithm groups products for display to the end user. The work also involved implementing a search engine.
I also developed the Angular web client and did extensive work on SEO, including server-side rendering.

I was involved in planning, designing the different epics, and setting the direction for the project.

Languages

Scala, GraphQL, Java, TypeScript, JavaScript, HTML, CSS, SQL, Python

Frameworks

Spring, Angular, Spring Boot, Ruby on Rails 5, Spark, Hadoop

Libraries/APIs

Akka Streams, API Development, RxJS, RxJava, REST APIs, Node.js, Spark ML

Tools

IntelliJ IDEA, Kafka Streams, BigQuery, Git, Grafana, Cloudflow, Bitbucket, Google Kubernetes Engine (GKE), ScalaTest, RabbitMQ, Terraform, TeamCity, Helm, Confluence, Amazon EKS, Google Cloud Dataproc, Cloud Dataflow, Apache Beam, AutoML, Apache Airflow, Google Cloud Composer

Paradigms

Microservices Architecture, Continuous Integration (CI), REST, Event-driven Architecture, Continuous Deployment, ETL, Test-driven Development (TDD), Real-time Systems, Data Science

Platforms

JVM, Kubernetes, Docker, Apache Kafka, Google Cloud Platform (GCP), Cloud Native, Amazon Web Services (AWS), Amazon EC2, AWS Lambda, Vertex AI

Storage

Google Cloud, Databases, Data Pipelines, Google Cloud Storage, Relational Databases, MariaDB, Elasticsearch, MySQL, Google Cloud SQL, Alibaba Cloud, HBase, Cloud Firestore, HDFS, BigTable, Google Cloud Spanner

Other

Google Data Studio, Networking, API Design, Prometheus, Google Cloud Functions, Serverless, Reactive Streams, Identity & Access Management (IAM), Kustomize, Argo CD, Domain-driven Design (DDD), Software Architecture, Cloud Architecture, Google BigQuery, GSM, Data Modeling, Big Data, System Design, Data Processing, SaaS, Cloud, Stream Processing, Data Warehouse Design, Data Warehousing, Data Products, Data Mesh, Google Pub/Sub, Fintech

Education

2017 - 2018

Master's Degree in Data Science

Johannes Kepler University - Linz, Austria

2011 - 2015

Bachelor's Degree in Computer Science Engineering

German University in Cairo - Cairo, Egypt

Certifications

DECEMBER 2022 - FEBRUARY 2025

Google Cloud Certified - Professional Data Engineer

Google

DECEMBER 2022 - DECEMBER 2025

AWS Certified Developer

AWS
