Mohamed Abulazm, Data Engineer and ETL Developer in Linz, Austria

Member since August 23, 2021
Mohamed has a master's degree in data science and has worked as a data engineer and full-stack developer for over seven years. Building cloud-native big data applications is his strongest suit. Mohamed served as the tech lead on multiple projects in the eCommerce industry, including delivering a Google partner SaaS app used by more than 400,000 monthly users. He also demonstrated technical and organizational excellence as a tech lead building military software. He enjoys mentoring developers and presenting technology talks.

Portfolio

  • KM.ON
    Apache Kafka, Helm, Docker, Bitbucket, Continuous Integration (CI)...
  • Smarter Ecommerce GmbH
    Akka Streams, Apache Kafka, BigQuery, Kafka Streams, Kubernetes...
  • Smarter Ecommerce GmbH
    Scala, Spring, Kubernetes, BigQuery, Google Data Studio, Google Cloud...

Location

Linz, Austria

Availability

Part-time

Preferred Environment

JVM, Node.js, Scala, Kubernetes, Google Cloud Platform (GCP), Apache Kafka, ETL, Identity & Access Management (IAM), Software Architecture, Domain-driven Design (DDD)

The most amazing...

...work I've done is designing and implementing a multi-TB data warehouse on GCP, leveraging a data mesh architecture with E2E security to isolate the 450+ tenants.

Employment

  • Kafka System Engineer

    2021 - 2022
    KM.ON
    • Industrialized the installation and DevOps of the Kafka cluster on AWS EKS and Alibaba Cloud ACK. Benchmarked and improved throughput, reliability, and fault tolerance in preparation for the production rollout.
    • Learned to deploy and operate a multi-cloud architecture set up on AWS and Alibaba Cloud and to deliver and deploy software in China to serve Chinese clients.
    • Aligned with the engineering lead and team on the best practices for managing Kafka clusters at scale.
    • Analyzed and implemented GitOps POCs using different technologies, such as Argo CD and Flux.
    Technologies: Apache Kafka, Helm, Docker, Bitbucket, Continuous Integration (CI), Alibaba Cloud, Kubernetes, Git, JVM, Google Docs, Visual Studio Code, Monitoring, Software Design Patterns, Programming, Java, Databases, Networking, Microservices Architecture, API Design, Grafana, Prometheus, Event-driven Architecture, Identity & Access Management (IAM), Kustomize, Software Architecture, Confluence, Cloud Architecture, Realtime, Continuous Deployment, System Design, Cloud Native, Big Data, Data Modeling, SaaS, Slack, Amazon EKS, Amazon Web Services (AWS)
  • Data Engineer

    2020 - 2022
    Smarter Ecommerce GmbH
    • Spearheaded the integration of Kafka Streams into different data processing pipelines. Reduced running costs by 70% and improved throughput by 350%.
    • Worked on a generic cloud-native big data platform and data pipeline framework on top of Kafka, Akka Streams, Kubernetes, and Scala to ingest and transform hundreds of datasets from various sources.
    • Led the design, implementation, and integration of the Google BigQuery data warehouse. Defined and used idiomatic cloud patterns for reading/writing data and building ETL/data pipelines in the data platform.
    • Managed the design and adoption of data mesh architecture and data governance. Oversaw the delivery of four data products and their data pipelines.
    • Oversaw the design and implementation of security and isolation for the 450+ tenants at the compute, storage, and billing levels in Google Cloud.
    • Deployed, integrated, and operated Kafka to handle the communication between the different data pipelines. Designed all the operations processes, disaster recovery, and automation for the cluster deployed on Kubernetes.
    • Spearheaded the definition and implementation of the domain data model. Aligned the requirements and constraints of at least five teams that use the storage-agnostic model in many data pipelines.
    • Led the Google Cloud infrastructure automation setup handling cloud projects, Kubernetes clusters, cloud functions, storage (GCS, BigQuery, Kafka), and different security topics like Identity and Access Management (IAM) and workload identity.
    Technologies: Akka Streams, Apache Kafka, BigQuery, Kafka Streams, Kubernetes, Google Cloud Functions, Serverless, Event-driven Architecture, Reactive Streams, Data Warehouse Design, Data Wrangling, Data Warehousing, Scala, Identity & Access Management (IAM), Terraform, Kustomize, Google Cloud Storage, Prometheus, Grafana, Cloudflow, Google Cloud Platform (GCP), Argo CD, TeamCity, Domain-driven Design (DDD), Software Architecture, Data Pipelines, Continuous Integration (CI), Git, REST APIs, JVM, IntelliJ, Google Cloud, Google Docs, Docker, Visual Studio Code, Data Science, Google Data Studio, Monitoring, Software Design Patterns, Programming, Java, Databases, Networking, Microservices Architecture, API Design, API Development, Helm, Confluence, Cloud Architecture, Google BigQuery, Google Secret Manager, Data Mesh, Cost Cutting, Realtime, Data Processing, ETL, ScalaTest, Continuous Deployment, Google Kubernetes Engine (GKE), System Design, Cloud Native, Big Data, Data Modeling, Test-driven Development (TDD), SaaS, SQL, Slack, Python
  • Software Engineer

    2018 - 2020
    Smarter Ecommerce GmbH
    • Worked on a SaaS comparison shopping service that is used by more than 400,000 monthly users and is part of the Google CSS Program.
    • Led the research, design, and implementation of a clustering algorithm and data pipeline that processes more than 100 million products daily from different data sources like Google Merchant Center and Amazon Marketplace.
    • Optimized client performance for a smooth, frictionless user experience, achieving a score of 92/100 on Google PageSpeed Insights.
    • Implemented server-side rendering for the Angular client, generating and serving the sitemap. Enabled Googlebot to crawl 10+ million pages.
    • Implemented a search engine and indexing data pipeline using Elasticsearch and reactive programming. Improved search accuracy and latency by 20%.
    Technologies: Scala, Spring, Kubernetes, BigQuery, Google Data Studio, Google Cloud, Angular, Elasticsearch, Monitoring, Node.js, GraphQL, Software Design Patterns, RabbitMQ, HTML, CSS, SCSS, Grafana, Prometheus, TypeScript, Express.js, Bootstrap, API Design, API Development, Docker, MySQL, Google Cloud SQL, Google Cloud Storage, REST, RxJS, RxJava, Server-side Rendering, Data Pipelines, NgRx, Sass, Google Cloud Platform (GCP), Continuous Integration (CI), Git, JavaScript, REST APIs, JVM, IntelliJ, Google Docs, Visual Studio Code, Data Science, Programming, Java, Databases, Networking, Spring Boot, Microservices Architecture, Serverless, Event-driven Architecture, Reactive Streams, Data Wrangling, Identity & Access Management (IAM), Terraform, Kustomize, Argo CD, TeamCity, Domain-driven Design (DDD), Software Architecture, Helm, Confluence, Cloud Architecture, Google BigQuery, Cost Cutting, Realtime, Data Processing, ETL, Continuous Deployment, Google Kubernetes Engine (GKE), System Design, Cloud Native, Big Data, Data Modeling, Test-driven Development (TDD), SaaS, SQL, Slack, Python
  • Co-founder

    2017 - 2017
    Dirty Paws NGO
    • Created a web app (front and back end) where users can report different animals (stray, lost, or missing) to get help from rescuers.
    • Implemented a notification feature for the nearby rescuers using geolocation.
    • Implemented a feature-complete authentication system.
    • Added table and map listings of the reported animals.
    Technologies: Node.js, Express.js, Angular, MongoDB, Bootstrap, Sass, REST APIs, Git, HTML, CSS, SCSS, JavaScript, Docker, Visual Studio Code, Programming, Databases, Networking, Microservices Architecture, API Design, API Development, Domain-driven Design (DDD), Software Architecture, Bitbucket, System Design, Data Modeling, SQL
  • Tech Lead

    2016 - 2017
    Egyptian Ministry Of Defence
    • Led the design and implementation of a CMS with real-time document administration features and a notification system.
    • Oversaw the integration with the existing legacy system. Minimized operating cost and processing time.
    • Deployed and operated different virtual machines, web applications, databases, file servers, and data back-ups.
    Technologies: Ruby on Rails 5, .NET, Bootstrap, Sass, Ruby on Rails (RoR), REST APIs, Git, HTML, CSS, SCSS, JavaScript, JVM, Node.js, Docker, Visual Studio Code, Programming, Java, Databases, Networking, Microservices Architecture, API Design, API Development, Domain-driven Design (DDD), Software Architecture, System Design, Data Modeling, SQL

Experience

  • Cloud Native Multitenant Data Warehouse

    Served as the lead cloud architect, lead security expert, and lead back-end developer working on the system design, architecture, and integration of Google BigQuery to build all the company's market offerings in the eCommerce industry.

    • Aligned with top management and product owners on the requirements and use cases for the data warehouse and the high-level architecture. Led the research on the adoption of data mesh architecture.
    • Prepared all the ADRs, designs, and architecture sketches to choose the best-fit solution given the different trade-offs.
    • Implemented the back-end services and data pipelines that continuously load the data into the warehouse using BigQuery APIs, Kubernetes, Scala, and Cloudflow.
    • Created and implemented the solution for tenant isolation, securing the data and computation of the different tenants along with costs and blast-radius isolation.
    • Designed the organization and structure of the Google Cloud projects, selected the cloud resources and products to use, and reviewed the solution with Google engineers.
    • Implemented the automation, CI, and CD for provisioning all the needed cloud resources and components supporting the overall architecture using Kubernetes, operators, and Argo CD.

  • Real-time Data Processing Pipeline

    Served as the lead architect, lead back-end developer, and lead DevOps specialist on a Kafka Streams eCommerce application for joining and aggregating data to deliver rich data in real-time to be used by production systems.

    • Started with a deep dive into Kafka Streams, reading a 250-page book and many other resources to understand the ecosystem and become an expert in data streaming.
    • Implemented three POCs to analyze and guide the adoption in production.
    • Building on this research, designed the final architecture, the flow of data, the components needed, and blast-radius isolation.
    • Implemented the Kafka Streams pipeline using the Scala DSL and deployed it to Kubernetes with CI/CD. The pipeline processes 20 GB of data daily from multiple datasets, performing different aggregations and joins, and delivers the results to Google Cloud Storage and BigQuery.

    The pipeline reduced costs by 77% and improved throughput by 350% compared to the previous approach, making it a very successful project. As a result, I was promoted to senior software engineer.

  • Comparison Shopping Service
    http://smec.shopping

    Worked as a full-stack developer involved in developing all features of a SaaS application, using Spring, Spring Boot, and Scala for the back end and Angular for the front end.

    The main clustering algorithm groups products for display to the end user. The work also involved implementing a search engine.
    I also developed the Angular web client and did extensive work on SEO, including server-side rendering.

    I was involved in planning, designing the different epics, and setting the direction for the project.

Skills

  • Languages

    Scala, GraphQL, Java, TypeScript, JavaScript, HTML, CSS, SQL, Python
  • Frameworks

    Spring, Angular, Spring Boot, Ruby on Rails 5
  • Libraries/APIs

    Akka Streams, API Development, RxJS, RxJava, REST APIs, Node.js
  • Tools

    IntelliJ, Kafka Streams, BigQuery, Git, Grafana, Bitbucket, Google Kubernetes Engine (GKE), ScalaTest, RabbitMQ, Terraform, TeamCity, Helm, Confluence, Amazon EKS
  • Paradigms

    Microservices Architecture, Continuous Integration (CI), REST, Event-driven Architecture, Continuous Deployment, ETL, Test-driven Development (TDD), Data Science
  • Platforms

    JVM, Kubernetes, Docker, Apache Kafka, Google Cloud Platform (GCP), Cloud Native, Amazon Web Services (AWS), AWS Lambda
  • Storage

    Google Cloud, Databases, Data Pipelines, Google Cloud Storage, Elasticsearch, MySQL, Google Cloud SQL, Alibaba Cloud
  • Other

    Google Data Studio, Networking, API Design, Prometheus, Google Cloud Functions, Serverless, Reactive Streams, Identity & Access Management (IAM), Kustomize, Cloudflow, Argo CD, Domain-driven Design (DDD), Software Architecture, Cloud Architecture, Google BigQuery, Google Secret Manager, Data Modeling, Big Data, System Design, Data Processing, SaaS, Data Warehouse Design, Data Warehousing, Data Products, Data Mesh, Cloud

Education

  • Master's Degree in Data Science
    2017 - 2018
    Johannes Kepler University - Linz, Austria
  • Bachelor's Degree in Computer Science Engineering
    2011 - 2015
    German University in Cairo - Cairo, Egypt

Certifications

  • AWS Certified Developer
    December 2022 - December 2025
    AWS
