Abdul Samad, Developer in Lahore, Pakistan

Abdul Samad

Verified Expert in Engineering

Data Engineer and Developer

Location
Lahore, Pakistan
Toptal Member Since
October 4, 2022

Abdul is a seasoned software and data engineer who has helped Fortune 500 companies harness the power of their data by building robust pipelines for ingesting and processing vast amounts of data. Skilled in several technologies, programming languages, and frameworks, he is proficient with open source software (OSS) and Google Cloud services. Abdul believes that no matter the company, data is indispensable to business, and he is committed to enhancing its power and preserving its integrity.

Portfolio

Confiz
Java, Scala, Apache Spark, Apache Storm, Apache Kafka, Kafka Streams...
Pace LLC
Python, Scala, Google Cloud Platform (GCP), Pandas, Apache Beam, Apache Spark...

Experience

Availability

Part-time

Preferred Environment

Linux

The most amazing...

...project I've worked on is a disaster recovery mechanism for on-premise clusters that redeploys the app to another data center if the primary data center crashes.

Work Experience

Principal Data Engineer

2015 - PRESENT
Confiz
  • Developed data pipelines for ingestion and processing using technologies like Storm, Spark, and Flume; the pipelines read data from Apache Kafka and write it to Hadoop (HDFS), Azure Cosmos DB, and BigQuery.
  • Created and managed on-premise clusters for Storm and Flume.
  • Implemented auto-scaling for on-premise Storm clusters.
  • Built analytical dashboards and reports in BigQuery and Looker for business personnel and other project members.
  • Monitored the health of deployed data pipelines using a Spring Boot scheduler.
  • Set up and maintained CI/CD pipelines using Jenkins and Concord for seamless deployment of our data pipelines.
  • Halved the cost of Storm clusters by using low-end machines without compromising performance.
  • Built a disaster recovery mechanism using Kafka's active-active configuration to avoid downtime if a data center goes down.
  • Implemented data observability around the real-time pipeline.
Technologies: Java, Scala, Apache Spark, Apache Storm, Apache Kafka, Kafka Streams, Apache Flume, Apache Hive, Google Cloud Platform (GCP), Hadoop, Google BigQuery, Google Cloud Dataproc, Spring, MySQL, MariaDB, Azure Cosmos DB, Kubernetes, Looker, Tableau, Jenkins, Splunk, Cloud Storage, Spark, ETL, Data Engineering, Big Data, SQL
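The auto-scaling work above could, in rough outline, look like the following sketch: a decision function that compares Kafka consumer lag against thresholds and returns a target worker count for the cluster. All names and thresholds here are illustrative assumptions, not details of the actual system.

```python
# Hypothetical sketch of an auto-scaling decision for an on-premise
# Storm cluster. Thresholds and limits are illustrative only.

def scaling_decision(consumer_lag: int,
                     current_workers: int,
                     min_workers: int = 2,
                     max_workers: int = 10,
                     scale_up_lag: int = 100_000,
                     scale_down_lag: int = 10_000) -> int:
    """Return the target worker count based on Kafka consumer lag."""
    if consumer_lag > scale_up_lag and current_workers < max_workers:
        return current_workers + 1   # backlog growing: add a worker
    if consumer_lag < scale_down_lag and current_workers > min_workers:
        return current_workers - 1   # backlog drained: remove a worker
    return current_workers           # lag within bounds: hold steady

print(scaling_decision(250_000, 4))  # 5
print(scaling_decision(5_000, 4))    # 3
print(scaling_decision(50_000, 4))   # 4
```

In practice such a decision would be fed by lag metrics (e.g. from Kafka's consumer-group offsets) and drive provisioning scripts; the sketch shows only the control logic.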

Data Engineer

2019 - 2020
Pace LLC
  • Gathered delivery riders' data and processed it through our data pipelines into our data lake on BigQuery.
  • Built an analytics platform for business teams using tools like Python, Pandas, and Apache Superset on the Google Cloud Platform.
  • Created a mechanism to incentivize riders based on their deliveries.
Technologies: Python, Scala, Google Cloud Platform (GCP), Pandas, Apache Beam, Apache Spark, PostgreSQL, Apache Superset, Spark, Big Data, SQL
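The incentive mechanism mentioned above might be sketched roughly as follows: a flat bonus per delivery beyond a daily target. The rates, target, and function name are hypothetical, not the actual scheme.

```python
# Hypothetical sketch of a delivery-based rider incentive:
# a flat bonus for each delivery above a daily target.
# All parameters are illustrative, not real business rules.

def rider_incentive(deliveries: int,
                    daily_target: int = 15,
                    bonus_per_delivery: float = 2.5) -> float:
    """Bonus owed to a rider for one day, given deliveries completed."""
    extra = max(0, deliveries - daily_target)
    return extra * bonus_per_delivery

print(rider_incentive(20))  # 12.5
print(rider_incentive(10))  # 0.0
```

In the real pipeline this kind of rule would run over aggregated delivery counts in the data lake rather than per-rider function calls.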

Upfront | A Project for Walmart

Upfront is a mobile application deployed in Walmart's stores across the US. It helps staff monitor the stores' self-checkout registers and take quick action if anything goes wrong or support is required. The project empowers Walmart's store associates and other personnel working on different POS systems to execute routine daily tasks more efficiently.

PACE

PACE is a logistics application deployed in KSA that allows delivery riders to pick up items from a source and deliver them to a destination. We gathered the riders' data and processed it through our data pipelines; it was later used by business personnel for rider tracking, rider incentives, and other business use cases.
Education

2011 - 2015

Bachelor's Degree in Computer Engineering

National University of Computer and Emerging Sciences - Lahore, Pakistan

Skills

Libraries/APIs

Pandas, PySpark

Tools

Apache Storm, Looker, Kafka Streams, Google Cloud Dataproc, Tableau, Jenkins, Splunk, Apache Beam, Apache Airflow

Frameworks

Apache Spark, Spark, Hadoop, Spring

Languages

Java, SQL, Scala, Python

Platforms

Apache Kafka, Google Cloud Platform (GCP), Kubernetes, Jupyter Notebook, Docker, Linux

Paradigms

ETL

Storage

Apache Hive, MySQL, MariaDB, Azure Cosmos DB, PostgreSQL, Cloud Firestore

Other

Google BigQuery, Data Engineering, Big Data, Computer Engineering, Apache Flume, Cloud Storage, Apache Superset
