Usama Abbas

Verified Expert in Engineering

Data Engineer and Developer

Location
Lahore, Punjab, Pakistan
Toptal Member Since
January 30, 2024

Usama is a results-oriented data and analytics engineer with 5+ years of experience. He has designed and operated ETL pipelines, tackling architectural and scalability challenges for a diverse global clientele. He is proficient in Python, SQL, Spark, dbt, and AWS.

Portfolio

Cialfo
Amazon Web Services (AWS), Python, Apache Airflow, Data Build Tool (dbt), Spark...
10Pearls
Data Engineering, Amazon Web Services (AWS), Databricks, Data Analytics...
NorthBay Solutions
Python, SQL, Amazon Web Services (AWS), PySpark, Spark...

Experience

Availability

Full-time

Preferred Environment

Python, SQL, Data Build Tool (dbt), Amazon Web Services (AWS), Data Engineering, Data Modeling, Databricks, Analytics, Data Analysis, PySpark

The most amazing...

...thing I've achieved is reducing the load time of the most frequently used dashboard from 13 seconds to less than a second.

Work Experience

Data Engineer

2021 - 2022
Cialfo
  • Developed and executed an ETL pipeline, utilizing Spark, Apache Airflow, and SQL, to efficiently load and transform data from various databases into a centralized data lake on Amazon S3.
  • Improved pipeline efficiency by 80% by converting the pipeline from ELT to ETL using Amazon EMR and Spark.
  • Collaborated with machine learning engineers to craft feature extraction pipelines for diverse ML models using SQL, Amazon Athena, and Airflow.
  • Took the initiative to implement pipeline monitoring through Slack, ensuring real-time failure notifications for the team (see the first sketch after this entry).
  • Migrated feature extraction pipelines successfully from Athena to Snowflake.
  • Designed and developed scalable APIs for various ML models with FastAPI, Python, Docker, and Redis (see the second sketch after this entry).
  • Conducted data analysis and generated insightful reports for various stakeholders using Snowflake.
Technologies: Amazon Web Services (AWS), Python, Apache Airflow, Data Build Tool (dbt), Spark, SQL, ETL, Data Lakes, Amazon S3 (AWS S3), Query Optimization, Amazon Elastic MapReduce (EMR), Redis, Amazon Athena, Snowflake, Docker, Data Analysis, Data Analytics, Big Data, Amazon RDS, PostgreSQL, MySQL, Databases, APIs
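
A minimal sketch of the Slack monitoring referenced above: an Airflow failure callback that posts to a Slack incoming webhook. The DAG, task, and SLACK_WEBHOOK_URL environment variable are illustrative assumptions, not details from the project.

    import os
    from datetime import datetime

    import requests
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def notify_slack(context):
        # Airflow passes the task context to failure callbacks; post the
        # failing task and DAG IDs to a Slack incoming webhook (assumed URL).
        ti = context["task_instance"]
        requests.post(
            os.environ["SLACK_WEBHOOK_URL"],
            json={"text": f"Task {ti.task_id} in DAG {ti.dag_id} failed."},
            timeout=10,
        )

    def load_to_s3(**_):
        """Placeholder for the extract-and-load step into the S3 data lake."""

    with DAG(
        dag_id="etl_to_data_lake",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args={"on_failure_callback": notify_slack},
    ):
        PythonOperator(task_id="load_to_s3", python_callable=load_to_s3)

And a sketch of the FastAPI-plus-Redis pattern from the APIs bullet: a prediction endpoint that caches model outputs in Redis. The model stub, cache-key scheme, and route are hypothetical.

    import json

    import redis
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

    class Features(BaseModel):
        values: list[float]

    def run_model(values: list[float]) -> float:
        # Stand-in for a real ML model's inference call.
        return sum(values)

    @app.post("/predict")
    def predict(features: Features):
        key = "pred:" + json.dumps(features.values)
        cached = cache.get(key)
        if cached is not None:
            return {"prediction": float(cached), "cached": True}
        prediction = run_model(features.values)
        cache.set(key, prediction, ex=3600)  # expire after one hour
        return {"prediction": prediction, "cached": False}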

Data Engineer

2020 - 2021
10Pearls
  • Built streamlined data pipelines ingesting data from varied sources such as Google Ads, Microsoft Ads, LinkedIn, Mixpanel, and Salesforce into Databricks using PySpark and Delta Lake.
  • Produced insightful reports and dashboards for a range of business stakeholders using Databricks.
  • Converted the DynamoDB-to-Databricks migration from full loads to incremental updates by integrating change records captured through Amazon Kinesis Data Firehose (see the sketch after this entry).
  • Implemented automated infrastructure creation using AWS CDK for enhanced efficiency and streamlined processes.
Technologies: Data Engineering, Amazon Web Services (AWS), Databricks, Data Analytics, Data Analysis, Big Data Architecture, ETL, Amazon DynamoDB, Data Modeling, Big Data, Amazon RDS, Databases, APIs
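
A minimal sketch of the full-to-incremental pattern referenced above: change records landed by Kinesis Data Firehose are merged into a Delta table instead of reloading everything. The landing path, table name, and key column are illustrative assumptions.

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    # On Databricks, `spark` is provided; getOrCreate() also works locally
    # when the delta-spark package is configured.
    spark = SparkSession.builder.getOrCreate()

    # New change records written by Firehose (hypothetical landing path).
    updates = spark.read.json("s3://landing/firehose/orders/")

    # Upsert into the target Delta table (hypothetical table name).
    target = DeltaTable.forName(spark, "analytics.orders")
    (
        target.alias("t")
        .merge(updates.alias("u"), "t.order_id = u.order_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )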

Data Engineer

2018 - 2020
NorthBay Solutions
  • Used reverse engineering and web scraping to extract semi-structured datasets (CSV, JSON, XML) from diverse real estate platforms, then transformed the acquired data into a centralized data lake on Amazon S3 using Python and AWS Glue.
  • Ensured data accuracy and reliability by validating datasets with Great Expectations on AWS Lambda (see the sketch after this entry).
  • Contributed significantly to data analysis and analytics across numerous projects, promoting productive stakeholder collaboration.
  • Enhanced the performance of the Amazon Redshift data warehouse by identifying gaps in best practices and remediating them.
  • Collaborated with DevOps engineers to set up the infrastructure for a data lake project using Terraform.
Technologies: Python, SQL, Amazon Web Services (AWS), PySpark, Spark, Amazon Elastic MapReduce (EMR), Analytics, AWS Glue, Redshift, Big Data, ETL, Data Modeling, Data Analytics, Amazon Aurora, Amazon RDS, PostgreSQL, MySQL, Databases, APIs
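
A minimal sketch of the validation step referenced above, using Great Expectations' classic pandas interface inside an AWS Lambda handler. The dataset path, columns, and expectations are illustrative assumptions; reading straight from S3 with pandas also assumes s3fs is installed.

    import great_expectations as ge
    import pandas as pd

    def handler(event, context):
        # Load the dataset to validate (hypothetical path and columns).
        df = ge.from_pandas(pd.read_csv("s3://real-estate-lake/raw/listings.csv"))

        # Declare expectations, then validate the batch as a whole.
        df.expect_column_values_to_not_be_null("listing_id")
        df.expect_column_values_to_be_between("price", min_value=0)
        result = df.validate()

        # Fail loudly so the pipeline surfaces bad data instead of loading it.
        if not result.success:
            raise ValueError(f"Validation failed: {result.statistics}")
        return result.statistics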

Data Lake Using Real Estate Datasets

A data lake project in which we extracted commercial real estate data from five different sources using APIs and web scraping, then transformed it into Parquet format so it could be queried with Amazon Athena.

My role was to extract data from two of the sources using AWS Glue, maintain their metadata in a MySQL database, convert the required data into Parquet, and query it using Athena (sketched below). I also validated the data against a well-defined configuration file.
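
A minimal sketch of the Parquet-then-Athena pattern described above, using the AWS SDK for pandas (awswrangler) as one possible implementation. The bucket, Glue database, and table names are illustrative assumptions.

    import awswrangler as wr
    import pandas as pd

    # Stand-in for data extracted from one of the real estate sources.
    df = pd.DataFrame({"source": ["siteA", "siteB"], "price": [250000, 410000]})

    # Write partitioned Parquet to the data lake and register the table in
    # the Glue Data Catalog so Athena can query it.
    wr.s3.to_parquet(
        df=df,
        path="s3://real-estate-lake/listings/",
        dataset=True,
        database="real_estate",
        table="listings",
        partition_cols=["source"],
    )

    # Query the registered table through Athena.
    counts = wr.athena.read_sql_query(
        "SELECT source, COUNT(*) AS n FROM listings GROUP BY source",
        database="real_estate",
    )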

Languages

Python, SQL, Snowflake

Storage

PostgreSQL, Databases, MySQL, Amazon Aurora, Data Lakes, Redshift, Amazon DynamoDB, Amazon S3 (AWS S3), Redis

Other

Data Engineering, Data Analytics, Amazon RDS, Data Build Tool (dbt), APIs, Data Modeling, Analytics, Data Analysis, Software Engineering, Computer Science, Cloud Computing, Machine Learning, Natural Language Processing (NLP), Data Warehousing, Big Data, Big Data Architecture, Query Optimization

Libraries/APIs

PySpark

Platforms

Amazon Web Services (AWS), Databricks, Docker

Frameworks

Spark

Tools

Amazon Elastic MapReduce (EMR), AWS Glue, Apache Airflow, Amazon Athena

Paradigms

ETL

Education

2019 - 2023

Master's Degree in Computer Science

Punjab University College of Information Technology - Lahore, Punjab, Pakistan

2014 - 2018

Bachelor's Degree in Software Engineering

Punjab University College of Information Technology - Lahore, Punjab, Pakistan

Certifications

APRIL 2023 - PRESENT

Hands On Essentials - Data Lake

Snowflake

APRIL 2023 - PRESENT

Hands On Essentials - Data Sharing

Snowflake

APRIL 2023 - PRESENT

Hands On Essentials - Data Applications

Snowflake

APRIL 2023 - PRESENT

Hands On Essentials - Data Warehouse

Snowflake

JANUARY 2023 - JANUARY 2025

dbt Fundamentals

dbt Labs

OCTOBER 2018 - OCTOBER 2021

AWS Certified Developer – Associate

Amazon Web Services
