Usama Abbas
Verified Expert in Engineering
Data Engineer and Developer
Usama is a results-oriented professional with 5+ years of data and analytics engineering expertise. He has designed and overseen ETL pipelines, tackling architectural and scalability challenges for a diverse global clientele. Usama's skills include proficiency in various tools and technologies such as Python, SQL, Spark, dbt, and AWS.
Preferred Environment
Python, SQL, Data Build Tool (dbt), Amazon Web Services (AWS), Data Engineering, Data Modeling, Databricks, Analytics, Data Analysis, PySpark
The most amazing...
...thing I've achieved is reducing the load time of the most frequently used dashboard from 13 seconds to less than a second.
Work Experience
Data Engineer
Cialfo
- Developed and executed an ETL pipeline, utilizing Spark, Apache Airflow, and SQL, to efficiently load and transform data from various databases into a centralized data lake on Amazon S3.
- Improved pipeline efficiency by 80% by moving the pipeline from ELT to ETL using Amazon EMR and Spark.
- Collaborated with machine learning engineers to craft feature extraction pipelines for diverse ML models using SQL, Amazon Athena, and Airflow.
- Took the initiative to implement pipeline monitoring through Slack, ensuring real-time notifications for the team.
- Migrated feature extraction pipelines successfully from Athena to Snowflake.
- Designed and developed scalable APIs using FastAPI for various ML models, utilizing Python, Docker, and Redis.
- Conducted data analysis and generated insightful reports for various stakeholders using Snowflake.
Data Engineer
10Pearls
- Coordinated data pipelines that ingest data from varied sources such as Google Ads, Microsoft Ads, LinkedIn, Mixpanel, and Salesforce into Databricks using PySpark and Delta Lake.
- Produced insightful reports and dashboards for a range of business stakeholders using Databricks.
- Improved the data migration process from Amazon DynamoDB to Databricks by switching from full loads to incremental loads, integrating data updates captured through Amazon Kinesis Data Firehose.
- Implemented automated infrastructure creation using AWS CDK for enhanced efficiency and streamlined processes.
Data Engineer
NorthBay Solutions
- Used reverse engineering and web scraping to extract semi-structured datasets (CSV, JSON, XML) from diverse real estate platforms, then transformed and loaded the acquired data into a centralized data lake on Amazon S3 using Python and AWS Glue.
- Ensured data accuracy and reliability by validating datasets using Great Expectations on AWS Lambda.
- Contributed significantly to data analysis and analytics across numerous projects, promoting productive stakeholder collaboration.
- Enhanced the performance of the Amazon Redshift data warehouse by identifying where best practices were not being followed and correcting those gaps.
- Collaborated with DevOps engineers to set up the infrastructure for a data lake project using Terraform.
Experience
Data Lake Using Real Estate Datasets
My role was to extract data from two data sources using AWS Glue, maintain their metadata in a MySQL database, convert the required data into Parquet, and query it using Athena. I also validated the data against a well-defined configuration file.
Skills
Languages
Python, SQL, Snowflake
Storage
PostgreSQL, Databases, MySQL, Amazon Aurora, Data Lakes, Redshift, Amazon DynamoDB, Amazon S3 (AWS S3), Redis
Other
Data Engineering, Data Analytics, Amazon RDS, Data Build Tool (dbt), APIs, Data Modeling, Analytics, Data Analysis, Software Engineering, Computer Science, Cloud Computing, Machine Learning, Natural Language Processing (NLP), Data Warehousing, Big Data, Big Data Architecture, Query Optimization
Libraries/APIs
PySpark
Platforms
Amazon Web Services (AWS), Databricks, Docker
Frameworks
Spark
Tools
Amazon Elastic MapReduce (EMR), AWS Glue, Apache Airflow, Amazon Athena
Paradigms
ETL
Education
Master's Degree in Computer Science
Punjab University College of Information Technology - Lahore, Punjab, Pakistan
Bachelor's Degree in Software Engineering
Punjab University College of Information Technology - Lahore, Punjab, Pakistan
Certifications
Hands On Essentials - Data Lake
Snowflake
Hands On Essentials - Data Sharing
Snowflake
Hands On Essentials - Data Applications
Snowflake
Hands On Essentials - Data Warehouse
Snowflake
dbt Fundamentals
dbt Labs
AWS Certified Developer – Associate
Amazon Web Services