Muhib Ullah Khan, Developer in Lahore, Punjab, Pakistan

Muhib Ullah Khan

Verified Expert in Engineering

Data Engineering Consultant and Developer

Lahore, Punjab, Pakistan

Toptal member since January 5, 2022

Bio

A data engineering consultant with over ten years of experience, Muhib specializes in relational database management systems and cloud-based data engineering services. His core expertise is data modeling, data warehousing, ETL, reporting, and data analytics. His recent experience involves the implementation of data pipelines and modern data warehousing solutions using Azure and AWS cloud services. Muhib also holds multiple certifications, including Microsoft Certified Data Engineer.

Portfolio

HelloFresh USA
SQL, Python, Apache Spark, ETL, Amazon Web Services (AWS), PySpark...
Digifloat
Azure, Apache Airflow, Azure Data Lake, Databricks, Snowflake, Azure Synapse...
Contour Software
SQL, Microsoft SQL Server, Azure, ETL, Azure SQL, Python, Azure Databricks...

Experience

  • Microsoft SQL Server - 8 years
  • SQL - 8 years
  • ETL - 5 years
  • Apache Spark - 4 years
  • Python - 4 years
  • Databricks - 3 years
  • Azure - 3 years
  • Microsoft Power BI - 2 years

Availability

Part-time

Preferred Environment

SQL, Python, Azure, Git, Amazon Web Services (AWS), RDBMS, Data Pipelines, Apache Spark, Databricks, Snowflake

The most amazing...

...thing I've developed is a data mart and Power BI report for a food delivery service that helped the company significantly improve the time and cost of delivery.

Work Experience

Data Engineer via Toptal

2022 - 2024
HelloFresh USA
  • Developed a centralized feature store and ETL pipelines to serve quality machine learning features for data science models across the company, reducing repetitive feature engineering work for every model.
  • Collaborated with multiple teams and helped them onboard to the feature store and ML Ops platform, enabling them to quickly generate valuable predictions for the company.
  • Developed a framework to improve data quality for critical data assets, which further enhanced the performance of ML models, resulting in more accurate predictions.
Technologies: SQL, Python, Apache Spark, ETL, Amazon Web Services (AWS), PySpark, Apache Airflow, DevOps, Databricks, Data Pipelines, Cloud Infrastructure, Data Warehousing, Machine Learning, Artificial Intelligence (AI), Azure Databricks, Data Engineering

Lead Data Engineer

2021 - 2022
Digifloat
  • Led a product development team focused on reducing the time and cost of making raw, scattered data readily available for analysis and reporting.
  • Architected data pipelines for clients using Apache Airflow, Apache Spark, Azure Databricks, ADLS, Azure Synapse Analytics, and Snowflake, helping them overcome their warehousing challenges.
  • Designed a warehousing solution for automobile data from different sources and defined standardized fact and dimension tables. The warehouse data would be consumed by Power BI to generate reports based on clients' requirements.
  • Provided ad hoc data consulting services to clients as needed.
Technologies: Azure, Apache Airflow, Azure Data Lake, Databricks, Snowflake, Azure Synapse, SQL, Python, PySpark, Apache Spark, ETL, Azure Databricks, Data Engineering

Data Engineering Consultant

2019 - 2021
Contour Software
  • Worked as a consultant, fixing bugs and optimizing reporting queries, which increased reporting efficiency by 50 percent.
  • Ingested and integrated data from a legacy application into the company's SQL Server database and performed transformations to meet customers' requirements.
  • Migrated millions of data files from AWS S3 to a remote Microsoft Windows Server using S3 CLI scripts in PowerShell that authenticated on S3, downloaded each file listed in the metadata table, and stored the files in the target location.
  • Implemented an OCR system using Azure Cognitive Search that automated data extraction from PDF reports, a process previously done entirely by hand.
Technologies: SQL, Microsoft SQL Server, Azure, ETL, Azure SQL, Python, Azure Databricks, Data Engineering

Senior Software Engineer | Database Developer

2016 - 2019
Strategic Systems International
  • Integrated more than five data sources with the data warehouse, enabling clients to analyze their historical data and plan future business strategies.
  • Implemented a data pipeline using Azure Event Hubs, Azure Databricks, PySpark, and SQL to ingest and transform real-time factory data that powered a mobile app displaying real-time dashboards.
  • Introduced an automated ETL process using SSIS and SQL jobs that saved developers' time on repetitive tasks.
  • Migrated SQL Server databases to a big data platform consisting of HDFS and Apache Hive to perform data analysis. Used Apache Kylin to build cubes on top of Hive tables, resulting in query responses up to ten times faster.
  • Provided database support on four different projects at a time to help other teams meet their deadlines.
Technologies: SQL, Microsoft SQL Server, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), Data Warehousing, Amazon Web Services (AWS), Python, Data Pipelines, PySpark, Azure, ETL, Databricks, Apache Spark

Database Developer

2013 - 2016
Zin Technologies
  • Worked in a team to fix bugs and participated in more than 15 production releases that made the production environment more reliable and increased revenue.
  • Optimized poorly written stored procedures that caused timeouts during bulk data processing and peak hours. This work reduced the number of customer complaints and improved overall performance.
  • Developed an SSIS package to extract real-time data of connected GSM devices from multiple servers and load it into a centralized billing engine, which proved to be an efficient way to migrate the data.
  • Monitored server health and various performance metrics using Nagios and SQL Server jobs as a database administrator.
Technologies: SQL, Microsoft SQL Server, SQL Server Integration Services (SSIS), Nagios, ETL

Experience

Analytics for a Car Manufacturing Brand

This data analytics project reports on the performance of a famous car manufacturing brand and how it competes with other brands in different European countries. In this project, I implemented the data ingestion process from the landing zone to a raw zone, from the raw zone to the ODS, and finally from the ODS to the data warehouse. I used Snowflake as the target data warehouse, dbt to write transformation scripts, and Azure services such as ADLS, Spark, and Databricks to implement data ingestion and data lake storage.
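The landing, raw, ODS, and warehouse flow described above can be illustrated with a minimal plain-Python sketch. It is not the project's code: the sample rows, the column names, the `batch_id` tag, and the three function names are all hypothetical, and the real pipeline ran on ADLS, Spark or Databricks, dbt, and Snowflake rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical landing-zone records: raw strings exactly as a source delivered them.
LANDING = [
    {"id": "1", "country": "DE", "brand": "A", "units": "10", "loaded_at": "2021-05-01"},
    {"id": "1", "country": "DE", "brand": "A", "units": "12", "loaded_at": "2021-05-02"},
    {"id": "2", "country": "FR", "brand": "B", "units": "7", "loaded_at": "2021-05-01"},
]

def to_raw(landing_rows, batch_id):
    """Landing -> raw: keep the payload untouched, tag each row with its batch."""
    return [dict(row, batch_id=batch_id) for row in landing_rows]

def to_ods(raw_rows):
    """Raw -> ODS: cast types and keep only the latest version of each id."""
    latest = {}
    for row in raw_rows:
        current = latest.get(row["id"])
        # ISO dates compare correctly as strings, so no parsing is needed here.
        if current is None or row["loaded_at"] > current["loaded_at"]:
            latest[row["id"]] = dict(row, units=int(row["units"]))
    return list(latest.values())

def to_warehouse(ods_rows):
    """ODS -> warehouse: aggregate units into a (country, brand) fact table."""
    totals = defaultdict(int)
    for row in ods_rows:
        totals[(row["country"], row["brand"])] += row["units"]
    return dict(totals)

fact_sales = to_warehouse(to_ods(to_raw(LANDING, batch_id="batch-001")))
```

Keeping each hop as its own function mirrors the zone boundaries: the raw step never alters the payload, so a failed ODS or warehouse load can be replayed from raw data without returning to the source.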

Pocket Factory App for a Soda Manufacturing Company

This real-time monitoring app refreshes every minute to visually represent the current status of machines working in a factory. I oversaw ingesting real-time data from IoT devices using Azure Event Hubs, Spark, and Databricks, loading the data into a SQL database, and performing further transformations and aggregations to prepare the data for visual representation on the dashboard.
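As a rough illustration of the per-minute aggregation step, the sketch below buckets machine events by minute, keeps each machine's latest status within the bucket, and counts statuses for the dashboard. The event shape, the status values, and the function name are assumptions made for this example; the actual pipeline used Azure Event Hubs, Databricks, and SQL.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical IoT events; field names and statuses are illustrative only.
EVENTS = [
    {"machine": "M1", "ts": "2018-03-01T09:00:05", "status": "running"},
    {"machine": "M2", "ts": "2018-03-01T09:00:20", "status": "running"},
    {"machine": "M1", "ts": "2018-03-01T09:00:50", "status": "stopped"},
    {"machine": "M2", "ts": "2018-03-01T09:01:10", "status": "running"},
]

def status_counts_per_minute(events):
    """Return {minute: {status: machine_count}} using each machine's last event per minute."""
    latest = defaultdict(dict)  # minute -> machine -> (timestamp, status)
    for event in events:
        ts = datetime.fromisoformat(event["ts"])
        minute = ts.strftime("%Y-%m-%d %H:%M")
        seen = latest[minute].get(event["machine"])
        if seen is None or ts > seen[0]:
            latest[minute][event["machine"]] = (ts, event["status"])
    return {
        minute: dict(Counter(status for _, status in machines.values()))
        for minute, machines in latest.items()
    }

dashboard = status_counts_per_minute(EVENTS)
```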

Data Conversion for a US-based Company

As a data conversion specialist, I worked on this SQL-based project. I communicated with business analysts to understand business requirements and then mapped source data to the appropriate fields in the target database. I also documented the data mappings and transformations required for conversion and wrote conversion scripts to extract, transform, and load data into the target database.

Embedded SIM (eSIM) Project

This machine-to-machine (M2M) service lets eSIM users, usually companies, monitor and control their devices connected to a telecommunication network. In this project, I oversaw automating the billing process so that real-time usage data arriving from all the connected devices was processed through a series of scheduled SQL jobs and summarized in the form of an invoice. While working on this project, I also identified and fixed a bug in the code that was causing the company to lose hundreds of dollars per invoice for large customers.
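The usage-to-invoice summarization can be sketched as follows. This is only an illustration in Python, whereas the project used scheduled SQL jobs; the flat per-megabyte rate, the record fields, and the `build_invoices` helper are hypothetical. `Decimal` is used instead of floats because rounding drift in currency math is one plausible source of per-invoice discrepancies, though the source does not say what the actual bug was.

```python
from collections import defaultdict
from decimal import Decimal

RATE_PER_MB = Decimal("0.02")  # hypothetical flat rate, for illustration only

# Hypothetical usage records as they might arrive from connected devices.
USAGE = [
    {"customer": "acme", "device": "sim-1", "mb": Decimal("150.5")},
    {"customer": "acme", "device": "sim-2", "mb": Decimal("49.5")},
    {"customer": "globex", "device": "sim-9", "mb": Decimal("10")},
]

def build_invoices(usage, rate):
    """Sum usage per customer and price it, rounding to whole cents."""
    total_mb = defaultdict(Decimal)  # Decimal() starts each total at 0
    for record in usage:
        total_mb[record["customer"]] += record["mb"]
    return {
        customer: {
            "total_mb": mb,
            "amount_due": (mb * rate).quantize(Decimal("0.01")),
        }
        for customer, mb in total_mb.items()
    }

invoices = build_invoices(USAGE, RATE_PER_MB)
```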

Power BI Visualizations for Food Delivery Service

This data analytics project for a restaurant visualizes the performance of its food delivery partners and analyzes factors affecting on-time delivery. In this project, I oversaw designing a data mart and then loading the restaurant's data into it. I also created a Power BI dashboard for the client to quickly visualize how well the deliveries are going.

Certifications

JULY 2021 - JULY 2022

Azure Certified Data Analyst

Microsoft

MARCH 2021 - MARCH 2023

Azure Certified AI Engineer

Microsoft

AUGUST 2020 - AUGUST 2022

Microsoft Azure Data Engineer Associate

Microsoft

Skills

Libraries/APIs

PySpark

Tools

Git, Microsoft Power BI, Nagios, Apache Airflow, Microsoft Teams

Languages

SQL, Python, Snowflake

Storage

Relational Databases, Microsoft SQL Server, Data Pipelines, Azure SQL, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), RDBMS

Frameworks

Apache Spark

Paradigms

ETL, DevOps

Platforms

Azure, Amazon Web Services (AWS), Databricks, Azure Synapse

Other

Data Engineering, Azure Databricks, Artificial Intelligence (AI), Data Warehousing, Azure Data Lake, Azure Data Studio, Data Warehouse Design, Data Analysis, Cloud Infrastructure, Machine Learning
