
Muhib Ullah Khan
Verified Expert in Engineering
Data Engineering Consultant and Developer
Lahore, Punjab, Pakistan
Toptal member since January 5, 2022
A data engineering consultant with over ten years of experience, Muhib specializes in relational database management systems and cloud-based data engineering services. His core expertise is data modeling, data warehousing, ETL, reporting, and data analytics. His recent experience involves the implementation of data pipelines and modern data warehousing solutions using Azure and AWS cloud services. Muhib also holds multiple certifications, including Microsoft Certified Data Engineer.
Experience
- Microsoft SQL Server - 8 years
- SQL - 8 years
- ETL - 5 years
- Apache Spark - 4 years
- Python - 4 years
- Databricks - 3 years
- Azure - 3 years
- Microsoft Power BI - 2 years
Preferred Environment
SQL, Python, Azure, Git, Amazon Web Services (AWS), RDBMS, Data Pipelines, Apache Spark, Databricks, Snowflake
The most amazing...
...thing I've developed is a data mart and Power BI report for a food delivery service that helped the company significantly reduce delivery time and cost.
Work Experience
Data Engineer via Toptal
HelloFresh USA
- Developed a centralized feature store and ETL pipelines to serve quality machine learning features for data science models across the company, reducing repetitive feature engineering work for every model.
- Collaborated with multiple teams and helped them onboard to the feature store and ML Ops platform, enabling them to generate valuable predictions for the company quickly.
- Developed a framework to improve data quality for critical data assets, which further enhanced the performance of ML models, resulting in more accurate predictions.
Lead Data Engineer
Digifloat
- Led a product development team working on reducing the time and cost of turning raw, scattered data into data that is readily available for analysis and reporting.
- Architected data pipelines for clients using Apache Airflow, Apache Spark, Azure Databricks, ADLS, Azure Synapse Analytics, and Snowflake, helping them overcome their warehousing challenges.
- Designed a warehousing solution for automobile data from different sources and defined standardized fact and dimension tables. Power BI consumed the warehouse data to generate reports based on clients' requirements.
- Served as a data consultant, providing whatever services clients needed.
Data Engineering Consultant
Contour Software
- Worked as a consultant to fix bugs and optimize reporting queries, increasing reporting efficiency by 50 percent.
- Ingested and integrated data from a legacy application into the company's SQL Server database and performed transformations to meet customers' requirements.
- Migrated millions of data files from AWS S3 to a remote Microsoft Windows Server using AWS CLI S3 commands scripted in PowerShell. The scripts authenticated against S3, downloaded each file listed in a metadata table, and stored it at the target location.
- Implemented an OCR system using Azure Cognitive Search that automated data extraction from PDF reports, a process previously done manually.
Senior Software Engineer | Database Developer
Strategic Systems International
- Integrated more than five data sources with the data warehouse, enabling clients to analyze their historical data and plan future business strategies.
- Implemented a data pipeline using Azure Event Hubs, Azure Databricks, PySpark, and SQL to ingest and transform real-time factory data that powered a mobile app displaying real-time dashboards.
- Introduced an automated ETL process using SSIS and SQL jobs that saved developers time on repetitive tasks.
- Migrated SQL Server databases to a big data platform consisting of HDFS and Apache Hive to perform data analysis. Used Apache Kylin to build cubes on top of Hive tables, resulting in query responses up to ten times faster.
- Provided database support on four different projects at a time to help other teams meet their deadlines.
Database Developer
Zin Technologies
- Worked in a team to fix bugs and participated in more than 15 production releases that made the production environment more reliable and increased revenue.
- Optimized poorly written stored procedures that caused timeouts during bulk data processing and peak hours. This work reduced the number of customer complaints and improved overall performance.
- Developed an SSIS package to extract real-time data of connected GSM devices from multiple servers and load it into a centralized billing engine, which proved to be an efficient way of migrating data.
- Monitored server health and various performance metrics using Nagios and SQL Server jobs as a database administrator.
Experience
Analytics for a Car Manufacturing Brand
Pocket Factory App for a Soda Manufacturing Company
Data Conversion for a US-based Company
Embedded SIM (eSIM) Project
Power BI Visualizations for Food Delivery Service
Certifications
Azure Certified Data Analyst
Microsoft
Azure Certified AI Engineer
Microsoft
Microsoft Azure Data Engineer Associate
Microsoft
Skills
Libraries/APIs
PySpark
Tools
Git, Microsoft Power BI, Nagios, Apache Airflow, Microsoft Teams
Languages
SQL, Python, Snowflake
Storage
Relational Databases, Microsoft SQL Server, Data Pipelines, Azure SQL, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), RDBMS
Frameworks
Apache Spark
Paradigms
ETL, DevOps
Platforms
Azure, Amazon Web Services (AWS), Databricks, Azure Synapse
Other
Data Engineering, Azure Databricks, Artificial Intelligence (AI), Data Warehousing, Azure Data Lake, Azure Data Studio, Data Warehouse Design, Data Analysis, Cloud Infrastructure, Machine Learning