Syed Muneeb Hussain
Verified Expert in Engineering
SQL and Data Developer
Karachi, Sindh, Pakistan
Toptal member since July 20, 2022
Muneeb is an experienced data engineer proficient in Python and SQL, specializing in big data technologies. He has worked on cloud platforms such as AliCloud, Azure, and GCP, with transformation tools like dbt, and with ETL and orchestration tools including Airflow, Prefect, ADF, Talend, and SSIS. He also builds data visualizations with Power BI, Grafana, and Apache Superset, and is well-versed in DataOps and CI/CD pipelines using Docker and GitHub.
Preferred Environment
SQL, Python 3, PyCharm, Talend, Grafana, Data Build Tool (dbt), Apache Airflow, Meltano, Docker, Azure
The most amazing...
...project I've developed is a user-friendly ETL platform that democratizes data orchestration and bridges the gap for non-technical users.
Work Experience
Lead Data Engineer
Dataquartz
- Led the development of an in-house data ingestion product with Python, Flask, DuckDB, PostgreSQL, Grafana for dynamic visualization, and Prefect for ETL workflow management.
- Orchestrated end-to-end ETL pipelines, incorporating audit logging and data integrity checks using Airflow and Prefect.
- Implemented Prometheus and the Node Exporter for robust metrics monitoring of the application.
- Pioneered bug tracking, resolution, and new feature development in the data model.
- Containerized the entire application using Docker for enhanced scalability and manageability in the data engineering workflow.
Data Engineering Manager
Seeloz
- Worked on the development of a data ingestion product using Prefect, SQL, Python, Flask, DuckDB, and PostgreSQL for the back end, and Grafana for the data visualization.
- Built various ETL projects using SQL, PySpark, Scala, Azure Logic Apps, and more to pull data from multiple ERPs and various source systems.
- Developed Azure Logic Apps to pull data from Microsoft Dynamics 365 data entities. Wrote ETL in Scala and PySpark to load them into the supply chain meta-model.
- Implemented a monitoring framework using PySpark, PostgreSQL, and Grafana to ensure data correctness and integrity.
- Worked on the development and optimization of various data models and ETL pipelines for fast data processing.
- Monitored daily data pipelines and ETL data load processes to ensure all the required data was loaded correctly in the supply chain data model.
- Developed various dashboards using Grafana and Power BI to gauge important business metrics.
Big Data Engineering and Governance Lead
Daraz | Alibaba Group
- Built and managed a DWH architecture and wrote automated ETL scripts using HiveQL, HDFS, HBase, Python, and Shell on a cloud platform for data ingestion.
- Developed BI dashboards on Power BI, vShow, and FBI to gauge important metrics related to domains like customer funnel, marketing, and logistics.
- Developed and maintained an enterprise data warehouse and monitored data ingestion pipelines on a daily basis using SQL, Python, Flink, ODPS, and ETL flows.
- Optimized dozens of ETL pipelines and SQL queries for fast data processing to finish the execution in minutes instead of hours.
- Worked closely with the departmental HODs to maintain optimum levels of communication to effectively and efficiently complete projects.
- Managed incoming data analysis requests and efficiently distributed results to support decision strategies.
Technical Consultant
Qordata
- Designed and developed end-to-end data ingestion pipelines to ensure data flowed daily.
- Implemented and managed data flow jobs for data modeling solutions relevant to the health and life science industry, using tools like SQL Server Integration Services (SSIS) and Microsoft SQL Server.
- Developed SQL queries, stored procedures, and dynamic SQL and optimized existing complex SQL queries to speed up day-to-day processes.
- Created ad-hoc data reports tailored to clients' requirements.
Data Engineer
Afiniti
- Designed and developed a database architecture and data model for a business flow using Talend Open Studio, SSIS, and MySQL Workbench.
- Performed large-scale data conversions, migrations, and optimization to reduce resource and time costs while maintaining data integrity.
- Wrote SQL stored procedures and Python scripts for data quality checks and ad-hoc analyses.
- Implemented complex data processing jobs, including integrating customer relationship management (CRM) and third-party data into daily processes.
- Established automated emails to have more visibility on the progress of regular data processing tasks.
- Analyzed clients' business processes to propose optimal solutions for data requirements.
Experience
Automated ETL Tool
Our vision materialized in a centralized application boasting an intuitive click-and-drop interface, fostering a seamless user experience. This innovation serves as a dynamic one-stop-shop, empowering a diverse user base, both technical and non-technical, to effortlessly design and deploy ETL pipelines. The architecture embodies industry-leading practices, optimizing for performance, reliability, and data integrity.
Adhering to agile methodologies, we championed iterative improvements and rapid feature deployment. Rigorous testing practices, spanning unit, integration, and end-to-end testing, ensured the product's reliability and stability. Leveraging automated testing frameworks streamlined our testing process, guaranteeing thorough coverage and swift issue identification.
Meltano Custom Extractor
https://github.com/muneebsmh/meltano_custom_extractor
Hopsworks Feature Store Python Integration
https://github.com/muneebsmh/hopsworks-integrations
Payment Risk Engine | COD Blocking
I first conducted a thorough data analysis to find the impact on the business and moved on to creating data pipelines and a performance dashboard that would gauge the impact of the system on the overall business of Daraz.
Delayed Order Notification System
This project not only enhanced the customer experience but also helped in gauging Daraz's logistics performance and highlighted key metrics that needed to be fixed.
Dashboard Usage Analysis
I created a meta dashboard that would rank the dashboards by tracking the daily, weekly, and monthly active users and their visits. Also, this meta dashboard tracked individual user history on multiple dashboards, i.e., the number of dashboards that a particular user regularly visits, which helped us filter out the executives' dashboards.
Enterprise Data Warehouse
The enterprise data warehouse (EDW) structure addresses the limitations of an enterprise portal and adds features such as a standardized model that can fit different business requirements without any change in architecture. It helped us track historical changes in clients' performance and provided a holistic view of all clients in a single portal at any time.
I worked on creating the whole data warehouse from scratch, including developing all data pipelines and dimensional modeling.
Data Pull from Dynamics 365 Using Azure Logic Apps
Education
Master's Degree in Computer Science
National University of Computer and Emerging Sciences - Karachi, Pakistan
Bachelor's Degree in Computer Science
National University of Computer and Emerging Sciences - Karachi, Pakistan
Skills
Libraries/APIs
NumPy, REST API, Pandas, PySpark, Flask-RESTful, SQL
Tools
Salesforce Development, MySQL, Talend ETL, Visual Studio Development, GitHub, Tableau Development, Business Intelligence Development, BigQuery, Apache Airflow, Grafana, Prefect, Spark, Azure Logic Apps, PyCharm, IntelliJ IDEA, Shell Development, GitLab CI/CD, Celery
Languages
SQL, Python, SQL DML, T-SQL, Stored Procedure, XML, HTML, CSS, Scala
Paradigms
ETL, Database Design, Dimensional Modeling, REST, Business Intelligence Development, Agile Development
Storage
MySQL, Database, Data Integration, SQL, RDBMS, SQL Server, Relational Databases, Database Modeling, MariaDB, NoSQL, SSIS, Hadoop, PostgreSQL, Azure, HDFS, MongoDB, Google Cloud Development, Alibaba Cloud, Azure Blobs, JSON, Redis
Frameworks
Windows PowerShell, Hadoop, .NET, Flask, Big Data Architecture, Spark
Platforms
Talend, Apache, Azure Design, Azure SQL Data Warehouse, Jupyter Notebook, Docker, Dedicated SQL Pool (formerly SQL DW), Meltano, Databricks, AWS, Airbyte
Other
Data Warehouse, Quality Management, Slowly Changing Dimensions (SCD), Data Engineering, Query Optimization, Data Quality Analysis, Database, ETL Tools, Data Science, Database Analytics, Data Processing, Business Intelligence (BI) Platforms, Performance Tuning, Data Modeling, Business Logic, Data Architecture, Logical Database Design, Database Schema Design, Relational Database Design, ELT, Schemas, Relational Data Mapping, Reporting, BI Reporting, AnyDesk, Data Migration, Database Optimization, Data Extraction, CSV Export, CSV, Data Management, Warehouses, Analytics Development, English, Data Visualization, Google BigQuery, Big Data Architecture, Data Analysis, Analysis, Cloud Infrastructure, CI/CD Pipelines, API Integration, Reports, Business Intelligence Development, Dashboard, Dashboard Development, APIs, DuckDB, Scripting, Cloud Engineering, Azure Data Factory, Data Build Tool (dbt), Azure Databricks, Prometheus, Node Exporter, MinIO, S3 Buckets, Data Structures, OOP Designs, Blink SQL, Data, Azure Service Bus, Microsoft Azure, Apache Superset, Workflow Automation, Hopsworks, Feature Engineering