Verified Expert in Engineering
Data Engineer and Developer
Elaaf is a seasoned data engineer who loves designing, building, and maintaining petabyte-scale data infrastructures. He is keen on working with on-premise, cloud, and hybrid data solutions, always striving for code quality, performance, and maintainability. With exceptional communication skills, Elaaf can contribute to challenging projects and help expand data-based businesses.
macOS, Visual Studio Code (VS Code), Slack
The most amazing...
...product I've built is a custom data integration application using purely open-source technologies.
Senior Data Engineer
- Worked on the global recommendations team, responsible for providing personalized restaurant and cuisine recommendations to users of 12+ sub-brands in 70+ countries.
- Developed and productized the data pipelines and serving API for a new cuisine recommendation strategy, which yielded a 6% uplift in conversion rate (CVR) in the A/B test.
- Reduced daily operational costs by 11% by optimizing Kubernetes node type/region, API code, GCP Dataflow pipelines, database resources, and Datadog logging.
- Migrated our entire services stack and data pipelines from the GCP East Asia region to the Southeast Asia region, reducing cost by switching to the N2D machine type and reducing intra-region latency for end users.
- Served as the on-call engineer for critical recommendation services across 11 clusters and five global regions.
Senior Data Engineer
- Led the design and development effort for a data integration platform using open-source technologies such as Airflow, Spark, and Airbyte.
- Managed a petabyte-scale data warehouse for a retail company in the Middle East, spearheading data ingestion and modeling.
- Developed a custom containerized Spark application to deploy to on-premise clusters.
- Developed and performed unit, system integration, and user acceptance testing of ETL pipelines covering over 35 distinct business streams and 12 dimensions of varying load and frequency on the Apache Hive data lake.
- Worked with the data modeling team to analyze the existing Teradata SQL and its conversion to PySpark and Spark SQL.
- Optimized Spark jobs and identified the most appropriate scheduling triggers using shell scripts based on business requirements and fact dependencies.
- Designed and implemented the strategy for the PII data masking and data movement of different business streams between raw, curated, and serving data lake layers.
Custom Data Integration Tool
User Stance Detection on Twitter
https://github.com/elaaf/stance-detect
• Constructed feature vectors for each user (hashtags, retweeted accounts, unique tweets)
• Applied dimensionality reduction (t-SNE, UMAP)
• Clustered low-dimensional data (mean-shift clustering, DBSCAN)
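The steps above can be illustrated with a small, self-contained sketch. It is not the code from the linked repository: it uses only the Python standard library, builds each user's features as a set of hashtags and retweeted accounts, and clusters users with a plain DBSCAN over Jaccard distance. The dimensionality-reduction step (t-SNE, UMAP) is left out to keep the example dependency-free, and the function names, the distance metric, the `eps` and `min_samples` values, and the sample data are all assumptions made for illustration.

```python
from collections import deque


def build_features(user_activity):
    """Map each user to a set of tokens from hashtags and retweeted accounts."""
    features = {}
    for user, activity in user_activity.items():
        tokens = {f"#{h.lower()}" for h in activity.get("hashtags", [])}
        tokens |= {f"@{a.lower()}" for a in activity.get("retweeted", [])}
        features[user] = tokens
    return features


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def dbscan(features, eps=0.6, min_samples=2):
    """Plain DBSCAN; returns {user: cluster_id}, with -1 marking noise."""
    users = list(features)

    def neighbors(u):
        # Includes u itself, since its distance to itself is 0.
        return [v for v in users
                if jaccard_distance(features[u], features[v]) <= eps]

    labels = {}
    cluster_id = 0
    for u in users:
        if u in labels:
            continue
        nbrs = neighbors(u)
        if len(nbrs) < min_samples:
            labels[u] = -1  # provisional noise; may become a border point
            continue
        labels[u] = cluster_id
        queue = deque(nbrs)
        while queue:
            v = queue.popleft()
            if labels.get(v) == -1:
                labels[v] = cluster_id  # noise reached from a core point
            if v in labels:
                continue
            labels[v] = cluster_id
            v_nbrs = neighbors(v)
            if len(v_nbrs) >= min_samples:
                queue.extend(v_nbrs)  # v is a core point: keep expanding
        cluster_id += 1
    return labels


# Hypothetical sample data, invented for this example.
sample_activity = {
    "alice": {"hashtags": ["VoteYes", "Reform"], "retweeted": ["yes_campaign"]},
    "bob": {"hashtags": ["VoteYes"], "retweeted": ["yes_campaign"]},
    "carol": {"hashtags": ["VoteNo", "Tradition"], "retweeted": ["no_campaign"]},
    "dave": {"hashtags": ["VoteNo"], "retweeted": ["no_campaign"]},
    "erin": {"hashtags": ["Cats"], "retweeted": []},
}

labels = dbscan(build_features(sample_activity), eps=0.6, min_samples=2)
print(labels)
```

On this toy input, users who share hashtags and retweeted accounts fall into the same cluster, and a user with no overlap is marked as noise. With real data, a library implementation (for example scikit-learn's DBSCAN or MeanShift applied after t-SNE or UMAP) would likely be the more practical choice.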
Apache Airflow, Terraform
ETL, Business Intelligence (BI)
Azure, Google Cloud Platform (GCP), Kubernetes
Data Pipelines, Redis, Azure SQL, PostgreSQL, Azure Cosmos DB
Software Engineering, Data Engineering, ETL Tools, APIs, Cloud, Machine Learning, Airbyte, Data Visualization, Data Scraping, Data Analytics, Consulting, Costs, FastAPI
Master's Degree in Computer Science
Information Technology University of the Punjab - Lahore, Punjab, Pakistan
Bachelor's Degree in Electrical Engineering
National University of Sciences and Technology - Islamabad, Pakistan
Microsoft Azure Data Engineer Associate