
Ali Ashfaq

Verified Expert in Engineering

Data Engineer and Developer

Location
Lahore, Punjab, Pakistan
Toptal Member Since
June 7, 2022

Ali is a Google Certified Professional Data Engineer with 6+ years of experience in data engineering, database design, ETL development, and data warehouse testing with Google Cloud Platform (GCP) projects. He has delivered multiple data pipelines and end-to-end ETL processes in GCP. Ali is also skilled in data modeling for enterprise data warehouse implementation projects.

Portfolio

Coca-Cola Icecek
Analytics, BigQuery, Apache Airflow, Python 3, Data Build Tool (dbt)...
Dr. Barbara Sturm
Python, SQL, Google Cloud Platform (GCP), Google BigQuery, Looker Studio...
Uptraded GmbH
Data Analytics, Mixpanel, eCommerce, API Integration, English...

Experience

Availability

Part-time

Preferred Environment

PyCharm, Windows, Data Warehousing, Google Sheets, Data Warehouse Design, Analytics, Ad-hoc Reporting, Business Intelligence (BI), Dashboards, Data Engineering, Snowflake, PostgreSQL, Graph Databases, Real Estate, MySQL, Back-end Development, OLTP, OLAP, APIs

The most amazing...

...EDW I've developed is a credit scoring model that streamlined a bank's loan approval and disbursement processes, increasing revenue by $500 million.

Work Experience

Senior Data Engineer

2022 - PRESENT
Coca-Cola Icecek
  • Migrated SAP jobs to Apache Airflow on GCP using various Airflow operators. Built data pipelines in GCP using Python, carrying out data transformations and cleansing with SQL queries and Python. Followed CI/CD best practices using GitHub.
  • Managed Jira by overseeing project workflows, assigning tasks, tracking progress, and ensuring timely delivery. Implemented efficient ticketing systems and collaborated with teams.
  • Facilitated efficient data exchange between SQL Server and GCP, pushing order recommendations to a database connected to the mobile app. This streamlined sales, enhanced inventory management, and boosted profitability.
Technologies: Analytics, BigQuery, Apache Airflow, Python 3, Data Build Tool (dbt), Google Analytics 4, SAP, GitHub, SQL, Looker, Microsoft Power BI, Google Cloud Storage, Google Cloud Functions, Cloud Dataflow, Google Pub/Sub, Streaming Data, Google Compute Engine (GCE), Docker Cloud, Google Kubernetes Engine (GKE), Google Container Engine, Data Modeling, PostgreSQL, OLAP, APIs, Data Integration, Pub/Sub, Google Cloud Dataproc, Terraform, Parquet, Google Cloud, Looker Studio, HubSpot, API Integration, English, Query Optimization, Data Processing, Databricks, Unstructured Data Analysis, Dashboard Development, Reporting, Amazon Web Services (AWS), Database Administration (DBA), Data Architecture, Database Replication, Performance Tuning, Sharding, Microsoft SQL Server, Architecture, Cloud Infrastructure, Database Migration, Database Architecture, Data Structures, AWS Lambda, Node.js, Advisory, Consulting, Full-stack, Machine Learning, Infrastructure, Spark
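The pipeline bullets above mention Python-based transformation and cleansing before loading into BigQuery. As a minimal, purely illustrative sketch of that kind of step (the field names and date format are assumptions, not the project's actual schema):

```python
from datetime import datetime

def cleanse_order(row: dict) -> dict:
    """Normalize one raw order record before it is loaded into the warehouse."""
    return {
        # Strip stray whitespace from identifiers
        "order_id": str(row["order_id"]).strip(),
        # Normalize customer names to title case
        "customer": row.get("customer", "").strip().title(),
        # Coerce amounts to floats, defaulting missing values to 0, rounded to cents
        "amount": round(float(row.get("amount") or 0.0), 2),
        # Convert a source-system date (DD.MM.YYYY) to ISO 8601
        "order_date": datetime.strptime(row["order_date"], "%d.%m.%Y").date().isoformat(),
    }

raw = {"order_id": " 1001 ", "customer": "  acme corp ",
       "amount": "19.996", "order_date": "07.06.2022"}
print(cleanse_order(raw))
```

In a real Airflow pipeline, a function like this would typically run inside a PythonOperator task between extraction and load.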

Lead Data Engineer, Web Analytics and Insights

2023 - 2024
Dr. Barbara Sturm
  • Engineered an advanced ETL pipeline that consolidated Clickstream data into Google BigQuery, enhancing data-driven strategies for a leading European online cosmetic retailer.
  • Developed complex SQL queries in BigQuery, processing and analyzing vast datasets to optimize digital marketing efforts, resulting in measurable improvements in customer engagement and sales.
  • Implemented and fine-tuned data visualizations in Looker Studio, presenting real-time sales, vouchers, and promotional data, which significantly supported decision-making processes for marketing and sales teams.
  • Streamlined the integration of Google Analytics data into BigQuery, employing custom SQL scripts to extract nuanced insights into user behavior, which informed and enhanced the online retail marketing strategy for a premier skincare brand.
Technologies: Python, SQL, Google Cloud Platform (GCP), Google BigQuery, Looker Studio, Google Cloud Functions, Google Cloud Dataproc, Google Analytics 4, Database Analytics, Web Analytics, Social Media Web Traffic, Digital Marketing, Dashboards, English, Query Optimization, Data Processing, Databricks, REST APIs, Unstructured Data Analysis, Dashboard Development, Reporting, Database Administration (DBA), Data Architecture, Database Performance, Database Replication, Performance Tuning, Sharding, Microsoft SQL Server, Architecture, Cloud Infrastructure, Databases, Database Migration, Database Architecture, Data Structures, SQL Server DBA, AWS Lambda, Node.js, Advisory, Consulting, Full-stack, Infrastructure
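The clickstream work described above hinges on sessionizing raw events. As a rough Python illustration of the underlying logic (the 30-minute gap is a common web-analytics convention, not necessarily the project's actual rule):

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # conventional inactivity threshold

def count_sessions(timestamps):
    """Count sessions in one user's clickstream: a new session starts
    whenever the gap to the previous event exceeds SESSION_GAP."""
    ts = sorted(datetime.fromisoformat(t) for t in timestamps)
    sessions = 1 if ts else 0
    for prev, cur in zip(ts, ts[1:]):
        if cur - prev > SESSION_GAP:
            sessions += 1
    return sessions

events = ["2024-01-01T10:00:00", "2024-01-01T10:10:00", "2024-01-01T12:00:00"]
print(count_sessions(events))  # 2
```

In BigQuery, the same logic is usually expressed with `LAG()` over a window partitioned by user and ordered by event time.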

Data Analyst

2023 - 2024
Uptraded GmbH
  • Analyzed user interaction data within the Uptraded app using Mixpanel, providing insights that led to a 20% improvement in user conversion rates over four weeks.
  • Streamlined the data collection framework to ensure the capture of meaningful analytics, optimizing the Mixpanel setup for future data-driven strategies.
  • Conducted in-depth analysis of secondhand fashion consumer trends, contributing to a platform overhaul that emphasized circular fashion and increased app retention by 15%.
  • Collaborated with cross-functional teams, translating complex data into actionable strategies aligned with Uptraded's mission of sustainable fashion consumption.
  • Facilitated knowledge transfer sessions for the Uptraded team, empowering them with the analytical skills necessary to leverage Mixpanel for ongoing conversion optimization initiatives.
Technologies: Data Analytics, Mixpanel, eCommerce, API Integration, English, Query Optimization, Data Processing, REST APIs, Unstructured Data Analysis, Dashboard Development, Reporting, Database Administration (DBA), Data Architecture, Database Performance, Database Replication, Performance Tuning, Sharding, Microsoft SQL Server, Architecture, Back-end, Cloud Infrastructure, Databases, Database Migration, Database Architecture, Data Structures, Advisory, Consulting, Full-stack, Infrastructure
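Conversion-rate analysis of the kind described above typically starts from funnel step counts exported from Mixpanel. A small sketch of the arithmetic (the funnel steps and numbers are invented for illustration):

```python
def funnel_conversion(step_counts):
    """Given ordered funnel step counts (e.g. Mixpanel event totals),
    return step-to-step conversion rates as percentages."""
    rates = []
    for prev, cur in zip(step_counts, step_counts[1:]):
        # Guard against a zero denominator at an empty step
        rates.append(round(100.0 * cur / prev, 1) if prev else 0.0)
    return rates

# install -> item viewed -> swap initiated -> swap completed (illustrative)
print(funnel_conversion([1000, 620, 180, 90]))  # [62.0, 29.0, 50.0]
```

The weakest step-to-step rate is usually where optimization effort is focused first.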

GCP Data Engineer

2022 - 2024
Patrianna Limited
  • Led the optimization of ETL processes on GCP, which reduced data processing times by 40%, enhancing the app's performance for end-users.
  • Collaborated with a team of data analysts and financial experts to translate complex financial concepts into clear, actionable insights within the app, driving a user satisfaction score increase of 20%.
  • Implemented BigQuery solutions to handle complex queries over large datasets, enabling the app to calculate projected savings growth and net worth estimations rapidly.
  • Orchestrated secure and compliant data storage mechanisms using Cloud Storage with encryption at rest and in transit, adhering to financial data security regulations and best practices.
  • Automated data ingestion workflows using Cloud Dataflow and Cloud Composer, ensuring efficient and error-free data updates across user accounts for accurate financial tracking.
Technologies: Google BigQuery, Google Cloud Platform (GCP), SQL, ETL, Parquet, PostgreSQL, R, Cloud Dataflow, Spark SQL, Azure Databricks, Azure, Azure Synapse, Azure Data Factory, English, Query Optimization, Data Processing, Unstructured Data Analysis, Dashboard Development, Reporting, Amazon Web Services (AWS), Data Architecture, Database Performance, Database Replication, Performance Tuning, Architecture, Cloud Infrastructure, Databases, Database Migration, Database Architecture, Data Structures, SQL Server DBA, Node.js, Advisory, Consulting, Full-stack, Machine Learning, Infrastructure
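One bullet above mentions calculating projected savings growth for app users. A minimal sketch of such a projection with monthly compounding (the formula variant and parameters are assumptions; the production logic ran in BigQuery over user accounts):

```python
def projected_savings(principal, monthly_deposit, annual_rate, years):
    """Project a balance with monthly compounding and a fixed
    end-of-month deposit."""
    r = annual_rate / 12.0  # monthly rate
    balance = float(principal)
    for _ in range(years * 12):
        balance = balance * (1 + r) + monthly_deposit
    return round(balance, 2)

print(projected_savings(10_000, 500, 0.05, 10))
```

A closed-form annuity formula gives the same result; the loop form is easier to extend with variable deposits or rates.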

Senior Data Engineer

2022 - 2023
Tealbook
  • Developed a robust web scraping solution using Python, Scrapy, and Selenium. This enabled the efficient extraction of certification data from various sources. I transformed and preprocessed the data using Apache Airflow, ensuring consistency and quality.
  • Collaborated with cross-functional teams, using data engineering best practices. Leveraging DBT, I defined data models and ETL processes. Robust data-quality checks and monitoring mechanisms addressed integrity and consistency issues.
  • Delivered reliable and accurate certification data that supported informed decision-making and drove business growth.
Technologies: Analytics, Apache Airflow, BigQuery, Business Intelligence (BI), Dashboards, Data Analysis, Ad-hoc Reporting, Data Build Tool (dbt), Data Engineering, Data Analytics, Data Pipelines, Scrapy, Selenium, Python, Data Visualization, Data Science, Data Warehousing, Data Modeling, OLAP, APIs, Data Integration, Terraform, Parquet, Google Cloud, MongoDB, Looker Studio, HubSpot, English, Query Optimization, Data Processing, Unstructured Data Analysis, Dashboard Development, Reporting, Data Architecture, Performance Tuning, Architecture, Cloud Infrastructure, Databases, Database Migration, Database Architecture, Data Structures, SQL Server DBA, AWS Lambda, Node.js, Advisory, Consulting, Full-stack, Machine Learning, Infrastructure, Spark
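The extraction work above used Scrapy and Selenium; as a dependency-free illustration of the same idea, here is a stdlib `html.parser` sketch that pulls certification names out of markup (the `class="certification"` structure is invented for the example):

```python
from html.parser import HTMLParser

class CertificationParser(HTMLParser):
    """Collect the text of <li class="certification"> items."""

    def __init__(self):
        super().__init__()
        self.in_cert = False
        self.certifications = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "certification") in attrs:
            self.in_cert = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_cert = False

    def handle_data(self, data):
        if self.in_cert and data.strip():
            self.certifications.append(data.strip())

page = ('<ul><li class="certification">ISO 9001</li>'
        '<li>Other</li><li class="certification">SOC 2</li></ul>')
parser = CertificationParser()
parser.feed(page)
print(parser.certifications)  # ['ISO 9001', 'SOC 2']
```

Scrapy replaces this with CSS/XPath selectors and handles crawling, retries, and throttling; Selenium is added only when the target pages render content with JavaScript.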

Senior Data Analytics Consultant

2020 - 2022
Systems Limited
  • Built and architected multiple data pipelines and end-to-end ETL processes for data ingestion and transformation in GCP.
  • Prepared and coordinated tasks among the team I managed.
  • Designed and created various layers of the data lake.
  • Executed test cases to identify potential issues with ETL jobs and ensure data sanity and integrity in the database.
  • Integrated more than 20 sources to give customers a 360-degree view.
Technologies: Data Analytics, Data Science, Data Engineering, Data Visualization, Google Cloud Platform (GCP), BigQuery, Google Data Studio, Apache Airflow, Google Cloud Functions, Google Compute Engine (GCE), Neo4j, Python, Google BigQuery, SQL, ETL, PostgreSQL, Data Pipelines, Data Warehousing, Dashboards, Business Intelligence (BI), Google Analytics, Microsoft Excel, Data Analysis, Screaming Frog, Google Sheets, Google Search Console, Google SEO, Microsoft Power BI, SEO Tools, Technical Project Management, Search Engine Optimization (SEO), Data Warehouse Design, Looker, Data Build Tool (dbt), Analytics, Ad-hoc Reporting, Data Modeling, Graph Databases, Real Estate, MySQL, Back-end Development, OLAP, APIs, Data Integration, Terraform, Parquet, Google Cloud, MongoDB, Looker Studio, API Integration, Query Optimization, Data Processing, Unstructured Data Analysis, Dashboard Development, Reporting, Data Architecture, Performance Tuning, Architecture, Back-end, Cloud Infrastructure, Apache Beam, Databases, Database Migration, Database Architecture, Data Structures, Advisory, Consulting, Full-stack, Infrastructure
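Building a 360-degree customer view from 20+ sources, as described above, amounts to merging per-source records under a shared key. A toy sketch of the merge rule (field names and precedence are illustrative; the real work was done in SQL across data lake layers):

```python
def customer_360(*sources):
    """Merge per-source customer records (keyed by customer_id) into one
    unified view; earlier sources take precedence, later ones fill gaps."""
    unified = {}
    for source in sources:
        for record in source:
            merged = unified.setdefault(record["customer_id"], {})
            for key, value in record.items():
                if value is not None and key not in merged:
                    merged[key] = value
    return unified

crm = [{"customer_id": 1, "name": "Asma", "email": None}]
web = [{"customer_id": 1, "email": "asma@example.com", "last_visit": "2022-01-05"}]
print(customer_360(crm, web)[1])
```

The precedence rule (which source wins on conflicts) is the key design decision in any such consolidation.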

ETL/BI Developer

2017 - 2020
Analytics Private Limited
  • Managed and mentored a team of five and collaborated with clients to understand their data needs and maintain a close working relationship.
  • Performed data modeling and led an enterprise data warehouse (EDW) implementation project.
  • Designed and developed end-to-end ETL pipelines and tested data validation processes before loading data into the warehouse.
  • Identified and implemented methodologies to ensure data integrity and quality.
Technologies: Data Engineering, ETL Tools, Talend ETL, IBM InfoSphere (DataStage), Tableau, IBM Cognos, SQL, ETL, Data Pipelines, Data Warehousing, Dashboards, Business Intelligence (BI), Technical Project Management, Data Warehouse Design, Data Build Tool (dbt), Analytics, Ad-hoc Reporting, Data Modeling, PostgreSQL, Real Estate, MySQL, Back-end Development, OLTP, OLAP, APIs, Data Integration, Parquet, MongoDB, Looker Studio, API Integration, Query Optimization, Data Processing, Unstructured Data Analysis, Dashboard Development, Reporting, Data Architecture, Performance Tuning, Architecture, Cloud Infrastructure, Databases, Database Migration, Database Architecture, Data Structures, Advisory, Consulting, Full-stack, Infrastructure
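Pre-load data validation of the kind mentioned above usually takes the form of per-row quality checks in the staging layer. A minimal sketch (field names and rules are illustrative; the actual checks ran in DataStage/Talend jobs):

```python
def validate_row(row, required=("id", "amount", "created_at")):
    """Return a list of data-quality issues for one staging row;
    rows with issues are rejected before the warehouse load."""
    issues = []
    for field in required:
        if row.get(field) in (None, ""):
            issues.append(f"missing {field}")
    amt = row.get("amount")
    if isinstance(amt, (int, float)) and amt < 0:
        issues.append("negative amount")
    return issues

print(validate_row({"id": None, "amount": -5}))
```

Rejected rows are typically routed to an error table with their issue list so source-system owners can correct them.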

Core Modules for Easy Buy

https://www.mattressfirm.com/
Built the core modules of one of the biggest eCommerce systems in the USA.

The project used analytics and business intelligence (BI) to improve customer experience and engagement, retain potential customers, offer them the best choice, enhance the system's efficiency, and increase sales based on historical and incremental data.

I designed and built multiple end-to-end ETL processes for data ingestion and transformation in Google Cloud Platform (GCP) and coordinated tasks among the team. The core modules allowed the system to integrate data from various heterogeneous source platforms, including Google Cloud, Amazon S3 buckets, MuleSoft APIs, and SFTP. The ETL process incorporated data into the GCP deep learning and BigQuery DWH layers and ingested it into the Neo4j graph database.

I used Python, SQL, and the GCP stack for data processing and Apache Airflow for orchestration during the project. I used Cloud Functions with Python to load on-arrival CSV files from the Google Cloud Storage (GCS) bucket into BigQuery and maintained raw file archival in Cloud Storage.

EDW for a Private Credit Bureau

https://tasdeeq.com/
Created a credit scoring model to streamline the bank's loan approval and disbursement processes, which increased revenue by $500 million.

My main contributions included the ETL process, EDW architecture, and aggregated data set (ADS), using technologies such as Vertica, IBM InfoSphere DataStage, and IBM Cognos.

I developed the model's architecture that was successfully launched with over 90% accuracy and efficiency. It features the ETL process of extracting loan/lease transactional data from 70+ financial institutions and loading the transformed data into the DWH.

I also prepared an ADS to aid the formation of the statistical model. It can predict and categorize the customer/borrower as either good, average, or bad and assign a default probability to every customer based on the information provided to the model.
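The ADS described above maps each borrower's modeled default probability to a good/average/bad category. A toy sketch of that final mapping step (the thresholds here are invented for illustration, not the bureau's actual cut-offs):

```python
def score_band(default_probability):
    """Map a model's default probability to a borrower category."""
    if default_probability < 0.05:
        return "good"
    if default_probability < 0.20:
        return "average"
    return "bad"

for p in (0.01, 0.12, 0.40):
    print(p, score_band(p))
```

In practice, such cut-offs are calibrated against historical default rates so each band carries a known expected loss.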

Unified Analytics Solution for Locallogy

https://www.locallogy.com/
I empowered Locallogy, a small business digital marketing agency, with a comprehensive data warehouse solution that revolutionized its in-depth analysis capabilities. This solution enabled me to consolidate and analyze data from multiple sources, providing powerful insights through a unified dashboard.

To achieve this, I designed a robust data flow incorporating data from various APIs, such as Screaming Frog and AWR (Advanced Web Ranking). I created a potent combination of data sources by integrating Google Analytics and Google Search Console with BigQuery. Additionally, I implemented an interface on Google Sheets, seamlessly connected to BigQuery, facilitating collaboration across the organization. This adaptable solution allowed the team to efficiently insert, update, and delete data from BigQuery, enhancing their ability to deliver robust digital marketing solutions for their clients.

From a technical perspective, I leveraged several Google Cloud Platform (GCP) services, including Compute Engine, Cloud Scheduler, and BigQuery. I employed VM instances running Python ETL scripts to capture and process data, ensuring seamless integration with the APIs.

BI Solution for Crowdbotics

https://www.crowdbotics.com/
I significantly enhanced the performance of an intermediary company by delivering a robust business intelligence solution.

By implementing an end-to-end data pipeline, I successfully gathered, transformed, and loaded data from various sources, such as Heroku, Stripe, HubSpot, Toggl, and Google Sheets, into a centralized BigQuery data warehouse. The system included comprehensive dashboards tailored for executives, project managers, team leads, and recruitment departments, enabling them to make quick, informed, data-driven decisions.

I used dbt and SQL in this project, integrating with GitHub. To accommodate the substantial volume of daily data generated by 5,000+ projects and 1,000+ developers, I designed and implemented efficient ETL (extract, transform, load) jobs.

These jobs seamlessly combined data from multiple sources, resulting in a unified and reliable source of truth for the organization. Through this implementation, the company gained a comprehensive and streamlined data infrastructure, providing accurate insights for enhanced decision-making.

Technically, the core data model was designed using dbt, BigQuery SQL, and Looker.

Optimizing Decision-making with M&M Data Warehouse Architecture

The M&M Data Warehouse architecture was developed to optimize the integration of large datasets extracted from multiple heterogeneous source systems. This architecture aims to enhance the efficiency of the decision-making process for Marcus & Millichap's legal property system in the United States.
By streamlining the information flow, it enables better insights and more informed decision-making. The M&M Data Warehouse architecture encompasses multiple areas, including the extract, transform, load (ETL) process, which ensures seamless data incorporation into the warehouse after transformation and cleaning.

Data Engineering for an Online Cosmetic Retailer

https://eu.drsturm.com/
As a senior data engineer for eu.drsturm.com, an online cosmetic retailer operating in Europe and America, I spearheaded the integration and analysis of complex clickstream data sets into BigQuery. My role encompassed architecting and executing a robust ETL pipeline to efficiently transport data, utilizing Dataproc for advanced processing and optimization tasks.

I transformed raw data into actionable insights visualized through Looker Studio, covering critical business metrics such as sales performance, voucher redemption rates, and the effectiveness of promotional campaigns.

My contributions were pivotal in enhancing data-driven decision-making, optimizing marketing strategies, and ultimately driving the company's growth. The work showcased both my technical expertise and my ability to collaborate closely with cross-functional teams to align data analytics with business objectives.
Education

2013 - 2017

Bachelor's Degree in Computer Science

University of Central Punjab (UCP) - Lahore, Pakistan

Certifications

NOVEMBER 2022 - NOVEMBER 2024

Google Professional Cloud Architect

Google Cloud

OCTOBER 2021 - PRESENT

Microsoft Certified Azure Data Engineer Associate

Microsoft Learn

OCTOBER 2021 - OCTOBER 2023

Professional Data Engineer

Google Cloud

SEPTEMBER 2021 - PRESENT

Advanced Google Analytics

Google

Libraries/APIs

REST APIs, Node.js, Pandas, Stripe

Tools

Talend ETL, Apache Airflow, Microsoft Excel, Screaming Frog, Google Sheets, Looker, Terraform, IBM InfoSphere (DataStage), BigQuery, Google Analytics, Microsoft Power BI, Power Query, Google Cloud Dataproc, Apache Beam, PyCharm, Tableau, IBM Cognos, Google Compute Engine (GCE), GitHub, Cloud Dataflow, Google Kubernetes Engine (GKE), Jira, Stitch Data, Toggl, Zapier, Spark SQL, Google Cloud Composer

Languages

SQL, Python, C++, Python 3, Snowflake, Java, JavaScript, R

Platforms

Azure, HubSpot, Google Cloud Platform (GCP), Amazon, Databricks, Amazon Web Services (AWS), AWS Lambda, Linux, Heroku, Azure Synapse, Mixpanel

Paradigms

ETL, Business Intelligence (BI), OLAP, Search Engine Optimization (SEO), Data Science

Storage

PostgreSQL, Data Pipelines, MySQL, Data Integration, Google Cloud, MongoDB, Neo4j, Graph Databases, OLTP, Redshift, Database Replication, SQL Server DBA, Vertica, Google Cloud Storage, Docker Cloud, Database Administration (DBA), Database Performance, Microsoft SQL Server, Databases, Database Migration, Database Architecture

Frameworks

Spark, Scrapy, Selenium

Other

Data Analytics, Data Visualization, Google Data Studio, Google Cloud Functions, Data Engineering, Google BigQuery, Data Warehousing, Dashboards, Data Analysis, Google SEO, Technical Project Management, Data Warehouse Design, Data Modeling, Back-end Development, APIs, Parquet, Looker Studio, API Integration, English, Query Optimization, ETL Tools, Google Search Console, SEO Tools, Data Build Tool (dbt), Analytics, Ad-hoc Reporting, Amazon RDS, DAX, Real Estate, Pub/Sub, Performance Tuning, Data-level Security, Machine Learning, Full-stack, Web Development, Google Analytics 4, SAP, Google Pub/Sub, Streaming Data, Google Container Engine, HubSpot CRM, AWR, Designing for Data, Azure Databricks, Azure Data Factory, eCommerce, Google Cloud Dataflow, Web Analytics, Social Media Web Traffic, Digital Marketing, Database Analytics, Data Processing, Unstructured Data Analysis, Dashboard Development, Reporting, Data Architecture, Sharding, Architecture, Back-end, Cloud Infrastructure, Data Structures, Advisory, Consulting, ELT, Infrastructure
