Ali Ashfaq
Verified Expert in Engineering
Data Engineer and Developer
Ali is a Google Certified Professional Data Engineer with 6+ years of experience in data engineering, database design, ETL development, and data warehouse testing with Google Cloud Platform (GCP) projects. He has delivered multiple data pipelines and end-to-end ETL processes in GCP. Ali is also skilled in data modeling for enterprise data warehouse implementation projects.
Preferred Environment
PyCharm, Windows, Data Warehousing, Google Sheets, Data Warehouse Design, Analytics, Ad-hoc Reporting, Business Intelligence (BI), Dashboards, Data Engineering, Snowflake, PostgreSQL, Graph Databases, Real Estate, MySQL, Back-end Development, OLTP, OLAP, APIs
The most amazing...
...EDW I've developed is a credit scoring model that streamlined the bank's loan approval and disbursement processes, increasing revenue by $500 million.
Work Experience
Senior Data Engineer
Coca-Cola Icecek
- Migrated SAP jobs to Apache Airflow on GCP using a range of Airflow operators. Built data pipelines in GCP using Python, with data transformation and cleansing handled in SQL and Python. Followed CI/CD best practices using GitHub.
- Managed Jira by overseeing project workflows, assigning tasks, tracking progress, and ensuring timely delivery. Implemented efficient ticketing systems and collaborated with teams.
- Facilitated efficient data exchange between SQL Server and GCP, enabling orders by pushing recommendations to a mobile app-connected database. This streamlined sales, enhanced inventory management, and boosted profitability.
Lead Data Engineer, Web Analytics and Insights
Dr. Barbara Sturm
- Engineered an advanced ETL pipeline that consolidated clickstream data into Google BigQuery, enhancing data-driven strategies for a leading European online cosmetic retailer.
- Developed complex SQL queries in BigQuery, processing and analyzing vast datasets to optimize digital marketing efforts, resulting in measurable improvements in customer engagement and sales.
- Implemented and fine-tuned data visualizations in Looker Studio, presenting real-time sales, vouchers, and promotional data, which significantly supported decision-making processes for marketing and sales teams.
- Streamlined the integration of Google Analytics data into BigQuery, employing custom SQL scripts to extract nuanced insights into user behavior, which informed and enhanced the online retail marketing strategy for a premier skincare brand.
Data Analyst
Uptraded GmbH
- Analyzed user interaction data within the Uptraded app using Mixpanel, providing insights that led to a 20% improvement in user conversion rates over four weeks.
- Streamlined the data collection framework to ensure the capture of meaningful analytics, optimizing the Mixpanel setup for future data-driven strategies.
- Conducted in-depth analysis of secondhand fashion consumer trends, contributing to a platform overhaul that emphasized circular fashion and increased app retention by 15%.
- Collaborated with cross-functional teams, translating complex data into actionable strategies aligned with Uptraded's mission of sustainable fashion consumption.
- Facilitated knowledge transfer sessions for the Uptraded team, empowering them with the analytical skills necessary to leverage Mixpanel for ongoing conversion optimization initiatives.
GCP Data Engineer
Patrianna Limited
- Led the optimization of ETL processes on GCP, which reduced data processing times by 40%, enhancing the app's performance for end-users.
- Collaborated with a team of data analysts and financial experts to translate complex financial concepts into clear, actionable insights within the app, driving a user satisfaction score increase of 20%.
- Implemented BigQuery solutions to handle complex queries over large datasets, enabling the app to calculate projected savings growth and net worth estimations rapidly.
- Orchestrated secure and compliant data storage mechanisms using Cloud Storage with encryption at rest and in transit, adhering to financial data security regulations and best practices.
- Automated data ingestion workflows using Cloud Dataflow and Cloud Composer, ensuring efficient and error-free data updates across user accounts for accurate financial tracking.
Senior Data Engineer
Tealbook
- Developed a robust web scraping solution using Python, Scrapy, and Selenium. This enabled the efficient extraction of certification data from various sources. I transformed and preprocessed the data using Apache Airflow, ensuring consistency and quality.
- Collaborated with cross-functional teams, using data engineering best practices. Leveraging DBT, I defined data models and ETL processes. Robust data-quality checks and monitoring mechanisms addressed integrity and consistency issues.
- Delivered reliable and accurate certification data that supports informed decision-making and drives business growth.
Senior Data Analytics Consultant
Systems Limited
- Built and architected multiple data pipelines and end-to-end ETL processes for data ingestion and transformation in GCP.
- Prepared and coordinated tasks among the team I managed.
- Designed and created various layers of the data lake.
- Executed test cases to identify potential issues with ETL jobs and ensure data sanity and integrity in the database.
- Integrated more than 20 sources to give customers a 360-degree view.
ETL/BI Developer
Analytics Private Limited
- Managed and mentored a team of five and collaborated with clients to understand their data needs and maintain a close working relationship.
- Performed data modeling and led an enterprise data warehouse (EDW) implementation project.
- Designed and developed end-to-end ETL pipelines and tested processes for data validation before loading it into a data warehouse.
- Identified and implemented methodologies to ensure data integrity and quality.
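
A pre-load validation step like the one mentioned above could look roughly like the sketch below. This is an illustration only, not the project's code: the function name `validate_rows`, the column names, and the two checks (required fields present, no duplicate keys) are all assumptions chosen for the example.

```python
from typing import Dict, Iterable, List


def validate_rows(rows: Iterable[Dict], required: List[str], key: str) -> List[str]:
    """Return a list of problems; an empty list means the batch may be loaded."""
    problems = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # Check 1 (assumed rule): every required column must be present and non-empty.
        for col in required:
            if row.get(col) in (None, ""):
                problems.append(f"row {i}: missing {col}")
        # Check 2 (assumed rule): the business key must be unique within the batch.
        k = row.get(key)
        if k is not None and k in seen_keys:
            problems.append(f"row {i}: duplicate {key}={k}")
        seen_keys.add(k)
    return problems


if __name__ == "__main__":
    batch = [
        {"id": 1, "amount": 10},
        {"id": 1, "amount": None},
        {"id": 2, "amount": 5},
    ]
    for problem in validate_rows(batch, required=["id", "amount"], key="id"):
        print(problem)
```

In a pipeline, a non-empty result would typically stop the load or route the batch to a quarantine table, so that bad rows never reach the warehouse.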
Experience
Core Modules for Easy Buy
https://www.mattressfirm.com/
The project used analytics and business intelligence (BI) to improve customer experience and engagement, retain potential customers, offer them the best choice, enhance the system's efficiency, and increase sales based on historical and incremental data.
I designed and built multiple end-to-end ETL processes for data ingestion and transformation in the Google Cloud Platform (GCP) and coordinated tasks among the team. The core modules allowed the system to integrate data from various heterogeneous source platforms, including Google Cloud, Amazon S3 buckets, the MuleSoft API, and SFTP. The ETL process also loaded data into the GCP deep learning and BigQuery DWH and ingested it into the Neo4j graph database.
During the project, I used Python, SQL, and the GCP stack for data processing and Apache Airflow for data orchestration. A Google Cloud Function written in Python loaded CSV files into BigQuery as they arrived in the Google Cloud Storage (GCS) bucket, and the raw files were archived in Cloud Storage.
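
The on-arrival load described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual function: the names `LoadPlan`, `plan_csv_load`, and `on_file_arrival`, the `raw` dataset, the `archive/` prefix, and the rule that the file name determines the table name are all hypothetical. The BigQuery load and the archive copy are passed in as callables so the sketch runs without cloud credentials.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable, Optional


@dataclass(frozen=True)
class LoadPlan:
    source_uri: str    # gs:// URI of the file that just arrived
    table_id: str      # destination table as "dataset.table"
    archive_name: str  # object name for the archived raw copy


def plan_csv_load(event: dict, dataset: str = "raw",
                  archive_prefix: str = "archive") -> Optional[LoadPlan]:
    """Turn a GCS object event into a load plan; return None for non-CSV files."""
    bucket, name = event["bucket"], event["name"]
    path = PurePosixPath(name)
    if path.suffix.lower() != ".csv":
        return None
    # Assumed convention: the file stem becomes the table name.
    table = path.stem.replace("-", "_").lower()
    return LoadPlan(
        source_uri=f"gs://{bucket}/{name}",
        table_id=f"{dataset}.{table}",
        archive_name=f"{archive_prefix}/{name}",
    )


def on_file_arrival(event: dict, context=None,
                    load: Optional[Callable[[str, str], None]] = None,
                    archive: Optional[Callable[[str, str, str], None]] = None):
    """Entry point shaped like a storage-triggered function: (event, context)."""
    plan = plan_csv_load(event)
    if plan is None:
        return None
    if load is not None:
        # In a real deployment this callable might wrap
        # google.cloud.bigquery.Client.load_table_from_uri (an assumption here).
        load(plan.source_uri, plan.table_id)
    if archive is not None:
        # Likewise, this might wrap google.cloud.storage.Bucket.copy_blob.
        archive(event["bucket"], event["name"], plan.archive_name)
    return plan


if __name__ == "__main__":
    print(on_file_arrival({"bucket": "incoming", "name": "daily/Store-Sales.csv"}))
```

Separating the planning step from the side effects keeps the naming and routing logic easy to unit test, while the cloud client calls stay in thin wrappers.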
EDW for a Private Credit Bureau
https://tasdeeq.com/
My main contributions included the ETL process, EDW architecture, and aggregated data set (ADS), using technologies such as Vertica, IBM InfoSphere DataStage, and IBM Cognos.
I developed the model's architecture that was successfully launched with over 90% accuracy and efficiency. It features the ETL process of extracting loan/lease transactional data from 70+ financial institutions and loading the transformed data into the DWH.
I also prepared an ADS to aid the formation of the statistical model. It can predict and categorize the customer/borrower as either good, average, or bad and assign a default probability to every customer based on the information provided to the model.
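
To make the categorization step concrete, here is a small sketch of how a default probability might be mapped to the three borrower categories. The function name `risk_band` and the cut-off values (0.10 and 0.30) are purely hypothetical; the actual model's thresholds and inputs are not described here.

```python
def risk_band(default_probability: float,
              good_below: float = 0.10,
              bad_from: float = 0.30) -> str:
    """Map a default probability to 'good', 'average', or 'bad'.

    The thresholds are illustrative assumptions, not the bureau's real cut-offs.
    """
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    if default_probability < good_below:
        return "good"
    if default_probability < bad_from:
        return "average"
    return "bad"


if __name__ == "__main__":
    for p in (0.05, 0.20, 0.45):
        print(p, risk_band(p))
```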
Unified Analytics Solution for Locallogy
https://www.locallogy.com/
To achieve this, I designed a robust data flow incorporating data from various APIs, such as Screaming Frog and AWR (Advanced Web Ranking). I created a potent combination of data sources by integrating Google Analytics and Google Search Console with BigQuery. Additionally, I implemented an interface on Google Sheets, seamlessly connected to BigQuery, facilitating collaboration across the organization. This adaptable solution allowed the team to efficiently insert, update, and delete data from BigQuery, enhancing their ability to deliver robust digital marketing solutions for their clients.
From a technical perspective, I leveraged several Google Cloud Platform (GCP) services, including Compute Engine, Cloud Scheduler, and BigQuery. I employed VM instances running Python ETL scripts to capture and process data, ensuring seamless integration with the APIs.
BI Solution for Crowdbotics
https://www.crowdbotics.com/
By implementing an end-to-end data pipeline, I gathered, transformed, and loaded data from various sources, such as Heroku, Stripe, HubSpot, Toggl, and Google Sheets, into a centralized BigQuery data warehouse. The system included comprehensive dashboards tailored for executives, project managers, team leads, and recruitment departments, enabling them to make quick, informed, data-driven decisions.
I used DBT and SQL in this project, integrating with GitHub. To accommodate the substantial volume of daily data generated by more than 5,000 projects and 1,000 developers, I designed and implemented efficient ETL (extract, transform, load) jobs.
These jobs seamlessly combined data from multiple sources, resulting in a unified and reliable source of truth for the organization. Through this implementation, the company gained a comprehensive and streamlined data infrastructure, providing accurate insights for enhanced decision-making.
Technically, the core data model was designed using DBT, GCP BigQuery SQL, and Looker.
Optimizing Decision-making with M&M Data Warehouse Architecture
By streamlining the information flow, it enables better insights and more informed decision-making. The M&M Data Warehouse architecture encompasses multiple areas, including the extract, transform, load (ETL) process, which ensures seamless data incorporation into the warehouse after transformation and cleaning.
Data Engineering for an Online Cosmetic Retailer
https://eu.drsturm.com/
Education
Bachelor's Degree in Computer Science
University of Central Punjab (UCP) - Lahore, Pakistan
Certifications
Google Professional Cloud Architect
Google Cloud
Microsoft Certified Azure Data Engineer Associate
Microsoft Learn
Professional Data Engineer
Google Cloud
Advanced Google Analytics
Skills
Libraries/APIs
REST APIs, Node.js, Pandas, Stripe
Tools
Talend ETL, Apache Airflow, Microsoft Excel, Screaming Frog, Google Sheets, Looker, Terraform, IBM InfoSphere (DataStage), BigQuery, Google Analytics, Microsoft Power BI, Power Query, Google Cloud Dataproc, Apache Beam, PyCharm, Tableau, IBM Cognos, Google Compute Engine (GCE), GitHub, Cloud Dataflow, Google Kubernetes Engine (GKE), Jira, Stitch Data, Toggl, Zapier, Spark SQL, Google Cloud Composer
Languages
SQL, Python, C++, Python 3, Snowflake, Java, JavaScript, R
Platforms
Azure, HubSpot, Google Cloud Platform (GCP), Amazon, Databricks, Amazon Web Services (AWS), AWS Lambda, Linux, Heroku, Azure Synapse, Mixpanel
Paradigms
ETL, Business Intelligence (BI), OLAP, Search Engine Optimization (SEO), Data Science
Storage
PostgreSQL, Data Pipelines, MySQL, Data Integration, Google Cloud, MongoDB, Neo4j, Graph Databases, OLTP, Redshift, Database Replication, SQL Server DBA, Vertica, Google Cloud Storage, Docker Cloud, Database Administration (DBA), Database Performance, Microsoft SQL Server, Databases, Database Migration, Database Architecture
Frameworks
Spark, Scrapy, Selenium
Other
Data Analytics, Data Visualization, Google Data Studio, Google Cloud Functions, Data Engineering, Google BigQuery, Data Warehousing, Dashboards, Data Analysis, Google SEO, Technical Project Management, Data Warehouse Design, Data Modeling, Back-end Development, APIs, Parquet, Looker Studio, API Integration, English, Query Optimization, ETL Tools, Google Search Console, SEO Tools, Data Build Tool (dbt), Analytics, Ad-hoc Reporting, Amazon RDS, DAX, Real Estate, Pub/Sub, Performance Tuning, Data-level Security, Machine Learning, Full-stack, Web Development, Google Analytics 4, SAP, Google Pub/Sub, Streaming Data, Google Container Engine, HubSpot CRM, AWR, Designing for Data, Azure Databricks, Azure Data Factory, eCommerce, Google Cloud Dataflow, Web Analytics, Social Media Web Traffic, Digital Marketing, Database Analytics, Data Processing, Unstructured Data Analysis, Dashboard Development, Reporting, Data Architecture, Sharding, Architecture, Back-end, Cloud Infrastructure, Data Structures, Advisory, Consulting, ELT, Infrastructure