
Zohaa Qamar

Verified Expert in Engineering

Big Data Developer

Location
Dartmouth, NS, Canada
Toptal Member Since
December 27, 2021

With over eight years of experience across diverse areas, Zohaa has developed data lake solutions, ETL pipelines, dashboards, web applications, and back-end web services. She is eager to dig into data and turn it into meaningful information that helps stakeholders make critical decisions for their business. Zohaa can handle high volumes of data from varied sources, delivering robust, scalable, and fault-tolerant solutions.

Portfolio

NorthBay Solutions
Amazon Web Services (AWS), Apache Airflow, Python, PySpark, SQL, Apache Spark...
Techlogix
Pentaho, Microsoft Power BI, SQL Server Management Studio (SSMS), C#.NET, Azure...

Experience

Availability

Part-time

Preferred Environment

Microsoft Power BI, Apache Airflow, Spark, SQL, Amazon Web Services (AWS), ETL, Data Engineering, Python, Snowflake, Tableau

The most amazing...

...thing I've revamped is a streaming data solution, replacing Amazon Kinesis Data Firehose with AWS Lambda to write streaming data into S3, greatly reducing costs.

Work Experience

Senior Big Data Engineer

2018 - PRESENT
NorthBay Solutions
  • Developed data lake solutions for our clients, from ingesting structured and unstructured data to transforming it according to business needs and consuming it through reports and dashboards built in Tableau.
  • Revamped a streaming data solution by replacing Amazon Kinesis Data Firehose with AWS Lambda, reducing the overall cost by 50% (a sketch of this approach follows the technology list below).
  • Used AWS Step Functions and Apache Airflow to orchestrate the entire data lake pipeline, integrating several Lambda functions and AWS Glue jobs.
  • Implemented auditing, logging, and a retry mechanism in the data lake pipeline.
  • Led a team of six developing back-end APIs in Python Flask for consumption by the front end. I was involved in sprint planning, gathering client requirements, managing sprint tasks, and helping the team resolve technical issues.
Technologies: Amazon Web Services (AWS), Apache Airflow, Python, PySpark, SQL, Apache Spark, AWS Lambda, AWS Glue, AWS Step Functions, Amazon RDS, Data Engineering, Data Lakes, Snowflake, Tableau, Amazon Elastic MapReduce (EMR)
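The Firehose-to-Lambda swap mentioned above can be pictured as follows. This is a minimal sketch rather than the actual implementation: the bucket name, key prefix, and record framing are assumptions, and it presumes a Kinesis event source mapping triggers the function with batches of records.

```python
import base64
import os
from datetime import datetime, timezone

import boto3  # AWS SDK, bundled with the Lambda Python runtime

s3 = boto3.client("s3")

# Hypothetical bucket and prefix; the real names are not in the profile.
BUCKET = os.environ.get("TARGET_BUCKET", "my-data-lake-raw")
PREFIX = os.environ.get("TARGET_PREFIX", "streaming/")


def handler(event, context):
    """Decode a batch of Kinesis records and write them to S3 as one
    newline-delimited object, replacing Firehose's buffered delivery."""
    lines = []
    for record in event["Records"]:
        # Kinesis payloads arrive base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        lines.append(payload.decode("utf-8"))

    # One object per invocation, keyed by arrival time, keeps the S3
    # layout date-partitioned for downstream Athena and Glue queries.
    now = datetime.now(timezone.utc)
    key = f"{PREFIX}{now:%Y/%m/%d}/{now:%H%M%S}-{context.aws_request_id}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body="\n".join(lines).encode("utf-8"))
    return {"records_written": len(lines)}
```

The saving comes from paying only for Lambda invocations and S3 puts instead of Firehose's per-GB delivery charge; the trade-off is handling batching and retries yourself.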

Senior Software Engineer

2014 - 2018
Techlogix
  • Developed the reporting system for in-house projects, bringing data from the source databases into a single report-friendly database; built the ETL process in Pentaho Spoon.
  • Generated analytics dashboards and KPIs in Microsoft Power BI for different clients.
  • Implemented a trade scheme system using C#.NET and Microsoft SQL to automate the schemes and promotions applied to various products, such as free-of-cost (FOC) products, percentage discounts, and others (a sketch of the rule logic follows the technology list below).
  • Communicated with clients, developed back-end APIs to be consumed by the front end, and monitored the QA cycle to ensure smooth project delivery.
Technologies: Pentaho, Microsoft Power BI, SQL Server Management Studio (SSMS), C#.NET, Azure, ETL, Business Intelligence (BI), Data Visualization, Dashboards, BI Reporting
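The trade scheme system itself was written in C#.NET with the rules stored in Microsoft SQL; purely to illustrate the kind of rule evaluation involved, here is a hedged Python sketch in which the scheme definitions, product IDs, and fields are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    product_id: str
    quantity: int
    unit_price: float


# Hypothetical scheme definitions; the real system stored these in SQL tables.
SCHEMES = {
    "WIDGET-1": {"type": "percentage", "discount": 0.10},   # 10% off
    "GADGET-2": {"type": "foc", "buy": 10, "free": 1},      # buy 10, get 1 free
}


def payable_amount(item: LineItem) -> float:
    """Return the amount payable for a line item after applying its scheme."""
    scheme = SCHEMES.get(item.product_id)
    gross = item.quantity * item.unit_price
    if scheme is None:
        return gross
    if scheme["type"] == "percentage":
        return gross * (1 - scheme["discount"])
    if scheme["type"] == "foc":
        # Of every (buy + free) units shipped, `free` units are at no charge.
        free_units = item.quantity // (scheme["buy"] + scheme["free"]) * scheme["free"]
        return (item.quantity - free_units) * item.unit_price
    return gross


# Example: 22 units of GADGET-2 -> 2 free units, customer pays for 20.
print(payable_amount(LineItem("GADGET-2", 22, 5.0)))  # 100.0
```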

Data Lake Solution for Adtech

Worked on optimizing the data lake solution of an adtech company, reducing the overall cost by around 50%. I orchestrated the pipeline using Apache Airflow to integrate jobs executed on Amazon EMR. In this process, I used AWS cloud services (Kinesis Data Streams, Lambda, S3, Athena, and EMR) along with Python and Spark to manage the processing of huge datasets. I also developed jobs to load the data from the data lake into Snowflake and created Tableau dashboards on top of it.
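A minimal sketch of what such an Airflow-orchestrated EMR job might look like, assuming hypothetical identifiers (the DAG ID, cluster ID, and script path are not from the profile); it submits a spark-submit step to a running cluster via boto3.

```python
from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical identifiers; the real cluster and script names are not public.
CLUSTER_ID = "j-XXXXXXXXXXXX"
SCRIPT = "s3://my-bucket/jobs/aggregate_events.py"


def submit_spark_step(**_):
    """Submit a spark-submit step to the EMR cluster and return its ID."""
    emr = boto3.client("emr")
    response = emr.add_job_flow_steps(
        JobFlowId=CLUSTER_ID,
        Steps=[{
            "Name": "aggregate-events",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", SCRIPT],
            },
        }],
    )
    return response["StepIds"][0]


with DAG(
    dag_id="adtech_data_lake",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_spark = PythonOperator(
        task_id="run_emr_spark_job",
        python_callable=submit_spark_step,
    )
```

In practice a second task would poll the step's status (or an EMR sensor would wait on it) before downstream loads into Snowflake run.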

Data Lake Solution

This complete data lake solution covers the ingestion of different data sources. The ingested structured data come from Oracle and MySQL databases, while the unstructured data are files or data fetched via an API; ingestion was written in Python. I then converted the data to Parquet, an efficient columnar format. Next, I used PySpark with the Spark SQL feature, executed on AWS Glue, to transform the data into facts and dimensions.
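A minimal sketch of that transform step, with hypothetical paths and column names; on Glue the session typically comes from a GlueContext, but a plain SparkSession is used here for brevity.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("facts-and-dimensions").getOrCreate()

# Hypothetical paths and columns; the real schema is not given in the profile.
orders = spark.read.parquet("s3://my-data-lake/curated/orders/")
orders.createOrReplaceTempView("orders")

# Dimension table: one row per customer.
dim_customer = spark.sql("""
    SELECT DISTINCT customer_id, customer_name, country
    FROM orders
""")

# Fact table: one row per order, keyed to the dimension.
fact_orders = spark.sql("""
    SELECT order_id, customer_id, order_date, amount
    FROM orders
""")

dim_customer.write.mode("overwrite").parquet("s3://my-data-lake/marts/dim_customer/")
fact_orders.write.mode("overwrite").parquet("s3://my-data-lake/marts/fact_orders/")
```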

Finally, the data were written to Amazon S3, with Amazon Athena used on top of S3 for ad hoc querying of the transformed data. I also created a process to load the transformed data into Amazon Redshift, making it consumable for end users. The whole pipeline was orchestrated using AWS Step Functions, with complete CloudWatch logging and monitoring.
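The consumption side might look roughly like this sketch, where the database, table, result bucket, cluster, and role ARN are all assumptions: an Athena query over the S3 output, and a Redshift COPY issued through the Redshift Data API.

```python
import boto3

# Ad hoc query over the S3 data via Athena (names are hypothetical).
athena = boto3.client("athena")
query = athena.start_query_execution(
    QueryString="SELECT order_date, SUM(amount) AS total "
                "FROM fact_orders GROUP BY order_date",
    QueryExecutionContext={"Database": "data_lake"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print(query["QueryExecutionId"])

# Loading the same Parquet output into Redshift is typically a COPY from S3.
copy_sql = """
    COPY analytics.fact_orders
    FROM 's3://my-data-lake/marts/fact_orders/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-load'
    FORMAT AS PARQUET;
"""
boto3.client("redshift-data").execute_statement(
    ClusterIdentifier="my-cluster",  # hypothetical cluster and user
    Database="analytics",
    DbUser="loader",
    Sql=copy_sql,
)
```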

Reporting Universe

I designed and developed a reporting system for multiple clients in which data from the source databases (Microsoft SQL Server hosted on Microsoft Azure) are brought into a single database with a flat, report-friendly schema.

This lets clients query this database for their own reporting purposes instead of directly accessing the source databases. It also improves overall query performance, as SQL joins are no longer needed. I used Pentaho Spoon to create the ETL process, Microsoft Azure SQL as the database, and Microsoft Power BI to build dashboards from the reporting database.
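The ETL itself was built graphically in Pentaho Spoon; as an illustration of the flattening it performs, here is the kind of denormalizing load the process might run against Azure SQL, shown via pyodbc with a hypothetical connection string and schema.

```python
import pyodbc

# Hypothetical connection details; the real ones are not public.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=reporting;"
    "UID=etl_user;PWD=..."
)

# Denormalize the source tables into one flat, report-friendly table
# so downstream report queries need no joins.
FLATTEN_SQL = """
INSERT INTO dbo.OrdersFlat (OrderId, OrderDate, CustomerName, ProductName, Amount)
SELECT o.OrderId, o.OrderDate, c.Name, p.Name, o.Amount
FROM src.Orders o
JOIN src.Customers c ON c.CustomerId = o.CustomerId
JOIN src.Products  p ON p.ProductId  = o.ProductId;
"""

with conn:  # commits on successful exit
    conn.execute(FLATTEN_SQL)
```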

Tools

Microsoft Power BI, AWS Glue, Amazon Athena, Apache Airflow, Spark SQL, AWS Step Functions, Tableau, Amazon Elastic MapReduce (EMR)

Languages

SQL, Python, C#.NET, Snowflake

Frameworks

Spark, Apache Spark

Paradigms

ETL, Business Intelligence (BI)

Platforms

Amazon Web Services (AWS), Pentaho, Azure, AWS Lambda

Libraries/APIs

PySpark

Storage

SQL Server Management Studio (SSMS), Microsoft SQL Server, Amazon S3 (AWS S3), Data Lakes

Other

Data Engineering, Amazon RDS, Data Visualization, BI Reporting, Dashboards

Education

2015 - 2017

Master's Degree in Computer Science

Lahore University of Management Sciences - Lahore, Pakistan

2010 - 2014

Bachelor's Degree in Computer Engineering

University of Engineering and Technology - Lahore, Pakistan

Certifications

JULY 2019 - JULY 2022

AWS Certified Cloud Practitioner

AWS
