Innocent Musanzikwa, Developer in Calgary, AB, Canada

Innocent Musanzikwa

Verified Expert in Engineering

Data Engineer and Developer

Location
Calgary, AB, Canada
Toptal Member Since
August 10, 2021

Inno is a seasoned data engineer and developer who's worked at IRI, a top retail data analytics company, in Africa and North America for the past decade and as a freelance consultant for the past couple of years. As a SQL and ETL developer, he has created quality data warehouses using industry-standard techniques like Kimball and Data Vault. As a data engineer, Inno has built robust, scalable data pipelines both on-premises and in the cloud using current technologies.

Portfolio

Darwill, Inc.
SQL, Tableau, Python, Data Engineering, Data Analytics, ETL, Data Warehousing...
SFL Scientific LLC
SQL, SQL Server Integration Services (SSIS), MariaDB, Microsoft SQL Server...
Airiam Holdings, LLC
Business Intelligence (BI), SQL, APIs, SQL Server DBA, Dimensional Modeling...

Availability

Part-time

Preferred Environment

SQL, PySpark, Python, Hadoop, Apache Hive, Azure Synapse, Oracle, SQL Server Integration Services (SSIS), Azure Data Factory, Data Warehousing

The most amazing...

...big data warehousing and data integration solution I've designed—using Python, SQL, ADF, Hadoop, Hive, and Spark—won an RFP in Canada out of six competitors.

Work Experience

Data Engineer

2022 - 2022
Darwill, Inc.
  • Built Tableau dashboards and visualizations using AWS Redshift and Aurora databases.
  • Created AWS Lambda functions running Python for custom ETL tasks and ad-hoc requests.
  • Managed AWS Redshift and Aurora databases and designed data warehouses and data migrations.
  • Redesigned the client's data warehouse using the AWS tech stack and improved their migration process by introducing federated queries and Lambda functions running Python pipelines, as well as overhauling their Tableau dashboards.
Technologies: SQL, Tableau, Python, Data Engineering, Data Analytics, ETL, Data Warehousing, Amazon Web Services (AWS), Relational Databases, Data Cleansing, Data Science, Databases, PostgreSQL, AWS Lambda, Database Development, Data Visualization, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, MySQL, Entity Relationships, Business Analytics, Database Design
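
The Lambda-based ETL tasks mentioned above could look roughly like the following minimal sketch. The event shape (a `records` list with `email` and `amount` fields) and the cleaning rules are hypothetical, chosen only for illustration; only the conventional `lambda_handler(event, context)` signature comes from AWS Lambda itself.

```python
import json


def clean_record(record):
    """Normalize one raw record; return None if it cannot be used."""
    email = str(record.get("email", "")).strip().lower()
    if "@" not in email:
        return None  # drop rows without a plausible email address
    try:
        amount = round(float(record.get("amount", 0)), 2)
    except (TypeError, ValueError):
        return None  # drop rows whose amount is not numeric
    return {"email": email, "amount": amount}


def lambda_handler(event, context):
    """Entry point with the signature AWS Lambda passes to Python handlers."""
    raw = event.get("records", [])
    cleaned = [row for row in map(clean_record, raw) if row is not None]
    # A real task would write `cleaned` to the warehouse here (assumption).
    return {
        "statusCode": 200,
        "body": json.dumps({"received": len(raw), "loaded": len(cleaned)}),
    }
```

Because the handler is a plain function, it can be called locally with a sample event and `None` as the context before it is deployed.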

Data Engineer

2022 - 2022
SFL Scientific LLC
  • Consulted on an existing, poorly designed SSIS data integration project and helped identify bottlenecks and inefficiencies.
  • Redesigned the existing data pipeline using SSIS to be efficient and scalable.
  • Performed SQL tuning and SQL code review for process efficiencies.
Technologies: SQL, SQL Server Integration Services (SSIS), MariaDB, Microsoft SQL Server, Data Transformation, Python, Database Schema Design, iPaaS, CI/CD Pipelines, Relational Databases, Stored Procedure, Data Analysis, T-SQL (Transact-SQL), SQL DML, Database Development, Data Analytics, Data Visualization, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, Entity Relationships, Tableau, Business Analytics, Database Design

BI and Data Warehouse Expert

2021 - 2022
Airiam Holdings, LLC
  • Designed and developed data pipelines to integrate data from Quickbooks API, Sage Intacct API, and spreadsheets into Azure SQL.
  • Designed and developed a data warehouse in Azure SQL.
  • Designed and created business reports and KPI dashboards using Power BI.
  • Developed complex SQL scripts to manage data transformations and speed up integration.
Technologies: Business Intelligence (BI), SQL, APIs, SQL Server DBA, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, Git, REST APIs, Synapse, DAX, Dashboard Design, Dashboards, Stored Procedure, Tableau, Data Analysis, T-SQL (Transact-SQL), SQL DML, Database Development, Data Analytics, Microsoft Power Automate, Data Visualization, Database Modeling, Entity Relationships, Business Analytics, Database Design

Data Analyst for Migration Project

2021 - 2021
JLL - JLLT Data
  • Developed the data pipeline to integrate data from Salesforce to Microsoft SQL.
  • Designed advanced SQL code (e.g., CTEs, stored procedures, and functions) to manage data transformations.
  • Performed SQL tuning to improve ETL efficiencies and process scalability.
  • Consulted on standard operating procedures and best case scenarios.
Technologies: SQL, T-SQL (Transact-SQL), ETL, Salesforce, Data Migration, Relational Databases, Microsoft Power BI, SQL Server Reporting Services (SSRS), Stored Procedure, Data Analysis, Google Sheets, SQL DML, Database Development, Data Analytics, Database Modeling, Entity Relationships, Tableau, Business Analytics, Database Design
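
As an illustration of the CTE-based transformations mentioned above, here is a minimal sketch that keeps only the latest version of each record. It runs against an in-memory SQLite database through Python so that it is self-contained; the project itself targeted Microsoft SQL Server, and the `staging_accounts` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging_accounts (account_id TEXT, name TEXT, modified_at TEXT);
INSERT INTO staging_accounts VALUES
    ('A1', 'Acme',      '2021-01-01'),
    ('A1', 'Acme Corp', '2021-03-01'),
    ('B2', 'Globex',    '2021-02-15');
""")

# The CTE finds the newest timestamp per account; the outer query joins
# back to the staging table to fetch the full row for that timestamp.
# ISO-8601 date strings sort correctly as text, so MAX() works here.
DEDUP_SQL = """
WITH latest AS (
    SELECT account_id, MAX(modified_at) AS modified_at
    FROM staging_accounts
    GROUP BY account_id
)
SELECT s.account_id, s.name, s.modified_at
FROM staging_accounts AS s
JOIN latest AS l
  ON l.account_id = s.account_id
 AND l.modified_at = s.modified_at
ORDER BY s.account_id;
"""

rows = conn.execute(DEDUP_SQL).fetchall()
```

The same `WITH ... AS` pattern carries over to T-SQL, where it is often wrapped in a stored procedure.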

Director | Data Engineering

2019 - 2021
IRI
  • Developed Azure Data Factory pipelines to integrate data from Apache Hive, HDFS, OAuth 2 APIs, and various flat-file types into Azure SQL.
  • Managed a team of onshore and offshore big data developers, assigning tasks and tracking the progress on Jira.
  • Oversaw data strategy and recommendations for new data sources and ongoing projects.
  • Mentored big data engineers to help them develop their skills.
  • Architected new data models and upgraded old data warehouses as per client request or technology change.
Technologies: Python, Apache Hive, Hadoop, Azure Synapse, Azure Data Factory, Bash Script, SQL, Azure SQL, Databricks, Data Engineering, ETL, Data Modeling, Databases, Azure, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Apache Airflow, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, Snowflake, Data Build Tool (dbt), Apache Kafka, ELT, SQL Server Integration Services (SSIS), Data Transformation, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, SQL DML, Database Development, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, Entity Relationships, Database Design

ETL Architect

2016 - 2019
IRI
  • Developed SQL-based data warehouses on-premise and on the cloud.
  • Integrated various data sources, from flat files to cloud-based sources like Snowflake, AWS, and data lakes, into Azure Data Warehouse and Apache Hive on Hadoop.
  • Created scalable data pipelines and improved efficiencies on the existing ones.
  • Trained and upskilled new data developers and participated in code reviews.
  • Maintained system documentation of all business data components and strategies.
Technologies: SQL Server Integration Services (SSIS), Azure Synapse, Azure Data Factory, Databricks, PySpark, SQL, Oracle, Apache Hive, Hadoop, Data Warehouse Design, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, BigQuery, JavaScript, T-SQL (Transact-SQL), Data Migration, Snowflake, Amazon Web Services (AWS), Amazon Elastic MapReduce (EMR), ELT, APIs, Data Transformation, MariaDB, SQL Server DBA, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, REST APIs, SQL DML, Database Development, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, Entity Relationships, Performance Tuning, Dynamic SQL

SQL Lead Developer

2012 - 2016
IRI
  • Developed SQL-based data warehouses and data marts.
  • Wrote SQL queries to provide data for SSRS reports.
  • Used SSIS, Talend, and DataStage for ETL processes depending on the client's requirements.
  • Created custom business reports using SQL Server Reporting Services (SSRS).
  • Managed junior developers and ran stand-up development meetings.
Technologies: SQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), PSQL, MySQL, Data Warehousing, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, ELT, Data Transformation, Dimensional Modeling, Relational Databases, Microsoft Power BI, REST APIs, SSAS, Dashboard Design, Dashboards, SQL DML, Database Development, SSRS Reports, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Database Modeling, SQL Server 2015, Entity Relationships, Business Analytics, Performance Tuning, Dynamic SQL

SQL/ETL Developer and Consultant

2010 - 2012
Mi9 Retail (formerly JustEnough Software Corporation)
  • Managed SQL replication between mobile devices and SQL Server.
  • Created SQL data warehouses using the Kimball methodology for reporting purposes.
  • Designed and developed ETL packages using SQL Server Integration Services (SSIS).
  • Designed and developed reports in SQL Server Reporting Services (SSRS).
  • Performed database tuning and code reviews for any code being deployed to production.
Technologies: SQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), Microsoft SQL Server, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, Data Transformation, Relational Databases, Microsoft Power BI, SSAS, SQL DML, Database Development, SSRS Reports, Database Modeling, SQL Server 2015, Entity Relationships
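
A Kimball-style warehouse, as mentioned above, organizes data into fact tables that reference dimension tables through surrogate keys. The sketch below shows that shape with SQLite via Python; the tables, columns, and sample rows are hypothetical and are not taken from the actual projects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension: one row per product, keyed by a surrogate key.
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY AUTOINCREMENT,
    product_code TEXT UNIQUE,      -- natural (business) key
    category TEXT
);
-- Fact: one row per sale, pointing at the dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product (product_key),
    sale_date TEXT,
    quantity INTEGER,
    amount REAL
);
""")


def product_key(code, category):
    """Look up the surrogate key for a product, inserting it if new."""
    conn.execute(
        "INSERT OR IGNORE INTO dim_product (product_code, category) VALUES (?, ?)",
        (code, category),
    )
    row = conn.execute(
        "SELECT product_key FROM dim_product WHERE product_code = ?", (code,)
    ).fetchone()
    return row[0]


sales = [("P-1", "Snacks", "2012-01-05", 2, 5.0),
         ("P-2", "Drinks", "2012-01-05", 1, 1.5),
         ("P-1", "Snacks", "2012-01-06", 3, 7.5)]
for code, category, day, qty, amount in sales:
    conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 (product_key(code, category), day, qty, amount))

# Typical report query: aggregate the fact table by a dimension attribute.
report = conn.execute("""
    SELECT d.category, SUM(f.quantity), SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d USING (product_key)
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
```

Keeping descriptive attributes in the dimension means a report can regroup sales by any of them without touching the fact rows.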

Data Migration from Azure SQL to Snowflake

https://github.com/innowarue/ADF
This project involved migrating data from an Azure SQL database to a Snowflake data warehouse using an Azure Data Factory data pipeline. It took me only minutes to build, thanks to my proficiency in Data Factory.

I replaced the original data sources with my own Azure and Snowflake accounts to make the project publicly available without compromising confidentiality.

Data Integration from OAuth2 API

I created an automated data pipeline to integrate data accessible via an OAuth2-based API in JSON format into a cloud-based data warehouse solution. The solution used Python and Spark on Databricks integrated into an Azure Data Factory pipeline.
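
A minimal sketch of the two steps described above: requesting an access token and flattening the JSON response into rows. The client-credentials grant, the URL, and the payload shape are assumptions for illustration; the original pipeline ran on Databricks with Spark, whereas this sketch uses only the Python standard library and builds the token request without sending it.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) an OAuth2 client-credentials token request."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def flatten(record, prefix=""):
    """Flatten nested JSON objects into a single-level dict of columns."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


# Hypothetical API payload, standing in for a real response body.
payload = json.loads(
    '{"items": [{"id": 1, "store": {"city": "Calgary", "region": "AB"}}]}'
)
rows = [flatten(item) for item in payload["items"]]
```

Flattened rows like these map directly onto warehouse table columns, which is why this step usually sits between the API call and the load.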

SQL Server Replication to Mobile Devices

I created a replication system that synced data between mobile devices and Microsoft SQL Server. Field sales representatives would collect information in the field, upload it to SQL Server using SQL CE, and download any updates from SQL Server via the mobile replication I set up.

In-place Data Integration for an Acquisition

I created an in-place ETL integration for a company acquisition and merger, bringing the two companies' data into a single warehouse while continuously delivering weekly reports to the client services and retail service teams.

Kafka Streaming and Data Integration

I created an automated data pipeline that consumed data from a Kafka stream with Spark Streaming and Python and loaded it into a Cloudera Hadoop file system, where it was accessible through a Hive data warehouse.
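
Spark Streaming processes a stream as a series of small batches. The sketch below imitates that micro-batch pattern in plain Python, with a generator standing in for the Kafka topic and a dict standing in for date-partitioned storage. All names and the message format are hypothetical, and no Kafka or Spark APIs are used.

```python
import json
from collections import defaultdict
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield lists of up to `batch_size` messages until the stream ends."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load_batch(batch, storage):
    """Parse each JSON message and append it to its date partition."""
    for message in batch:
        event = json.loads(message)
        storage[event["event_date"]].append(event)


# Stand-in for messages consumed from a topic (assumed JSON strings).
messages = (json.dumps({"event_date": f"2020-01-0{1 + i % 2}", "value": i})
            for i in range(5))

storage = defaultdict(list)
batch_sizes = []
for batch in micro_batches(messages, batch_size=2):
    batch_sizes.append(len(batch))
    load_batch(batch, storage)
```

Partitioning by event date mirrors how Hive tables are commonly laid out on HDFS, so queries for one day read only that day's files.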

Languages

SQL, Python, Bash Script, T-SQL (Transact-SQL), Snowflake, Stored Procedure, SQL DML, Scala, JavaScript, Bash

Frameworks

Hadoop, Spark, Windows PowerShell, ADF

Libraries/APIs

PySpark, REST APIs, Spark Streaming

Tools

Microsoft Power BI, Tableau, BigQuery, Synapse, SSAS, Apache Airflow, Amazon Elastic MapReduce (EMR), Git, Google Sheets

Paradigms

ETL, Business Intelligence (BI), Dimensional Modeling, Database Development, Database Design, Data Science

Platforms

Amazon Web Services (AWS), AWS Lambda, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW), Azure, Microsoft Power Automate, Azure Synapse, Oracle, Databricks, Apache Kafka, Salesforce, Zeppelin

Storage

Apache Hive, MySQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), PSQL, Microsoft SQL Server, SQL Stored Procedures, PostgreSQL, Databases, Data Pipelines, Data Integration, Relational Databases, Database Architecture, RDBMS, Database Modeling, Dynamic SQL, NoSQL, SQL Server DBA, Database Replication, Azure SQL, MariaDB

Other

Azure Data Factory, Data Warehousing, Data Analysis, Data Engineering, Data, Data Architecture, Big Data, Data Migration, ELT, Data Warehouse Design, Data Transformation, Database Schema Design, ETL Tools, Scripting Languages, Data Analytics, Data Visualization, SSRS Reports, SQL Server 2015, Entity Relationships, Business Analytics, Performance Tuning, Data Modeling, Cloud, APIs, Dashboard Design, Dashboards, Web Scraping, Data Build Tool (dbt), iPaaS, CI/CD Pipelines, DAX, Data Cleansing, Azure Databricks

Education

2013 - 2015

Bachelor's Degree in Information Technology

University of South Africa - Pretoria, South Africa

Certifications

AUGUST 2023 - AUGUST 2025

Databricks Certified Data Engineer Associate

Databricks

AUGUST 2023 - AUGUST 2025

SnowPro Core

Snowflake

DECEMBER 2020 - DECEMBER 2022

Certified Apache Spark and Hadoop Developer

Cloudera

DECEMBER 2019 - PRESENT

Analyzing Big Data with Hive

LinkedIn Learning

DECEMBER 2019 - PRESENT

Advanced NoSQL for Data Science

LinkedIn Learning
