Zaynul Abadin, Developer in Dhaka, Dhaka Division, Bangladesh

Zaynul Abadin

Verified Expert in Engineering

Data/BI Analyst and Developer

Location
Dhaka, Dhaka Division, Bangladesh
Toptal Member Since
June 18, 2020

Zaynul is a developer with 12+ years of experience, specializing in databases, data warehouses, data integration and conversion, data pipelines, big data architecture, and data visualization using various BI tools, analytics platforms, and cloud architectures. He has delivered many end-to-end (E2E) data projects, covering proofs of concept (PoC), architecture design, analytics, and team management. Where Zaynul really shines is in building data platforms, analytics consulting, data modeling, and creating BI solutions.

Portfolio

Insightin
Databricks, PySpark, Dedicated SQL Pool (formerly SQL DW)...
Fusebox
Python, Google Cloud Platform (GCP), Data Engineering, ETL, Apache Airflow...
Australia Based Law Firm (Toptal Engagement)
Data Warehousing, Data Warehouse Design, Data Engineering, ETL, Data Pipelines...

Experience

Availability

Part-time

Preferred Environment

PyCharm, Git, Microsoft SQL Server, Ubuntu, Amazon Redshift Spectrum, Azure SQL, Amazon Web Services (AWS), Visual Studio Code (VS Code)

The most amazing...

...project I've worked on involved implementing phishing data ingestion, transformation, data warehousing, and an exemplary data visualization architecture.

Work Experience

Principal Data Engineer

2020 - PRESENT
Insightin
  • Created a consistent, reliable, and highly available data platform on which analysts can easily derive insights that power business decision-making, facilitate optimization, and identify new customer and product opportunities.
  • Built and led a team of best-in-class data engineers and data scientists in projects involving Azure Cloud storage for data warehouses, a custom portal for web analytics functions, and Power BI for business intelligence reporting.
  • Provided technical leadership and hands-on implementation in data techniques, including data access, integration, modeling, visualization, mining, design, and implementation.
  • Implemented ETL and ELT using Python and PySpark, with Docker-based Apache Airflow managing the tasks (a sketch of this pattern follows this role's technology list).
  • Managed and processed multi-terabyte volumes of insurance data received via secure file transfer protocol (SFTP).
  • Managed data from multiple sources, including APIs and RDBMSs, and handled data files in various formats, such as Avro, ORC, Parquet, TXT, JSON, PGP, and EDI.
Technologies: Databricks, PySpark, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW), Microsoft Power BI, Azure Data Factory, Azure Data Lake, Azure, Apache Airflow, Docker Compose, Python, Data Engineering, Data Pipelines, Data Modeling, Data Warehouse Design, Data Warehousing
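
Below is a minimal sketch of that orchestration pattern, assuming Airflow 2.x running under Docker; the DAG ID, task names, and file sources are hypothetical placeholders rather than the production pipeline.

    # Hypothetical sketch: a daily Airflow DAG that runs an extract step and a
    # PySpark transform step in sequence.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_claims():
        # Pull the day's SFTP file drops (Avro/ORC/EDI, etc.) into a landing zone.
        print("extracting claim files")

    def transform_claims():
        # Submit the PySpark job that cleanses the files and loads the warehouse.
        print("running the PySpark transform")

    with DAG(
        dag_id="insurance_claims_etl",  # placeholder name
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract", python_callable=extract_claims)
        transform = PythonOperator(task_id="transform", python_callable=transform_claims)
        extract >> transform  # load only after extraction succeeds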

Data Engineer

2022 - 2022
Fusebox
  • Designed and built the architecture of the Fusebox Games data platform.
  • Developed and designed a Delta Lake using Google Cloud Storage.
  • Created a data pipeline using PySpark to integrate various API data sources, like AppsFlyer and Appfigures (a sketch of this pattern follows this role's technology list).
  • Implemented an Airflow data orchestration environment using Docker and Google Kubernetes Engine (GKE).
  • Built a data warehouse from which the data science team developed several recommendation reports for the financial and other business departments.
  • Created a stored procedure in BigQuery with scheduling based on the requirements.
Technologies: Python, Google Cloud Platform (GCP), Data Engineering, ETL, Apache Airflow, Data Lakes, Data Architecture, Firebase, SQL, PySpark, Google Kubernetes Engine (GKE), Delta Lake, BigQuery, Google Cloud Storage
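
Below is a minimal sketch of that Delta Lake ingestion, assuming the Delta Lake package and GCS connector are available to Spark; the bucket paths, dedup key, and partition column are hypothetical.

    # Hypothetical sketch: append API-sourced JSON to a Delta table on GCS.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("appsflyer_ingest")
        # Needs the Delta package on the classpath (e.g., io.delta:delta-core).
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config(
            "spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog",
        )
        .getOrCreate()
    )

    # Raw API pulls land as JSON files; deduplicate and append them to a
    # date-partitioned Delta table for the downstream warehouse jobs.
    raw = spark.read.json("gs://games-landing/appsflyer/*.json")
    (
        raw.dropDuplicates(["event_id"])
        .write.format("delta")
        .mode("append")
        .partitionBy("event_date")
        .save("gs://games-delta-lake/events")
    )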

Data Warehouse Consultant

2022 - 2022
Australia Based Law Firm (Toptal Engagement)
  • Conceptualized, designed, developed, and deployed data architecture and data models.
  • Worked together with various business units (BI, product, reporting) to develop the data warehouse platform vision, strategy, and roadmap.
  • Designed and implemented detailed data warehouse models and data mappings.
  • Developed various ETL processes and deployed them to the stage and production environments.
  • Implemented performance optimizations and tuning on data warehouse implementations.
  • Integrated data from Oracle and the ELMO API (a cloud-based HR and payroll system) into the data warehouse (a sketch of the API load follows this role's technology list).
Technologies: Data Warehousing, Data Warehouse Design, Data Engineering, ETL, Data Pipelines, SQL Server Integration Services (SSIS), SQL, SQL Server 2016, Oracle, APIs
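
Below is a minimal sketch of that API-to-warehouse load using Python's requests and pyodbc; the endpoint, payload fields, and staging table are hypothetical placeholders (the engagement itself ran its loads through SSIS).

    # Hypothetical sketch: land HR records from a REST API in a staging table.
    import pyodbc
    import requests

    resp = requests.get(
        "https://api.example-hr-vendor.com/v1/employees",  # placeholder endpoint
        headers={"Authorization": "Bearer <token>"},
        timeout=30,
    )
    resp.raise_for_status()

    conn = pyodbc.connect("DSN=warehouse")  # staging database connection
    cursor = conn.cursor()
    for record in resp.json():
        cursor.execute(
            "INSERT INTO stg.hr_employees (employee_id, full_name, hired_on) "
            "VALUES (?, ?, ?)",
            record["id"], record["name"], record["hireDate"],
        )
    conn.commit()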

Data Engineer

2019 - 2020
Pipeline Pte, Ltd.
  • Created the architecture of Pipeline's analytics database using Redshift Spectrum and built the data warehouse on top of it.
  • Implemented a data processing ETL to ingest data into Amazon S3 (AWS S3) using PySpark (sketched after this role's technology list).
  • Developed a URL analysis algorithm that helps surface crucial insights.
  • Built dashboards in Redash and QuickSight for different levels of users.
Technologies: Amazon QuickSight, Redash, PySpark, Amazon Athena, Amazon S3 (AWS S3), Redshift, Python, ETL, Data Engineering, Data Pipelines
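
Below is a minimal sketch of that S3 ingestion step, assuming PySpark with the S3 connector; bucket paths and column names are hypothetical. Partitioning the Parquet output by date lets Redshift Spectrum external tables prune partitions at query time.

    # Hypothetical sketch: normalize raw feed JSON and write date-partitioned
    # Parquet to S3 for Redshift Spectrum to query.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("feed_to_s3").getOrCreate()

    raw = spark.read.json("s3a://pipeline-landing/feeds/*.json")
    clean = (
        raw.withColumn("domain", F.regexp_extract("url", r"https?://([^/]+)", 1))
        .withColumn("ingest_date", F.to_date("detected_at"))
    )
    (
        clean.write.mode("append")
        .partitionBy("ingest_date")
        .parquet("s3a://pipeline-warehouse/events/")
    )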

Lead BI and Data Analyst

2015 - 2020
Augmedix
  • Spearheaded the creation of Augmedix's data ecosystem, including the proof of concept (PoC), architecture design, analytics, and data team management, and delivered the project on time and to expectations.
  • Designed and developed the company's analytics database for data warehousing on an AWS-hosted SQL Server 2016 instance.
  • Built a data pipeline using Python and Pentaho Data Integration (PDI) packages, hosted on AWS, that pulled from different data sources (MySQL, AWS S3, Google Spreadsheets, CSV files, and product logs) and transformed and ingested the data into the warehouse (a sketch of this pattern follows this role's technology list).
  • Developed a complex algorithm that automated billing data generation for the accounts team, significantly advancing the company's cost-saving strategy.
  • Constructed a sophisticated product feature analysis system that utilized product logs, streamlining the process of uploading documents to the EHR system.
  • Integrated data from different APIs (Salesforce, SurveyMonkey, Grafana, Freshdesk, Humanity) and other sources into the warehouse using PDI and Python.
  • Developed analytical reports and dashboards for end users using Sisense and Data Studio.
Technologies: Google Data Studio, Sisense, Pentaho Report Designer, Pentaho Data Integration (Kettle), Pandas, PySpark, Amazon S3 (AWS S3), Amazon Web Services (AWS), MySQL, SQL Server 2016, Python, ETL, Data Engineering, Data Warehouse Design, Data Warehousing, Data Pipelines
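
Below is a minimal sketch of that mixed-source pattern using pandas and SQLAlchemy; the connection strings, tables, and file paths are hypothetical placeholders for the PDI and Python jobs described above.

    # Hypothetical sketch: pull from MySQL and a CSV drop, stage in SQL Server.
    import pandas as pd
    from sqlalchemy import create_engine

    mysql = create_engine("mysql+pymysql://user:pass@prod-db/app")  # placeholder
    mssql = create_engine(
        "mssql+pyodbc://user:pass@warehouse/dw?driver=ODBC+Driver+17+for+SQL+Server"
    )

    # Incremental pull from the product database plus a daily CSV drop.
    visits = pd.read_sql("SELECT * FROM visits WHERE updated_at >= CURDATE()", mysql)
    billing = pd.read_csv("daily_drops/billing.csv")

    # Land both in staging tables that the warehouse load jobs consume.
    visits.to_sql("stg_visits", mssql, if_exists="append", index=False)
    billing.to_sql("stg_billing", mssql, if_exists="append", index=False)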

Senior BI and Data Analyst

2013 - 2015
Metatude Asia, Ltd.
  • Analyzed customer requirements and modeled the data using a star schema aligned with business needs.
  • Constructed data warehousing systems in HP Vertica for business intelligence reporting and analytics.
  • Developed a data pipeline (ETL) using PDI to ingest data into the warehouse.
  • Designed a report template for an ad-hoc report development using Pentaho Report Designer.
  • Built an automated data validation process for testing ETL processes and reports.
Technologies: Schema Workbench, Pentaho Reports, Pentaho, SAP BusinessObjects (BO), Vertica, Microsoft SQL Server

Software Engineer

2012 - 2013
DIRD Group
  • Performance-tuned existing database queries (SQL Server 2008).
  • Developed various reports using Crystal Reports.
  • Wrote C# code that introduced new features and enhanced the existing ERP system.
  • Examined legacy code and debugged various legacy features.
Technologies: C#.NET, Microsoft SQL Server

Software Engineer

2008 - 2011
United Group International
  • Designed a back-end database and coded stored procedures and functions for an insurance management system.
  • Developed a management information reporting system.
  • Coded in C# to implement various data capturing methods for an insurance management system.
Technologies: C#.NET, Microsoft SQL Server

Projects

Data Ecosystem for Augmedix

Augmedix is the nation's leading provider of remote medical documentation and live clinical support services. Virtual medical scribes work directly with doctors, so huge amounts of data are generated by the company's various tools.

Augmedix has various departments, and they all needed data: to measure whether they were achieving their business goals, to control the flow of business, and to help the product team analyze day-to-day processes and feature usage. We needed to take this massive influx of data and glean meaningful takeaways from it so that the different departments could easily use that information to reach company milestones.

I was the one who got to build this data ecosystem for Augmedix.

Metatude BI Solution

Metatude introduced the ITsat product to its customers, where the "sat" stands for satisfaction; it is a tool designed to improve the quality of IT services and manage costs. ITsat measures end users' overall experience with IT services in terms of both effectiveness (doing the right things) and efficiency (doing things the right way). Twenty-five large companies used the tool, generating terabytes of data each month. Metatude provided ad-hoc reports and dashboards to these customers using a data warehouse and BI tools like SAP BusinessObjects.

I contributed to building that system.

Phishing Data Ecosystem

We needed to collect data from six different phishing feeds and push the data to a data lake. Business users needed to view the data in QuickSight and Redash, so we developed a data warehouse using Amazon Redshift Spectrum. Ingesting data into the warehouse was a huge task: we had to generate missing values and cleanse the data (a sketch of that cleansing step follows).
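
Below is a minimal sketch of that cleansing step, assuming PySpark reading the feed landing zone on S3; the paths and column names are hypothetical.

    # Hypothetical sketch: fill missing values and cleanse phishing feed
    # records before they reach the Redshift Spectrum warehouse.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("phish_feed_cleanse").getOrCreate()

    feeds = spark.read.json("s3a://landing/phish-feeds/*/*.json")
    cleansed = (
        feeds
        # Records with no URL are unusable downstream; drop them.
        .filter(F.col("url").isNotNull())
        # Backfill a missing detection time from the file's ingest timestamp.
        .withColumn("detected_at", F.coalesce(F.col("detected_at"), F.col("ingested_at")))
        # Default the optional confidence score so warehouse queries stay simple.
        .fillna({"confidence": 0.0})
    )
    cleansed.write.mode("append").parquet("s3a://warehouse/phish_events/")
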
Education

2012 - 2014

Master's Degree in Computer Science

American International University-Bangladesh (AIUB) - Dhaka, Bangladesh

2004 - 2008

Bachelor's Degree in Computer Science & Engineering

Rajshahi University of Engineering and Technology - Rajshahi, Bangladesh

Certifications

MARCH 2022 - MARCH 2023

Azure Database Administrator Associate

Microsoft

MARCH 2020 - PRESENT

Distributed Computing with Spark SQL

Coursera

Languages

SQL, Python, C#, C#.NET, R

Tools

Amazon Redshift Spectrum, Amazon Athena, PyCharm, DataGrip, Pentaho Data Integration (Kettle), Sisense, Redash, Microsoft Power BI, Schema Workbench, Apache Airflow, Git, Amazon QuickSight, Docker Compose, Google Kubernetes Engine (GKE), BigQuery

Paradigms

Business Intelligence (BI), Database Design, ETL

Platforms

Pentaho, Azure SQL Data Warehouse, Google Cloud Platform (GCP), Dedicated SQL Pool (formerly SQL DW), Ubuntu, Azure, Databricks, Amazon Web Services (AWS), Visual Studio Code (VS Code), Oracle, Firebase

Storage

SQL Server 2012, PostgreSQL, Amazon S3 (AWS S3), MySQL, Vertica, Azure SQL, SQL Server Integration Services (SSIS), Data Lake Design, Microsoft SQL Server, SQL Server 2016, Redshift, Azure SQL Databases, Database Performance, Data Pipelines, Data Lakes, Google Cloud Storage

Other

ETL Pipelines, Data, Pentaho Reports, Azure Data Lake, Data Warehouse Design, Azure Data Factory, SAP BusinessObjects (BO), Pentaho Report Designer, Google Data Studio, Data Build Tool (dbt), Computer Science, Data Engineering, Data Modeling, Data Warehousing, APIs, Data Architecture, Delta Lake, Electronic Health Records (EHR), Healthcare IT

Libraries/APIs

PySpark, Pandas
