
Dipti Pasupalak

Verified Expert in Engineering

Data Engineer and Developer

Bengaluru, Karnataka, India

Toptal member since September 25, 2020

Bio

Dipti is a developer/architect with 18 years of experience in data engineering, data warehousing, analytics, and product leadership. He designs and delivers large-scale, cloud-native data solutions on AWS, Azure, Databricks, and Snowflake, enabling real-time and batch processing for analytics and reporting. Dipti is experienced in migrating data platforms from on-premises to the cloud and transitioning between cloud providers.

Portfolio

Freelance Clients
Amazon Web Services (AWS), AWS Glue, Amazon Simple Notification Service (SNS)...
Milliman, Inc. - Analytics
SQL, Databricks, ETL, Python, Spark, PySpark, Apache Airflow, Web Applications...
Freight Tiger
Databricks, Azure Data Factory (ADF), PySpark, Snowflake, Python...

Experience

  • SQL - 18 years
  • Database Design - 18 years
  • ETL - 18 years
  • Spark - 6 years
  • Python - 6 years
  • Databricks - 4 years
  • Azure Data Lake - 4 years
  • Azure Data Factory (ADF) - 1 year

Availability

Full-time

Preferred Environment

Spark, SQL, Databases, Python, Azure Data Factory (ADF), Databricks, Snowflake, Amazon Web Services (AWS), AWS Glue, Data Modeling

The most amazing...

...contributions I've made were to payment apps like Nokia Money, Airtel Money, and GrabPay, and to end-to-end data architectures for many startups.

Work Experience

Data Architect

2024 - 2024
Freelance Clients
  • Orchestrated the delivery of a data platform and analytics solutions in the AWS ecosystem for a renewable energy client, from inception to completion.
  • Built an event-driven, real-time data platform on AWS, providing comprehensive data solutions for the renewable energy sector (see the event-consumer sketch after this list).
  • Prepared the dashboard requirements document, along with all the reports and their respective KPIs.
Technologies: Amazon Web Services (AWS), AWS Glue, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Redshift, Apache Iceberg, Amazon Athena, Amazon S3 (AWS S3), Microsoft Power BI, Data Lakes, Data Modeling, Data Analytics, PySpark, Big Data, Apache Spark, APIs, Data Pipelines
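
As a hedged illustration of the event-driven pattern above: a minimal Python sketch of a consumer that long-polls an SQS queue for S3 event notifications fanned out through SNS and routes each new object to a processing step. The queue URL and the process_object hook are hypothetical placeholders, not details from the engagement.

    import json

    import boto3  # AWS SDK for Python

    # Hypothetical queue subscribed to an SNS topic carrying S3 events.
    QUEUE_URL = "https://sqs.ap-south-1.amazonaws.com/123456789012/raw-events"

    sqs = boto3.client("sqs")

    def process_object(bucket: str, key: str) -> None:
        """Placeholder for the actual Glue/Spark processing step."""
        print(f"would process s3://{bucket}/{key}")

    def poll_once() -> None:
        # Long-poll for up to 10 messages at a time.
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
        )
        for msg in resp.get("Messages", []):
            body = json.loads(msg["Body"])
            # SNS-wrapped S3 notifications nest the event under "Message".
            event = json.loads(body["Message"]) if "Message" in body else body
            for record in event.get("Records", []):
                s3 = record["s3"]
                process_object(s3["bucket"]["name"], s3["object"]["key"])
            # Delete only after successful processing (at-least-once delivery).
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])

    if __name__ == "__main__":
        poll_once()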

Data Engineer

2022 - 2023
Milliman, Inc. - Analytics
  • Engaged in discussions, design, and review processes for the data architecture aimed at migrating a SQL Server Data Warehouse to Databricks Delta Lake.
  • Developed a PySpark framework that streamlines writing data from various data sources to a Delta Lake (a minimal sketch follows this list).
  • Developed PySpark notebooks that convert the SQL Server business logic into PySpark code, transforming and migrating the data and writing it into the respective bronze, silver, and gold layers of the data architecture.
Technologies: SQL, Databricks, ETL, Python, Spark, PySpark, Apache Airflow, Web Applications, Data Engineering, Azure, Databases, Database Design, Delta Live Tables (DLT), Data Cleansing, Apache, Reports, Data Lakes, Data Modeling, Data Analytics, Microsoft Data Transformation Services (now SSIS), Azure Databricks, Big Data, Apache Spark, Data Pipelines
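
A minimal sketch of what such a reusable Delta writer can look like, assuming Databricks with mounted lake storage; the layer paths, table name, and JDBC connection string are hypothetical placeholders, not the actual framework.

    from pyspark.sql import DataFrame, SparkSession

    spark = SparkSession.builder.appName("delta-writer").getOrCreate()

    # Hypothetical layer roots; real paths would point at ADLS or mounts.
    LAYER_PATHS = {
        "bronze": "/mnt/lake/bronze",
        "silver": "/mnt/lake/silver",
        "gold": "/mnt/lake/gold",
    }

    def write_to_layer(df: DataFrame, layer: str, table: str, mode: str = "append") -> None:
        """Write a DataFrame to the given medallion layer as a Delta table."""
        df.write.format("delta").mode(mode).save(f"{LAYER_PATHS[layer]}/{table}")

    # Example: land a raw SQL Server extract into the bronze layer.
    orders = (
        spark.read.format("jdbc")
        .option("url", "jdbc:sqlserver://host:1433;databaseName=sales")  # placeholder
        .option("dbtable", "dbo.orders")
        .load()
    )
    write_to_layer(orders, "bronze", "orders")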

Data Architect Consultant

2022 - 2023
Freight Tiger
  • Implemented the ELT process from the OLTP database to the Snowflake data warehouse using Python, Snowpipe, and Amazon Simple Notification Service (SNS), eliminating the third-party tool Hevo and reducing the overall cost of the data warehouse (see the Snowpipe sketch after this list).
  • Contributed to the solution that streamlined the data pipeline, enabling efficient data extraction, loading, and transformation into Snowflake, ultimately optimizing data processing and storage.
  • Successfully remodeled the existing data analytics architecture, transitioning from Snowflake, DBT, and Zoho to a new setup involving Databricks, Azure Data Factory (ADF), and Power BI.
  • Migrated data processing and transformation tasks from Snowflake to Databricks, utilizing DBT for efficient data modeling and integrating ADF for orchestrating the data pipeline.
  • Employed Power BI for enhanced data visualization and reporting capabilities, providing a comprehensive end-to-end solution for data analytics.
  • Successfully migrated pipelines from Snowflake to Databricks, transferring the data processing and transformation workflows while maintaining their integrity and efficiency, so that pipeline operations continued uninterrupted in the Databricks environment.
Technologies: Databricks, Azure Data Factory (ADF), PySpark, Snowflake, Python, Apache Airflow, Amazon Web Services (AWS), Databases, Database Design, ETL Tools, Data Cleansing, dbt Cloud, Amazon S3 (AWS S3), Data Lakes, Data Modeling, Data Analytics, Azure Databricks, Big Data, Microsoft Power BI, Apache Spark, APIs, Data Pipelines
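
A hedged sketch of the Snowpipe side of such a setup, issued through the snowflake-connector-python package: a pipe with AUTO_INGEST enabled so that S3 event notifications (via SNS/SQS) trigger loads without a third-party tool. The account, stage, and table names are hypothetical, and the actual pipeline details may have differed.

    import snowflake.connector  # pip install snowflake-connector-python

    # Connection parameters are placeholders.
    conn = snowflake.connector.connect(
        account="my_account", user="etl_user", password="***",
        warehouse="LOAD_WH", database="ANALYTICS", schema="RAW",
    )

    # AUTO_INGEST = TRUE assumes the stage's bucket publishes event
    # notifications to the SQS queue Snowflake exposes for this pipe.
    DDL = """
    CREATE PIPE IF NOT EXISTS raw.orders_pipe
      AUTO_INGEST = TRUE
    AS
      COPY INTO raw.orders
      FROM @raw.orders_stage
      FILE_FORMAT = (TYPE = 'JSON')
    """

    with conn.cursor() as cur:
        cur.execute(DDL)
    conn.close()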

Lead Data Engineer

2018 - 2023
Grab
  • Implemented an end-to-end Azure data warehouse and Delta Lake for the analytics team, which now uses data from numerous active users to guide business decisions.
  • Created the Azure Data Factory data pipelines while consistently maintaining a 99.99% SLA, and implemented monitoring and alerts for the end-to-end data pipeline.
  • Developed the extraction, transformation, and load logic for large data volumes using Python, Spark, Azure Data Lake Storage, and Azure SQL data warehouse in Databricks.
  • Developed stored procedures in the Azure SQL data warehouse, maintained their performance, and monitored the data warehouse database.
  • Built the data warehouse database design and the OLTP platform's database design.
  • Migrated 60 million customers from one system to a production environment with negligible downtime.
  • Implemented a data quality framework using statistical methods (a minimal sketch follows this list).
  • Defined the operational excellence, deployment, development, and data modeling guidelines for the entire team and mentored the team members.
  • Developed data quality monitoring reports and dashboards in Power BI and Tableau.
Technologies: Tableau, Azure SQL, Azure Data Factory (ADF), Azure Data Lake, SQL, Databases, Databricks, Spark, Python, Azure, Master Data Management (MDM), Data Analysis, Big Data, Azure Synapse, Apache Kafka, Data Cleansing, Amazon S3 (AWS S3), Data Lakes, Data Modeling, Data Analytics, PySpark, Snowflake, Azure Databricks, Microsoft Power BI, Apache Spark, APIs, Azure SQL Data Warehouse, Data Pipelines, API Integration
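
As a hedged sketch of what statistical data quality checks can look like in PySpark, here are two simple rules: a null-rate threshold on a key column and a 3-sigma outlier test on daily row counts. The table path, column name, and historical counts are hypothetical, not the framework actually built.

    import statistics

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dq-checks").getOrCreate()

    # Hypothetical Delta table holding the day's load.
    df = spark.read.format("delta").load("/mnt/lake/silver/transactions")

    def null_rate_ok(df, column: str, max_rate: float = 0.01) -> bool:
        """Pass if the null rate of a column stays under the threshold."""
        total = df.count()
        nulls = df.filter(F.col(column).isNull()).count()
        return (nulls / total) <= max_rate if total else False

    def volume_ok(today: int, history: list, limit: float = 3.0) -> bool:
        """Pass unless today's row count is a >3-sigma outlier vs. history."""
        mean, stdev = statistics.mean(history), statistics.stdev(history)
        return abs(today - mean) / stdev <= limit if stdev else True

    assert null_rate_ok(df, "transaction_id")
    assert volume_ok(df.count(), [98_000, 101_500, 99_700, 100_200])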

Data Engineer

2020 - 2021
Commonwealth Financial Network
  • Discussed and reviewed the Azure data architecture, focusing on Databricks and Azure Data Factory (ADF). Created an Architecture Design Document (ADD) that outlines the Azure implementation, encompassing the integration of Databricks and ADF.
  • Designed and implemented an SSIS (SQL Server Integration Services) ETL (extract, transform, load) pipeline to address a specific use case.
  • Created a Python utility for data lineage that visualizes the tables used by stored procedures and their associations with specific applications, based on information extracted from log files (see the sketch after this list).
Technologies: SQL, Python, ETL, Data Engineering, Data Profiling, Data Aggregation, Data Migration, SQL Server Integration Services (SSIS), Business Intelligence (BI), Master Data, Data Warehousing, Data Warehouse Design, Azure Data Factory (ADF), Azure Databricks, Databases, Database Design, Data Lakes, Data Modeling, Big Data, Apache Spark, Data Pipelines
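
A minimal sketch of such a lineage utility, assuming a simplified log format; the regex, log layout, and sample line are hypothetical stand-ins for the real parsing logic.

    import re
    from collections import defaultdict

    # Hypothetical log format: "<app> EXEC <proc> TABLES=<t1,t2,...>"
    LINE_RE = re.compile(r"^(?P<app>\S+)\s+EXEC\s+(?P<proc>\S+)\s+TABLES=(?P<tables>\S+)")

    def build_lineage(log_lines):
        """Map each stored procedure to the tables it touches and its callers."""
        proc_tables, proc_apps = defaultdict(set), defaultdict(set)
        for line in log_lines:
            m = LINE_RE.match(line)
            if not m:
                continue
            proc_tables[m["proc"]].update(m["tables"].split(","))
            proc_apps[m["proc"]].add(m["app"])
        return proc_tables, proc_apps

    def to_dot(proc_tables, proc_apps) -> str:
        """Render the lineage as Graphviz DOT for visual inspection."""
        edges = []
        for proc, tables in proc_tables.items():
            edges += [f'  "{app}" -> "{proc}";' for app in proc_apps[proc]]
            edges += [f'  "{proc}" -> "{tbl}";' for tbl in tables]
        return "digraph lineage {\n" + "\n".join(edges) + "\n}"

    sample = ["billing EXEC usp_load_claims TABLES=claims,members"]
    print(to_dot(*build_lineage(sample)))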

Technical Lead

2008 - 2018
Obopay
  • Managed a team that delivered database, data warehouse, and business intelligence solutions for various clients.
  • Contributed hands-on to database development, data models, and database administration with relational databases like Oracle, SQL Server, PostgreSQL, and MySQL.
  • Became well-versed in relational data modeling for OLTP environments and dimensional modeling, such as star schemas, for data warehouse environments.
  • Developed various use cases in the form of PL/SQL objects, such as stored procedures, packages, functions, database triggers, etc.
  • Developed the data migration and ETL utility for migrating 3.8 million customers to the Airtel Money platform using PL/SQL.
  • Managed data warehousing and performance environment database administration responsibilities. Assisted the production DBA team concerning performance-tuning activities.
  • Implemented data replication from a production database to the data warehouse database using a materialized view (sketched below).
Technologies: PostgreSQL, ETL, Database Administration (DBA), Database Development, PL/SQL, Oracle, Databases, Data Modeling, Data Analytics, Big Data, Data Pipelines
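
A hedged sketch of materialized-view replication in Oracle, issued here through the python-oracledb driver; the connection details, database link, and refresh schedule are hypothetical, and fast refresh assumes a materialized view log exists on the source table.

    import oracledb  # python-oracledb, the successor to cx_Oracle

    # Placeholder connection details for the data warehouse database.
    conn = oracledb.connect(user="dw_user", password="***", dsn="dwhost/dwpdb")

    # Refreshing hourly over a database link to production keeps the
    # warehouse copy current without adding load to the OLTP system.
    DDL = """
    CREATE MATERIALIZED VIEW dw.customers_mv
      BUILD IMMEDIATE
      REFRESH FAST
      START WITH SYSDATE NEXT SYSDATE + 1/24
    AS
      SELECT customer_id, name, status, updated_at
      FROM customers@prod_link
    """

    with conn.cursor() as cur:
        cur.execute(DDL)
    conn.close()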

Senior Software Engineer

2007 - 2008
Emids
  • Led a team that handled database and BI report development for various projects in the healthcare and HR domains.
  • Worked as a database developer, report developer, ETL developer, and web application developer for various healthcare and HR-related projects.
  • Mentored the team, reviewed code, tuned performance, and provided technical assistance on various database and reporting issues.
Technologies: ETL, Database Administration (DBA), Java, PL/SQL, Oracle, SQL Server Reporting Services (SSRS), Database Modeling, Database Development, SQL Server Integration Services (SSIS), SQL Server 2010, Databases, Database Design, Data Modeling, Data Analytics, Microsoft Data Transformation Services (now SSIS)

Project Highlights

Analytics Data Warehouse Platform Development

GrabPay is a mobile super app for daily use cases such as bill payment, food ordering, ride-hailing services, and more.

I built a robust analytics data warehouse platform from scratch, supporting 500 concurrent analytics users for strategic decision-making.

Data Platform Development for Insurance Analytics

I developed a data platform from the ground up, integrating data from various companies to enable analytical decision-making for the insurance sector, and migrated the existing SQL Server SSIS project to Azure and the Databricks Lakehouse.

Real-time Data Platform for Renewable Energy

Built an event-driven, real-time data platform on AWS, providing comprehensive data solutions for the renewable energy sector. I created 100 dashboards and prepared the business and report requirement documents for 100 reports.

Education

2002 - 2005

Master's Degree in Computer Applications (Computer Science)

Biju Patnaik Institute of Technology - Bhubaneswar, India

1999 - 2002

Bachelor's Degree in Computer Applications (Computer Science)

Berhampur University - Berhampur, India

Certifications

JANUARY 2024 - PRESENT

Databricks Certified Data Engineer Associate

Databricks

MAY 2018 - PRESENT

Executive Certificate in Business Analytics and Big Data

Indian Institute of Management Kashipur

Skills

Libraries/APIs

PySpark

Tools

Apache, dbt Cloud, AWS Glue, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Apache Iceberg, Amazon Athena, Apache Airflow, Tableau, Microsoft Power BI

Languages

Python, SQL, Snowflake, Java

Frameworks

Apache Spark, Spark, Presto, Delta Live Tables (DLT), ADF

Paradigms

Database Design, ETL, Database Development, Business Intelligence (BI)

Platforms

Databricks, Oracle, Amazon Web Services (AWS), Azure Synapse, Azure SQL Data Warehouse, Azure, Apache Kafka

Storage

Databases, PL/SQL, Database Modeling, MySQL, Amazon S3 (AWS S3), Master Data Management (MDM), Data Pipelines, Azure SQL Databases, Azure SQL, Database Administration (DBA), PostgreSQL, SQL Server 2010, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), Apache Hive, Data Lakes, Redshift, SQL Server 7

Other

Data Engineering, Data Warehousing, Data Warehouse Design, Data Modeling, Data Analytics, Data Analysis, Big Data, ETL Tools, Data Cleansing, Reports, Azure Data Factory (ADF), Azure Data Lake, Machine Learning, Microsoft Data Transformation Services (now SSIS), APIs, API Integration, Data Profiling, Data Aggregation, Data Migration, Master Data, Azure Databricks, Web Applications
