Pawan Warade, Developer in Tokyo, Japan

Verified Expert in Engineering

Bio

Pawan is a data engineer with 9+ years of experience in data warehouse design, solution architecture, and offshore team management. He implements end-to-end ETL pipelines in Agile environments with Talend, Informatica, and PySpark and builds modern data lakes on Snowflake. Pawan also works with databases such as Teradata, Greenplum, and Hive and drives informed decisions by crafting insightful visualizations with Tableau and Power BI.

Portfolio

Indo-Sakura Software Japan
Snowflake, Amazon S3 (AWS S3), Talend ETL, Informatica ETL, Azure Data Factory...
Capgemini India
Tableau, Talend ETL, Python, Data Governance
Tata Consultancy Services
Informatica, Teradata, Talend ETL

Experience

  • Talend ETL - 5 years
  • Data Modeling - 4 years
  • Snowflake - 3 years
  • Informatica ETL - 3 years
  • Python - 3 years
  • Business Intelligence (BI) - 3 years
  • Amazon S3 (AWS S3) - 3 years
  • Power BI Desktop - 2 years

Availability

Part-time

Preferred Environment

Snowflake, Talend ETL, Informatica ETL, Azure Data Factory, Azure Databricks, Power BI Desktop, Teradata, Greenplum, Business Intelligence (BI), Data Warehousing

The most amazing...

...thing I've designed is a scalable data lake using Snowflake and Talend, enabling agile, efficient data management and analytics.

Work Experience

Data Engineer

2020 - PRESENT
Indo-Sakura Software Japan
  • Designed comprehensive ETL workflows and documented requirements, converting them into Talend-specific syntax for implementation. Created reusable frameworks in Talend for similar job types, enhancing maintainability and scalability.
  • Developed robust data models, including detailed entity lists and attribute lists, to ensure data integrity and consistency. Implemented data quality checks and validation routines to ensure data accuracy, completeness, and consistency.
  • Developed Talend jobs to efficiently import data from REST APIs and ServiceNow, ensuring seamless data integration. Implemented best practices for optimizing data pipelines, including performance tuning.
  • Successfully migrated the existing HDFS file system to AWS S3, optimizing data storage and access.
  • Designed and implemented Snowflake pipelines using stages, Snowpipe, streams, and tasks to handle semi-structured data ingestion and processing efficiently (a sketch of this pattern follows this role's technology list).
  • Created secure and shared views in Snowflake to facilitate controlled data access and collaboration, and used Snowflake's COPY command to load data with optimal performance.
  • Developed automation scripts in Python and Shell for various data engineering tasks, including data extraction, transformation, and loading (ETL).
  • Identified key performance indicators (KPIs) and created interactive dashboards in Power BI for actionable insights.
  • Analyzed existing Informatica workflows to facilitate their migration to Azure Data Factory, resulting in improved scalability and a 20% reduction in operational costs.
Technologies: Snowflake, Amazon S3 (AWS S3), Talend ETL, Informatica ETL, Azure Data Factory, Big Data, HDFS, Power BI Desktop, Business Intelligence (BI), Microsoft Power BI, Data Analysis, Data Warehousing, Data Architecture, Data Visualization, SQL
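
Below is a minimal sketch of the stage/Snowpipe/stream/task pattern described in this role, issued through the Snowflake Python connector. All object names, the s3_int storage integration, the bucket path, and the credentials are hypothetical placeholders, and auto-ingest assumes the S3 event notifications are already wired up.

```python
# Hypothetical sketch: Snowflake ingestion objects for semi-structured data.
# Pattern: external stage -> Snowpipe (auto-ingest) -> stream -> scheduled task.
import snowflake.connector

DDL = [
    # Landing table: one VARIANT column holds each raw JSON document.
    "CREATE TABLE IF NOT EXISTS raw_events (payload VARIANT)",
    # Flattened target the task writes into.
    "CREATE TABLE IF NOT EXISTS events_flat (id STRING, ts TIMESTAMP_NTZ)",
    # External stage over the S3 prefix upstream jobs write to
    # (s3_int is a placeholder storage integration).
    """CREATE STAGE IF NOT EXISTS raw_stage
         URL = 's3://example-bucket/raw/events/'
         STORAGE_INTEGRATION = s3_int
         FILE_FORMAT = (TYPE = 'JSON')""",
    # Snowpipe loads new files as they land (assumes S3 event notifications).
    """CREATE PIPE IF NOT EXISTS raw_pipe AUTO_INGEST = TRUE AS
         COPY INTO raw_events FROM @raw_stage""",
    # Stream tracks newly loaded rows so downstream work is incremental.
    "CREATE STREAM IF NOT EXISTS raw_events_stream ON TABLE raw_events",
    # Task flattens the semi-structured payload, firing only when the
    # stream has data; consuming the stream advances its offset.
    """CREATE TASK IF NOT EXISTS flatten_events
         WAREHOUSE = etl_wh
         SCHEDULE = '5 MINUTE'
         WHEN SYSTEM$STREAM_HAS_DATA('raw_events_stream')
         AS INSERT INTO events_flat
            SELECT payload:id::STRING, payload:ts::TIMESTAMP_NTZ
            FROM raw_events_stream""",
    "ALTER TASK flatten_events RESUME",  # tasks are created suspended
]

conn = snowflake.connector.connect(
    account="example_account", user="example_user", password="...",
    warehouse="etl_wh", database="analytics", schema="staging",
)
try:
    cur = conn.cursor()
    for stmt in DDL:
        cur.execute(stmt)
finally:
    conn.close()
```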

ETL and BI Developer

2018 - 2019
Capgemini India
  • Contributed to all project activities from the ground up, designing and developing ETL jobs to requirements and specifications using Agile methodology and Jira, ensuring timely and accurate data integration.
  • Designed ETL mappings and workflows to fetch data from multiple sources (e.g., XML, CSV, and TXT files) and databases and load it into relational tables or files, increasing data accessibility by 40%.
  • Optimized ETL jobs for maximum execution speed and data transfer efficiency by enabling multi-threaded execution and applying Talend's optimization and parallelization options, reducing processing time (a rough Python analogue follows this role's technology list).
  • Created Tableau data sources and dashboards based on user requirements, enhancing data visualization and reporting capabilities and leading to a 30% increase in user satisfaction.
  • Implemented data blending, customized queries, and complex table calculations in Tableau, enabling more sophisticated data analysis and insights.
Technologies: Tableau, Talend ETL, Python, Data Governance
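
Talend's multi-thread execution is a job-level setting rather than code, but the underlying idea, running independent I/O-bound extracts concurrently instead of sequentially, can be illustrated with a rough Python analogue. The file names here are hypothetical.

```python
# Rough Python analogue (not Talend itself) of the parallelization idea:
# independent source extracts run concurrently instead of one after another.
import csv
from concurrent.futures import ThreadPoolExecutor

SOURCES = ["customers.csv", "policies.csv", "claims.csv"]  # hypothetical inputs

def extract(path: str) -> int:
    """Read one delimited source file and return its row count."""
    with open(path, newline="", encoding="utf-8") as handle:
        return sum(1 for _ in csv.reader(handle))

# I/O-bound extracts overlap cleanly in a thread pool.
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = dict(zip(SOURCES, pool.map(extract, SOURCES)))

for source, rows in counts.items():
    print(f"{source}: {rows} rows")
```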

ETL Developer

2014 - 2017
Tata Consultancy Services
  • Created mappings and mapplets using various transformations, scheduled sessions in Informatica's Workflow Manager, modified Teradata structures, and developed shell scripts for data file management.
  • Converted Informatica ETL jobs to Talend and migrated Teradata BTEQ scripts to Greenplum. Managed Talend jobs through Job Conductor and developed jobs for various file formats, automating FTP data retrieval.
  • Collaborated closely with colleagues, quickly grasping complex concepts and efficiently resolving bugs to enhance project quality.
Technologies: Informatica, Teradata, Talend ETL

Projects

X360 Data Lake

I developed an ETL design document—converting requirements into Talend syntax—and created a data model with entity and attribute lists for structured data management. To enhance efficiency, I established a reusable Talend framework for similar job types and built Talend jobs for importing data from REST APIs and ServiceNow. I managed the migration of HDFS file systems to Amazon S3, improving scalability and storage efficiency, and led the migration of SQL Server, Oracle database objects, and Hive tables to Snowflake, enhancing data warehouse capabilities.
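
As an illustration of the REST ingestion described above, here is a hypothetical Python analogue of such a Talend job: it pages through a ServiceNow-style table API and lands each page as JSON in S3. The endpoint, bucket, prefix, and credentials are all placeholders.

```python
# Hypothetical sketch of a REST-to-S3 ingestion job: page through a
# ServiceNow table API and land each page as a JSON object in S3.
import json
import boto3
import requests

API_URL = "https://example.service-now.com/api/now/table/incident"  # placeholder
BUCKET, PREFIX = "example-data-lake", "raw/servicenow/incident"      # placeholders

s3 = boto3.client("s3")
offset, page_size, page = 0, 1000, 0
while True:
    resp = requests.get(
        API_URL,
        params={"sysparm_offset": offset, "sysparm_limit": page_size},
        auth=("api_user", "..."),  # placeholder credentials
        timeout=30,
    )
    resp.raise_for_status()
    records = resp.json().get("result", [])
    if not records:  # no more pages
        break
    s3.put_object(
        Bucket=BUCKET,
        Key=f"{PREFIX}/page-{page:05d}.json",
        Body=json.dumps(records).encode("utf-8"),
    )
    offset += page_size
    page += 1
```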

Additionally, I orchestrated Talend job execution using Docker images in Amazon ECS containers, leveraging AWS Step Functions and the Amazon EventBridge scheduler for automation. I created a Snowflake pipeline using stages, Snowpipe, streams, and tasks for semi-structured data processing and developed secure, shared views in Snowflake to ensure data accessibility and security. I also used Snowflake's COPY command for efficient data loading and managed an offshore team using Agile methodology to ensure project milestones were met. Finally, I identified KPIs and developed dashboards in Power BI for data visualization and analysis.
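
The ECS orchestration can be sketched with boto3 launching the containerized Talend job on Fargate; in the project, a call like this sat behind a Step Functions state machine on an EventBridge schedule. All names, ARNs, and subnet IDs are placeholders.

```python
# Minimal sketch (hypothetical names/ARNs): launch a containerized Talend
# job as a one-off ECS Fargate task.
import boto3

ecs = boto3.client("ecs", region_name="ap-northeast-1")

response = ecs.run_task(
    cluster="etl-cluster",                  # placeholder cluster
    taskDefinition="talend-x360-job:1",     # placeholder task definition
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],  # placeholder subnet
            "assignPublicIp": "DISABLED",
        }
    },
    # Pass the job to run as an environment variable override.
    overrides={
        "containerOverrides": [
            {"name": "talend-job",
             "environment": [{"name": "JOB_NAME", "value": "load_x360"}]}
        ]
    },
)
print(response["tasks"][0]["taskArn"])  # track the launched task
```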

Insurance Campaign Management

In the insurance domain, we undertook a data engineering project to streamline the capture and processing of data from insurance campaigns conducted by agents.

Data Transformation and Storage:
• Utilized PySpark to convert unstructured data into structured CSV format (see the PySpark sketch after this list).
• Implemented robust data transformation techniques to ensure data quality and consistency.
• Uploaded the transformed CSV files to Azure Data Lake Storage, providing scalable and secure data storage.
• Leveraged the capabilities of Azure Data Lake for efficient data management and retrieval.
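
A hypothetical sketch of the PySpark conversion step from the first bullet above; the pipe-delimited input layout and the ADLS paths are illustrative assumptions, not the project's actual schema.

```python
# Hypothetical sketch: parse raw campaign log lines into typed columns
# and write them out as CSV on Azure Data Lake Storage.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("campaign-structuring").getOrCreate()

# Raw export: one pipe-delimited record per line (assumed layout).
raw = spark.read.text("abfss://raw@examplelake.dfs.core.windows.net/campaigns/")

parts = F.split(F.trim(raw["value"]), r"\|")
structured = (
    raw.select(
        parts.getItem(0).alias("agent_id"),
        parts.getItem(1).alias("campaign_id"),
        F.to_date(parts.getItem(2), "yyyy-MM-dd").alias("contact_date"),
        parts.getItem(3).cast("int").alias("policies_sold"),
    )
    # Basic quality gate: drop rows that failed to parse.
    .where(F.col("agent_id").isNotNull() & F.col("contact_date").isNotNull())
)

(structured.write.mode("overwrite")
    .option("header", True)
    .csv("abfss://curated@examplelake.dfs.core.windows.net/campaigns_csv/"))
```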

ETL Processing:
• Conducted ETL (extract, transform, load) operations on the stored files using Azure Data Factory.
• Loaded the processed data into SQL Server for further analysis and reporting (a minimal pipeline-trigger sketch follows this list).
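
Triggering and monitoring such an Azure Data Factory pipeline run from Python might look like the following sketch; the subscription ID, resource group, factory, pipeline name, and parameter are placeholders.

```python
# Hypothetical sketch: kick off an existing ADF pipeline run and check
# its status via the azure-mgmt-datafactory SDK.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = DefaultAzureCredential()
adf = DataFactoryManagementClient(credential, "00000000-0000-0000-0000-000000000000")

# Start a run of the (placeholder) pipeline that loads SQL Server.
run = adf.pipelines.create_run(
    "rg-insurance",            # placeholder resource group
    "adf-campaigns",           # placeholder data factory
    "pl_load_sqlserver",       # placeholder pipeline name
    parameters={"load_date": "2024-01-31"},
)

# Poll the run once; a real job would loop until a terminal status.
pipeline_run = adf.pipeline_runs.get("rg-insurance", "adf-campaigns", run.run_id)
print(pipeline_run.status)
```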

Collaboration with BI Team:
• Assisted the Power BI team in accessing the processed data.
• Created views and data marts to facilitate the creation of insightful and interactive dashboards.
Education

2010 - 2014

Bachelor's Degree in Information Technology

Nagpur University - Nagpur, India

Certifications

JUNE 2023 - PRESENT

Microsoft Certified: Azure Fundamentals

Microsoft

Tools

Talend ETL, Informatica ETL, Power BI Desktop, Microsoft Power BI, Tableau

Languages

Snowflake, SQL, Python

Paradigms

Dimensional Modeling, Business Intelligence (BI), Software Testing

Storage

Amazon S3 (AWS S3), Database Management Systems (DBMS), Teradata, Greenplum, HDFS

Frameworks

Data Lakehouse, Spark

Platforms

Azure

Other

Data Modeling, Data Warehousing, Big Data, Data Analysis, Data Governance, Data Visualization, Azure Data Factory, Azure Databricks, Software Development, Informatica, Slowly Changing Dimensions (SCD), Data Architecture
