Kishore Kannapurakkaran

Verified Expert in Engineering

Data Integration Architect and Developer

Location
Hackensack, NJ, United States
Toptal Member Since
April 18, 2022

Kishore is a data integration architect with over 15 years of experience designing and developing ETL and ELT pipelines and processes to implement data warehouses, data marts, and data lakes. He has extensive experience with Azure Synapse Analytics, IBM DataStage, Pentaho Data Integration, and database systems such as Azure SQL Data Warehouse, SQL Server, Oracle, and IBM Db2. Kishore also has deep experience developing and maintaining stored procedures and packages in SQL Server and Oracle.

Portfolio

NYC Department of Education
Azure Synapse, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW)...
United Guaranty Corp
DataStage, Data Integration, ETL, Unix Shell Scripting, ETL Tools...
NetApp
IBM InfoSphere (DataStage), Oracle, Pentaho, Data Marts, Data Warehousing...

Experience

Availability

Part-time

Preferred Environment

Azure Synapse, Linux, Python, IBM InfoSphere (DataStage), Oracle, Azure SQL, Pentaho, Scribe Server, SQL Server 2016, Windows

The most amazing...

...thing I've developed is an ELT data pipeline architecture that leverages reusable generic data flows in Azure Synapse Analytics to load tables in an ODS.

Work Experience

Data Integration Architect

2014 - PRESENT
NYC Department of Education
  • Designed and developed an ELT framework in Azure Synapse Analytics using data factory pipelines and data flows to extract data from various on-premises systems into Azure Data Lake, transform it, and load it into the Azure SQL Data Warehouse system (a sketch of this load pattern follows the technology list below).
  • Created views and materialized views in Oracle using complex SQL code to support various business reports. Developed stored procedure packages in Oracle and SQL Server to provide data for the school location, staff, student, and adult APIs.
  • Developed batch jobs in Scribe Workbench to update the database objects in Microsoft Dynamics 365 CRM for an edtech portal and universal pre-K enrollment outreach program. Implemented a change capture process to improve the CRM upload performance.
  • Collaborated with DOE vendors on requirements gathering, analysis, and development to create data feeds from the operational data store (ODS) to each vendor.
  • Designed data models using erwin Data Modeler for an operational data store containing data from multiple sources. Created source-to-target mapping documents with detailed data transformation rules and business logic from source systems to the ODS.
  • Developed ETL jobs in IBM DataStage to process real-time student information, such as attendance and Active Directory data, using IBM MQ.
Technologies: Azure Synapse, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, Oracle, SQL Server 2016, DataStage, Scribe Server, ETL, SQL, PL/SQL, Unix Shell Scripting, Azure Data Lake, ETL Tools, Data Pipelines, Parquet, Oracle Database, Data Engineering, Data Migration, Data Cleansing, Databases, Microsoft SQL Server, Azure, Dimensional Modeling, Azure Data Factory
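
A minimal sketch of that reusable load pattern, assuming a SQL Server-style target; the stg.Student and ods.Student tables and their columns are hypothetical stand-ins, and engines without MERGE can use a paired UPDATE and INSERT instead:

    -- Hypothetical upsert from a lake-staged table into the ODS.
    -- stg.Student is assumed to be freshly copied from Azure Data Lake.
    MERGE ods.Student AS tgt
    USING stg.Student AS src
        ON tgt.StudentId = src.StudentId
    WHEN MATCHED AND (tgt.FirstName <> src.FirstName
                   OR tgt.LastName  <> src.LastName
                   OR tgt.SchoolDbn <> src.SchoolDbn) THEN
        UPDATE SET tgt.FirstName = src.FirstName,
                   tgt.LastName  = src.LastName,
                   tgt.SchoolDbn = src.SchoolDbn,
                   tgt.UpdatedAt = SYSUTCDATETIME()
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (StudentId, FirstName, LastName, SchoolDbn, UpdatedAt)
        VALUES (src.StudentId, src.FirstName, src.LastName,
                src.SchoolDbn, SYSUTCDATETIME());

Because only the key and column lists vary per table, a statement of this shape can be generated generically, which is how reusable data flows can load many ODS tables without a hand-built mapping for each one.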

ETL Technical Lead

2013 - 2014
United Guaranty Corp
  • Collaborated with business analysts and SAP functional analysts to understand the business requirements and created source-to-target mapping specifications from the legacy transactional system (AS/400) to the SAP pre-processor files.
  • Performed data profiling on source data (AS/400) to investigate special characters, nulls, and data anomalies using IBM QualityStage (a sample profiling query follows the technology list below). Designed an error-handling engine to perform data validations and error checks and to manage and report data errors.
  • Developed Unix shell (Bash) scripts for file validation, FTP transfers, file archiving, error processing, and ETL batch load automation.
Technologies: DataStage, Data Integration, ETL, Unix Shell Scripting, ETL Tools, Data Cleansing
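
A sample of the kind of per-column profiling query that pass would generate; the stg.policy_master table and its column are hypothetical, and the bracketed character class is SQL Server LIKE syntax:

    -- Hypothetical profiling of one AS/400-sourced column: total rows,
    -- null rows, and rows containing characters outside a safe set.
    SELECT COUNT(*) AS total_rows,
           SUM(CASE WHEN policy_holder_name IS NULL
                    THEN 1 ELSE 0 END) AS null_rows,
           SUM(CASE WHEN policy_holder_name LIKE '%[^a-zA-Z0-9 .,-]%'
                    THEN 1 ELSE 0 END) AS special_char_rows
    FROM   stg.policy_master;

Rows flagged this way are the kind the error-handling engine would reject and report rather than pass on to the SAP pre-processor files.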

ETL Technical Lead

2009 - 2013
NetApp
  • Worked with solution architects to design the data model and the end-to-end ETL architecture. Prepared detailed design documents covering the full flow from source systems to staging and the data warehouse.
  • Led a team of developers on multiple data warehouse projects to code, unit test, and deploy ETL jobs. Provided design recommendations and code reviews, helped with root cause analysis, and suggested performance improvements.
  • Managed a database production support process, resolving data issues and delivering enhancements.
  • Prepared project plans and coordinated with cross-functional QA and UAT teams, system administrators, and database administrators (DBAs) to ensure that tickets were resolved on time.
Technologies: IBM InfoSphere (DataStage), Oracle, Pentaho, Data Marts, Data Warehousing, SQL Performance, Performance Tuning, ETL Tools, Oracle Database, Data Migration, Data Cleansing, Databases

DataStage ETL Developer

2006 - 2009
The Vanguard Group
  • Created ETL job design templates and shared containers in IBM DataStage for loading Type 1 and Type 2 slowly changing dimensions and facts, leveraging a modular design concept. These reusable components saved ETL development and unit testing effort (a sketch of the Type 2 pattern follows the technology list below).
  • Fixed performance issues in batch jobs by optimizing SQL, implementing indexing and data partitioning strategies, and tuning ETL code. Removed duplicate data in Type 2 SCD tables, retaining only the relevant historical data.
  • Designed and developed jobs to extract, transform, and load data into the data warehouse hub and various data marts from multiple sources like legacy Oracle databases, DB2 EnterpriseDB, and flat files from external vendors.
Technologies: IBM InfoSphere (DataStage), Oracle, SQL, Unix Shell Scripting, Data Marts, Data Warehousing, ETL Implementation & Design, ETL Tools, Oracle Database, Data Cleansing, Databases
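
A minimal sketch of the Type 2 pattern those templates encapsulate, using a hypothetical dim.Customer dimension and stg.Customer daily feed (SQL Server syntax):

    -- Step 1: expire the current version of any row whose tracked
    -- attributes changed in today's feed.
    UPDATE d
    SET    d.EffectiveEnd = CAST(GETDATE() AS DATE),
           d.IsCurrent    = 0
    FROM   dim.Customer AS d
    JOIN   stg.Customer AS s ON s.CustomerId = d.CustomerId
    WHERE  d.IsCurrent = 1
      AND (d.Segment <> s.Segment OR d.Region <> s.Region);

    -- Step 2: insert a fresh current version for every key that now has
    -- no current row (keys expired in step 1 plus brand-new keys).
    INSERT INTO dim.Customer
           (CustomerId, Segment, Region, EffectiveStart, EffectiveEnd, IsCurrent)
    SELECT s.CustomerId, s.Segment, s.Region,
           CAST(GETDATE() AS DATE), '9999-12-31', 1
    FROM   stg.Customer AS s
    LEFT JOIN dim.Customer AS d
           ON d.CustomerId = s.CustomerId AND d.IsCurrent = 1
    WHERE  d.CustomerId IS NULL;

A Type 1 template has the same shape minus the history: changed attributes are simply overwritten in place.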

DaaP Data Lakehouse

This data lakehouse is a centralized data hub on the Azure cloud platform that supplies data to multiple data marts and provides data feeds for external vendors and internal applications (APIs). I designed the ELT framework to fetch data from on-premises systems into the data lake, apply business rules, and load it into Azure SQL database tables. I also created a source-to-target mapping document with detailed transformation rules.
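
As a sketch of the lake-to-warehouse hop, this is how one Parquet extract might land in a staging table using Synapse's COPY INTO; the storage account, container, and table names are hypothetical:

    -- Hypothetical: land a Parquet extract from the data lake into staging
    -- before business rules are applied (Azure Synapse dedicated SQL pool).
    COPY INTO stg.SchoolLocation
    FROM 'https://daaplake.dfs.core.windows.net/raw/school_location/*.parquet'
    WITH (
        FILE_TYPE  = 'PARQUET',
        CREDENTIAL = (IDENTITY = 'Managed Identity')
    );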

Edtech CRM Project

Built on the Microsoft Dynamics 365 CRM platform, this application lets schools' technical support teams access crucial consolidated tech information for each school. I designed and developed an ETL change capture process using IBM DataStage to update a SQL Server data repository that stores the datasets to be uploaded into CRM. I also developed data transformations in Scribe Insight to upload the daily changes into Dynamics 365 CRM.
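
The change capture step in miniature: only rows that differ from the last snapshot uploaded to CRM are staged for the next Scribe run; all table names here are hypothetical:

    -- Hypothetical delta detection: anything in today's extract that is
    -- new or changed relative to the last uploaded snapshot gets queued.
    INSERT INTO crm.UploadQueue (SchoolDbn, SupportContact, DeviceCount)
    SELECT SchoolDbn, SupportContact, DeviceCount
    FROM   stg.SchoolTechToday
    EXCEPT
    SELECT SchoolDbn, SupportContact, DeviceCount
    FROM   crm.LastUploadedSnapshot;

Shipping only this delta instead of the full dataset is the improvement the change capture process targets.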

SAP FS-CD

I built a data integration (DI) layer using IBM DataStage that fed data into the SAP FS-CD system, replacing the existing general ledger (GL), finance and administration (FA), accounts receivable (AR), premium processing, collection and disbursement, and advanced placement (AP) applications. The data is sourced from a legacy IBM AS/400 system. The DI interfaces include data for lenders, borrowers, loan certificates, premiums, claims, refunds, and contract underwriting.

Pentaho ETL Conversion

I converted around 2,500 ETL jobs and sequences from IBM InfoSphere DataStage to the open-source tool Pentaho Data Integration (PDI), including complex ETL jobs and the logging, auditing, and error-handling components, which were rebuilt as PDI transformations and jobs.

AutoSupport (ASUP) Integration

I integrated AutoSupport log data from the install base system with enterprise business intelligence (BI) analytics to provide visibility into the current operational state and configuration of active installed products. This offers much deeper insight into how client systems work and how customers behave, and it helps identify cross-selling, up-selling, and tech-refresh opportunities.

Small Business Service Hub and Data Mart

I created an analytical reporting mart to track and trend the state of the business, i.e., to identify trends in total assets under management, the number of accounts, plan types, cash flow, business line growth, plan movement, and more. I designed and developed ETL batch jobs in IBM DataStage to perform daily change data capture and to load Type 1 and Type 2 slowly changing dimensions and fact tables.

Analytics Enablement Hub

This project re-architected a legacy retail data warehouse into a hub-and-spoke architecture. Its goals were:

• Reducing the cost and delivery time of data warehouse development.
• Automating the decision science process and responding faster to new decision science requests.
• Using predictive models to drive the service model and show measurable results.
• Filling data gaps in the data warehouse to support strategic questions involving the web, call center, mail, and email.
• Improving campaign targeting, selection, and measurement.

Languages

SQL, Python

Tools

IBM InfoSphere (DataStage), Scribe Server

Paradigms

ETL, Dimensional Modeling, ETL Implementation & Design

Platforms

Oracle, Oracle Database, Azure Synapse, Linux, Pentaho, Azure SQL Data Warehouse, Azure, Dedicated SQL Pool (formerly SQL DW), Windows

Storage

SQL Server 2016, Databases, Microsoft SQL Server, Azure SQL, Data Pipelines, PL/SQL, SQL Performance, Data Integration, DB2/400, DataStage

Other

Data Warehousing, ETL Tools, Data Cleansing, Programming, Data Migration, Azure Data Lake, Parquet, Data Engineering, Azure Data Factory, Computer Automation Design (CAD), Data Marts, erwin Data Modeler, Dynamics CRM 365, Change Data Capture, Unix Shell Scripting, Slowly Changing Dimensions (SCD), Performance Tuning

2004 - 2006

Master's Degree in Engineering

University of Illinois Chicago - Chicago, Illinois, United States

1998 - 2002

Bachelor's Degree in Engineering

National Institute of Technology - Calicut, India
