Data Integration Architect
2014 - PRESENT
NYC Department of Education
- Designed and developed an ELT framework in Azure Synapse Analytics using Data Factory pipelines and data flows to extract data from various on-premises systems into Azure Data Lake, transform it, and load it into the Azure SQL Data Warehouse system.
- Created views and materialized views in Oracle using complex SQL code to support various business reports. Developed stored procedure packages in Oracle and SQL Server to provide data for the school location, staff, student, and adult APIs.
- Developed batch jobs in Scribe Workbench to update the database objects in Microsoft Dynamics 365 CRM for an edtech portal and universal pre-K enrollment outreach program. Implemented a change capture process to improve the CRM upload performance.
- Collaborated with DOE vendors for requirements gathering, analysis, and development to create data feeds from SAS Output Delivery System (ODS) to the vendor.
- Designed data models using erwin Data Modeler for an operational data store (ODS) containing data from multiple sources. Created source-to-target mapping documents with detailed data transformation rules and business logic from source systems to the ODS.
- Developed ETL jobs in IBM DataStage to process real-time student information, such as attendance and Active Directory data, using IBM MQ.
Technologies: Azure Synapse, Azure SQL Data Warehouse (SQL DW), Oracle, SQL Server 2016, DataStage, Scribe Server, ETL, SQL, PL/SQL, Unix Shell Scripting, Azure Data Lake, ETL Tools, Data Pipelines, Parquet, Oracle Database, Data Engineering, Data Migration, Data Cleansing, Databases, Microsoft SQL Server, Azure, Dimensional Modeling, Azure Data Factory
ETL Technical Lead
2013 - 2014
United Guaranty Corp
- Collaborated with business analysts and SAP functional analysts to understand the business requirements and created source-to-target mapping specifications from the legacy transactional system (AS/400) to the SAP pre-processor files.
- Performed data profiling on source data (AS/400) to investigate special characters, nulls, and data anomalies using IBM QualityStage. Designed an error handling engine to perform data validations and error checks and manage and report data errors.
- Developed Unix shell (Bash) scripts for file validation, FTP file transfers, file archiving, error processing, and ETL batch load automation.
Technologies: DataStage, Data Integration, ETL, Unix Shell Scripting, ETL Tools, Data Cleansing
ETL Technical Lead
2009 - 2013
NetApp
- Worked with solution architects to design the data model and the end-to-end ETL design and architecture. Prepared detailed design documents for end-to-end flow from source systems to staging and data warehouse.
- Led a team of developers in multiple data warehouse projects to code, unit test, and deploy ETL jobs. Provided design recommendations and code reviews, helped with root cause analysis, and suggested performance improvements.
- Managed a database production support process addressing the resolution of data issues and enhancements.
- Prepared project plans and coordinated with cross-functional QA and UAT teams, administrators, and database administrators (DBAs) to ensure that the tickets were resolved on time.
Technologies: IBM InfoSphere (DataStage), Oracle, Pentaho, Data Marts, Data Warehousing, SQL Performance, Performance Tuning, ETL Tools, Oracle Database, Data Migration, Data Cleansing, Databases
DataStage ETL Developer
2006 - 2009
The Vanguard Group
- Created ETL job design templates and shared containers in IBM DataStage for loading Type 1 and Type 2 slowly changing dimensions and facts by leveraging a modular design concept. These reusable components reduced ETL development and unit testing effort.
- Fixed performance issues in batch jobs by optimizing SQL, implementing indexing and data partitioning strategies, and tuning ETL code. Removed duplicate data in Type 2 SCD tables, retaining only the relevant historical data.
- Designed and developed jobs to extract, transform, and load data into the data warehouse hub and various data marts from multiple sources like legacy Oracle databases, DB2 EnterpriseDB, and flat files from external vendors.
Technologies: IBM InfoSphere (DataStage), Oracle, SQL, Unix Shell Scripting, Data Marts, Data Warehousing, ETL Implementation & Design, ETL Tools, Oracle Database, Data Cleansing, Databases