Senior Data Engineer
2019 - PRESENT
Confidential
- Built an ETL process to migrate data from the mainframe to Oracle, which enabled the client to automate manual processes, eventually saving ten million dollars per year.
- Integrated customer information from the mainframe, SQL sources, blob containers, and flat files into a Hive data lake on a Hadoop cluster using ETL jobs, then transformed the data and ingested it into Oracle.
- Analyzed the performance of Hadoop components (Hive queries and ingestion scripts) and optimized data pipelines. Resolved production defects through root cause analysis and deployed code fixes to production.
- Performed on-call production support (L3) for Hadoop and production deployment activities.
- Served as the lead: performed requirement analysis for epics and data profiling, created user stories, and took part in sprint planning with the client’s scrum master and product manager.
- Created visualization reports with tools such as Tableau and Power BI.
- Built data models around the financial, insurance, and transportation domains from disparate data sources.
- Converted complex reports to work fluidly in Power BI.
Technologies: Microsoft Power BI, Oracle, Unix, Azure, Databricks, SQL, EDL, Python, Spark

Senior Software Engineer
2015 - 2020
Fusion Software Solution
- Re-engineered systems to comply with the GDPR for the European market. Developed a Teradata batch process to transform the staged data and load dimension and fact tables. Created Unix, SQL, and PL/SQL scripts to offload data back to the Hive tables.
- Designed an ETL flow in the Control-M scheduler to trigger batch processes and Informatica jobs. Set up dependencies to prevent data deadlocks and created alerts to notify stakeholders of errors and warnings.
- Integrated data from Digital Insight (an acquired company) into NCR's data warehouse, so that new GL and inventory values flowed into the EDW for reporting.
- Created P/L metrics, user dashboards reporting the most and least profitable customers, and dashboards with YTD revenue and cost metrics by line of business. Reconciled dashboard revenue numbers against reported revenue figures.
- Built, maintained, and tuned Tableau and Power BI dashboards for a broad variety of internal clients.
Technologies: Spark, SAP Business Object Processing Framework (BOPF), Informatica ETL, Tableau, Unix, Teradata, PL/SQL

Senior Software Engineer
2014 - 2015
Accenture
- Gathered and defined business requirements while managing risks to improve business processes, contributing to enterprise architecture development through business analysis and process mapping.
- Managed ETL (Teradata, Informatica, Datastage), SQL and database performance tuning, troubleshooting, support, and capacity estimation to ensure the highest data quality standards.
- Developed Informatica ETL mappings; Teradata BTEQ, FastExport, FastLoad, MultiLoad, and TPT scripts; Oracle PL/SQL scripts; and Unix shell scripts. Optimized SQL queries and ETL mappings to handle large data volumes and complex transformations efficiently.
Technologies: Apache Hive, Hadoop, Business Objectives, Datastage, Teradata, Big Data, Data Warehouse Design

Consultant
2012 - 2014
Capgemini
- Created dashboards in Power BI and Tableau, built to capture a 360-degree view of customer information for a leading bank in Europe.
- Designed and created data models and built batch processes to populate them.
- Performed data manipulation using Power Query on top of views to provide security and improve performance.
Technologies: Azure, Apache Hive, Teradata, Microsoft Power BI, SQL, Spark

Software Engineer
2010 - 2012
Tata Consultancy Services
- Built ETL processes in PostgreSQL to process a huge volume of data.
- Created metadata tables to make bottlenecks easy to identify and built dashboards to highlight them.
- Managed professional services and implemented general ledger reports in Power BI. Improved performance by offloading some advanced calculations from Power BI to the database.
Technologies: Spark, Big Data, Tableau, Python, Scala, Databricks