Satyanarayana Annepogu, Developer in Toronto, ON, Canada

Satyanarayana Annepogu

Verified Expert in Engineering

Database Developer

Location
Toronto, ON, Canada
Toptal Member Since
October 25, 2022

Satya is a senior data engineer with over 15 years of IT experience designing and developing data warehouses for banking and insurance clients. He specializes in designing and building modern data pipelines and streams using the AWS and Azure data engineering stacks. Satya is an expert in modernizing enterprise data solutions with AWS and Azure cloud data technologies.

Portfolio

Millicom International Cellular SA - Main
Data Engineering, Amazon Web Services (AWS), Big Data, Spark, SQL, Python...
Heimstaden Services AB
Azure Data Factory, Data Engineering, Data Pipelines, SQL...
IBM
Amazon CloudWatch, Amazon RDS, Amazon S3 (AWS S3), Amazon EC2...

Availability

Part-time

Preferred Environment

Apache Airflow, AWS Glue, Azure Synapse, ETL Implementation & Design, Amazon S3 (AWS S3), Databricks, AWS Lambda, Python 3, Data Engineering, Big Data, Python, APIs, REST APIs, SSH

The most amazing...

...project I've done is designing, developing, and supporting cloud-based and traditional data warehouse applications.

Work Experience

Data Engineer

2023 - 2023
Millicom International Cellular SA - Main
  • Orchestrated complex data workflows using AWS Glue and Apache Airflow, ensuring the efficient and timely execution of ETL processes.
  • Implemented dynamic and scalable data pipelines that seamlessly adapt to fluctuations in data volume, enhancing system reliability and performance.
  • Architected Lambda functions to enable real-time data processing, providing instant insights and analytics capabilities.
  • Established event-driven architectures, allowing for automatic scaling and resource optimization, resulting in a responsive and cost-effective solution.
  • Implemented S3 as a centralized data repository, optimizing storage costs and streamlining data accessibility. Utilized S3 features such as versioning and lifecycle policies to ensure data integrity and efficient data lifecycle management.
  • Developed and applied intricate business rules within the data processing pipeline, enriching the analytical layer with meaningful insights.
  • Collaborated closely with business stakeholders to understand and implement domain-specific rules, ensuring the processed data aligns precisely with business requirements.
  • Conducted thorough performance optimizations, fine-tuning AWS Glue jobs and Airflow DAGs to maximize processing speed and resource efficiency.
  • Implemented scalable solutions to accommodate future data growth, providing a foundation for long-term sustainability and adaptability.
Technologies: Data Engineering, Amazon Web Services (AWS), Big Data, Spark, SQL, Python, Scala, Apache Kafka, AWS Lambda, AWS Glue, Amazon S3 (AWS S3), Data Transformation, Big Data Architecture, Amazon RDS, Message Queues, Redshift, Amazon Athena, Amazon Elastic MapReduce (EMR), APIs, REST APIs, Linux, SSH, PySpark, EMR Studio

Data Analyst

2022 - 2023
Heimstaden Services AB
  • Acted as a senior data engineer with demonstrated analyst skills and worked on ETL architecture solutions.
  • Performed requirements assessments and designed suitable data flows or data batches.
  • Handled solutions optimization and end-to-end data pipelines with data integrity.
  • Designed and developed ETL processes in AWS Glue to migrate campaign and API data with various file types (JSON, ORC, and Parquet) into Amazon Redshift.
  • Designed and developed ETL processes to extract Salesforce data and load it into Amazon Redshift.
Technologies: Azure Data Factory, Data Engineering, Data Pipelines, SQL, Business Intelligence (BI), ETL Tools, Scripting Languages, APIs, Data Wrangling, Amazon S3 (AWS S3), AWS Lambda, Spark, AWS Glue, Amazon EC2, Amazon Elastic MapReduce (EMR), Amazon RDS, Redshift, SQL Stored Procedures, Amazon Aurora, Apache Airflow, Data Analysis, Data Analytics, Amazon CloudWatch, Amazon QuickSight, AWS Data Pipeline Service, PostgreSQL 10, Dedicated SQL Pool (formerly SQL DW), Azure SQL Data Warehouse, PostgreSQL, Database Optimization, Database Architecture, XML, CI/CD Pipelines, GitHub, Excel 2016, Tableau, Data Build Tool (dbt), NoSQL, Webhooks, BI Reporting, Database Migration, CDC, Data-driven Dashboards, DAX, Microsoft Power BI, Business Services, Apache Spark, Database Design, Database Structure, Database Transactions, Transactions, MySQL, Microsoft Excel, Real Estate, Geospatial Data, OLTP, OLAP, DevOps, Data, Database Lifecycle Management (DLM), ETL Pipelines, Cloud, Jira, Entity Relationships, Amazon SageMaker, Data Feeds, Data Extraction, ETL Implementation & Design, ETL Testing, Big Data, Delta Lake, Spark SQL, Apache Kafka, Message Queues, REST APIs, SSH, PySpark

AWS Data Engineer

2020 - 2022
IBM
  • Designed and implemented data pipelines using AWS services such as S3, Glue, and Redshift.
  • Developed and maintained data processing and transformation scripts using Python and SQL. Optimized data storage and retrieval using AWS database services such as RDS and DynamoDB.
  • Built and maintained data warehouses and data lakes using Amazon Redshift and Athena.
  • Implemented data security and access controls using AWS IAM and KMS. Monitored and troubleshot data pipelines and systems using AWS CloudWatch and other monitoring tools.
  • Collaborated with data scientists and analysts to provide data insights and support their data needs.
  • Automated data processing and deployment using AWS Lambda and other serverless technologies.
  • Developed and maintained ETL workflows using AWS Step Functions and other workflow tools. Stayed up-to-date with the latest AWS data services and technologies and recommended new solutions to improve data engineering processes.
Technologies: Amazon CloudWatch, Amazon RDS, Amazon S3 (AWS S3), Amazon EC2, Amazon Web Services (AWS), AWS Glue, AWS IAM, Redshift, Amazon DynamoDB, Python, SQL, PostgreSQL 10, PostgreSQL, Database Optimization, Lambda Functions, Database Architecture, Elasticsearch, AWS Cloud Architecture, XML, CI/CD Pipelines, GitHub, Excel 2016, Tableau, NoSQL, Webhooks, BI Reporting, CDC, Business Services, Apache Spark, Database Design, Database Structure, Database Transactions, Transactions, Microsoft Excel, OLTP, OLAP, DevOps, Identity & Access Management (IAM), Data, Database Lifecycle Management (DLM), ETL Pipelines, Cloud, Jira, Entity Relationships, Data Feeds, Data Extraction, Leadership, ETL Implementation & Design, Big Data, Delta Lake, Spark SQL, Apache Kafka, Message Queues, APIs, REST APIs, Linux, SSH, PySpark

Azure Data Engineer and Data Warehouse Consultant

2018 - 2020
IBM
  • Designed and developed data ingestion pipelines using ADF and a processing layer using Databricks and notebooks with PySpark. Led the planning, development, testing, implementation, documentation, and support of data pipelines.
  • Implemented various aspects of the project, including pausing and resuming the Azure SQL data warehouse using ADF, ADF pipelines with business-rule use cases as reusable assets, and ingestion of CSV, fixed-width, and Excel files.
  • Collaborated with a client and IBM ETL teams, analyzed on-premises Informatica-based ETL solutions, and designed ETL solutions using Azure Data Factory pipelines and Azure Databricks PySpark and Spark SQL.
  • Worked with technical and product stakeholders to understand data-oriented project requirements and helped implement the solution's Azure infrastructure components to create the first usable iteration of the CPD application.
  • Orchestrated and automated pipeline POCs with Apache Spark, using PySpark and Spark SQL for various complex data transformation requirements.
  • Used PowerShell scripts to automate pipelines and tuned pipeline performance in Azure Data Factory and Azure Databricks.
Technologies: Autosys, Azure Data Factory, Azure Databricks, Azure SQL, Azure SQL Databases, Azure Synapse, Data Engineering, SQL, Data Pipelines, JSON, ETL, T-SQL (Transact-SQL), Python, Pipelines, Data Management, Azure, Dimensional Modeling, Data Lakes, Data Architecture, Microsoft SQL Server, Migration, Query Composition, Performance Tuning, Data Warehouse Design, Data Warehousing, Databricks, Relational Databases, Databases, Analytics, Azure Data Explorer, Consulting, Python 3, CSV File Processing, XLSX File Processing, CSV, Postman, Business Intelligence (BI), ETL Tools, Data Migration, Scripting Languages, Orchestration, Machine Learning, APIs, Technical Project Management, Kanban, ETL Development, Data Wrangling, Amazon S3 (AWS S3), Big Data, AWS Lambda, Spark, AWS Glue, Data Transformation, Amazon EC2, Amazon Elastic MapReduce (EMR), Amazon RDS, Redshift, SQL Stored Procedures, Normalization, Scala, Shell Scripting, Architecture, Data Integration, Google Cloud Platform (GCP), Amazon Aurora, Apache Airflow, Data Analysis, Data Analytics, Pandas, Amazon Web Services (AWS), AWS IAM, Amazon CloudWatch, Amazon DynamoDB, PostgreSQL 10, Azure SQL Data Warehouse, Dedicated SQL Pool (formerly SQL DW), PostgreSQL, Database Optimization, Database Architecture, XML, CI/CD Pipelines, GitHub, Excel 2016, Tableau, Data Build Tool (dbt), NoSQL, Database Migration, Data-driven Dashboards, DAX, Microsoft Power BI, Business Services, Apache Spark, Database Design, Database Structure, Database Transactions, Transactions, MySQL, Microsoft Excel, OLTP, OLAP, Data, Database Lifecycle Management (DLM), ETL Pipelines, Cloud, Jira, Entity Relationships, Data Extraction, Leadership, ETL Implementation & Design, ETL Testing, Delta Lake, Spark SQL, Apache Kafka, Message Queues, REST APIs, SSH, SQL Server Integration Services (SSIS), PySpark

Senior ETL Consultant and Team Lead

2009 - 2018
IBM
  • Developed solutions in a highly demanding environment and provided hands-on guidance to other team members. Headed complex ETL requirements and design and assessed requirements for completeness and accuracy.
  • Implemented Informatica-based ETL solution fulfilling stringent performance requirements. Collaborated with product development teams and senior designers to develop architectural requirements to ensure client satisfaction with the product.
  • Determined if requirements were actionable for the ETL team and conducted an impact assessment to determine the size of effort based on needs.
  • Developed entire Software Development Lifecycle (SDLC) project plans to implement ETL solutions and identify resource requirements.
  • Assisted with and verified solution design and the production of all design-phase deliverables. Managed the build phase and quality-assured code to ensure it fulfilled requirements and adhered to the ETL architecture. Resolved difficult design and development issues.
  • Provided the team with the vision of the project's objectives, ensured discussions and decisions led toward closure, and maintained healthy group dynamics.
  • Familiarized the team with customer needs, specifications, design targets, development process, design standards, techniques, and tools to support task performance.
  • Performed an active, leading role in shaping and enhancing overall ETL Informatica architecture. Identified, recommended, and implemented ETL process and architecture improvements.
Technologies: Informatica ETL, Netezza, Autosys, Unix Shell Scripting, IBM Db2, Data Engineering, SQL, Data Pipelines, JSON, ETL, Pipelines, Data Management, Informatica, Informatica Cloud, Data Modeling, Dimensional Modeling, PL/SQL, Data Architecture, Query Optimization, Query Composition, Performance Tuning, Data Warehousing, Relational Databases, Databases, Analytics, Consulting, XLSX File Processing, CSV, Business Intelligence (BI), ETL Tools, Scripting Languages, Orchestration, Technical Project Management, Kanban, ETL Development, Data Wrangling, SQL Stored Procedures, Normalization, Shell Scripting, Architecture, Data Analysis, Data Analytics, Excel Macros, Pandas, Amazon Web Services (AWS), AWS IAM, Amazon CloudWatch, Amazon QuickSight, AWS Data Pipeline Service, Database Optimization, Database Architecture, Oracle PL/SQL, PL/SQL Tuning, CI/CD Pipelines, Excel 2016, Database Administration (DBA), Database Structure, Database Transactions, Transactions, MySQL, Microsoft Excel, OLTP, OLAP, Data, Database Lifecycle Management (DLM), ETL Pipelines, Cloud, Jira, Entity Relationships, Leadership, ETL Implementation & Design, Big Data, Delta Lake, Spark SQL, SQL Server Integration Services (SSIS)

Senior ETL Developer

2008 - 2009
Genesys
  • Developed mappings for Type 2 dimensions to update existing rows and insert new rows in targets. Worked with Actuate to format reports related to different processes.
  • Created and developed Actuate reports such as drill-up and drill-down, series, and parallel reports. Analyzed the number of reports generated, failed, waiting, and scheduled.
  • Built dashboards for generated, failed, waiting, and scheduled reports concerning quarter-hour, hour, day, month, and year.
Technologies: Informatica ETL, Unix Shell Scripting, Control-M, Data Engineering, SQL, Data Pipelines, JSON, ETL, Pipelines, Data Management, Informatica, PL/SQL, Data Architecture, Query Optimization, Query Composition, Performance Tuning, Data Warehouse Design, Data Warehousing, Relational Databases, Databases, CSV, ETL Tools, Orchestration, Kanban, ETL Development, Data Wrangling, SQL Stored Procedures, Shell Scripting, Data Integration, Excel Macros, Database Optimization, Oracle PL/SQL, PL/SQL Tuning, Excel 2016, Database Transactions, Microsoft Excel, OLTP, OLAP, Data, Database Lifecycle Management (DLM), ETL Pipelines, eCommerce, Data Extraction, ETL Implementation & Design, Spark SQL

Senior ETL Developer

2007 - 2008
Magna Infotech Ltd
  • Managed ETL development and data warehousing application support activities.
  • Acquired hands-on experience ranging from dimensional modeling to ETL design.
  • Developed mappings for Type 2 dimensions to update existing rows and insert new ones in targets.
Technologies: Informatica ETL, Unix Shell Scripting, Oracle, Data Engineering, SQL, Data Pipelines, ETL, Pipelines, Data Management, Informatica, Dimensional Modeling, PL/SQL, Data Architecture, Query Composition, Performance Tuning, Data Warehouse Design, Data Warehousing, Relational Databases, Databases, ETL Tools, ETL Development, SQL Stored Procedures, Excel Macros, Oracle PL/SQL, PL/SQL Tuning, ETL Pipelines, Data Extraction, ETL Implementation & Design
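As an illustration of the Type 2 dimension mapping mentioned in this role, the update-and-insert logic can be sketched in plain Python. This is a minimal sketch, not the Informatica mapping itself: the `DimRow` columns, the `apply_scd2` helper, and the single tracked attribute are all hypothetical, and "updating an existing row" is assumed here to mean closing out the current version before a new one is inserted.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DimRow:
    # Hypothetical dimension row; column names are illustrative only.
    customer_id: int            # natural (business) key
    city: str                   # the one attribute tracked for changes
    valid_from: date
    valid_to: Optional[date]    # None while the row is the current version
    is_current: bool


def apply_scd2(dim: List[DimRow], incoming: Dict[int, str], load_date: date) -> List[DimRow]:
    """Return a new dimension table after applying one batch of source records.

    `incoming` maps each natural key to its latest tracked attribute value.
    """
    current_keys = {row.customer_id for row in dim if row.is_current}
    result: List[DimRow] = []
    for row in dim:
        new_city = incoming.get(row.customer_id)
        if row.is_current and new_city is not None and new_city != row.city:
            # Attribute changed: update the existing row by expiring it...
            result.append(replace(row, valid_to=load_date, is_current=False))
            # ...and insert a new current version carrying the new value.
            result.append(DimRow(row.customer_id, new_city, load_date, None, True))
        else:
            # Unchanged or historical rows are carried over as they are.
            result.append(row)
    # Keys with no current row yet are inserted as new current rows.
    for key, city in incoming.items():
        if key not in current_keys:
            result.append(DimRow(key, city, load_date, None, True))
    return result
```

Called with a one-row dimension and a batch containing one changed key and one new key, the helper returns three rows: the expired original, its replacement, and the new key's first version. History is preserved because rows are never overwritten in place.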

Tool Client Rate (TCR) Desk

TCR Desk is a web-based tool providing authoritative cash management pricing arrangements and contact information for mid and large corporate client segments. The business contact center, relationship managers, and cash management sales personnel utilize the application.

The TCR Desk application migration solution follows the best practices of the Azure Well-Architected Framework, in compliance with the client's Azure Service Governance rules, to make the solution secure, resilient, highly available, and scalable. These design principles are for implementation in the client's Azure production environment. The same design will be implemented in the disaster recovery and lower environments, but without the high availability and disaster recovery provisions.

Contribution
• Designed and developed data ingestion pipelines using ADF and a processing layer using Databricks and notebooks with PySpark.
• Led the planning, design, development, testing, implementation, documentation, and support of data pipelines.
• Collaborated with ETL teams, both client and IBM.
• Analyzed on-premises Informatica-based ETL solutions and designed ETL solutions using Azure Data Factory pipelines, Azure Databricks, PySpark, and Spark SQL.

Customer Profitability Insights (CPI)

The Business Banking Customer Profitability (BBCP) project aims to develop a new profitability analysis platform for business banking and expand its usage from the over $5 million credit segment to all client credit segments.

Contribution
• Developed solutions in a highly demanding environment and provided hands-on guidance to other team members.
• Headed complex ETL requirements and design.
• Implemented Informatica-based ETL solution fulfilling stringent performance requirements.
• Collaborated with product development teams and senior designers to develop architectural requirements to ensure client satisfaction with the product.
• Assessed requirements for completeness and accuracy.
• Determined if requirements were actionable for the ETL team.
• Conducted impact assessment and determined the size of effort based on requirements.
• Developed complete SDLC project plans to implement ETL solutions and identify resource requirements.
• Performed an active, leading role in shaping and enhancing overall ETL Informatica architecture.

Achmea Solvency II

This project aims to establish a revised set of EU-wide capital requirements and risk management standards that will replace the current solvency requirements. It consists of four releases.

Solvency II requires all material risks of an insurer to be made more transparent so that the insurer can calculate how much capital must be held as coverage for unforeseen circumstances. Driven by these requirements and legislation, Achmea started the Value Management program.

A key result of the program is an automated reporting facility built on an integrated actuarial data warehouse.
• Release-1: Life 400 insurance
• Release-2: Non-life insurance
• Release-3: ALI/AMIS
• Release-4: VITALIS

Contribution
• Led practical knowledge transfer sessions with modelers.
• Led technical design meetings for designing individual layers.
• Analyzed functional design documents and prepared analysis sheets for individual layers.
• Worked extensively on the set of technical design documents and amended them as suitable for the current release.

Data Analyst – Azure Data Factory Expertise

I was a senior data engineer with analyst skills, working on ETL architecture solutions, requirements assessments, and the design of suitable data flows or batches. I also handled solution optimization and built end-to-end data pipelines with data integrity.

Languages

SQL, Python, T-SQL (Transact-SQL), Python 3, Snowflake, XML, C, C++, Pascal, R, Scala

Frameworks

Apache Spark, Spark

Libraries/APIs

PySpark, REST APIs, Pandas

Tools

Informatica ETL, Autosys, AWS Glue, Tableau, Spark SQL, Amazon Athena, Postman, Amazon Elastic MapReduce (EMR), Apache Airflow, AWS IAM, Amazon CloudWatch, Amazon QuickSight, GitHub, Excel 2016, Microsoft Excel, Jira, Control-M, Google Analytics, Power Query, Microsoft Power BI, Amazon SageMaker

Paradigms

ETL, Dimensional Modeling, Business Intelligence (BI), OLAP, ETL Implementation & Design, Kanban, Database Design, DevOps, Data Science

Platforms

Oracle, Azure, Databricks, Amazon Web Services (AWS), Azure Synapse, Azure SQL Data Warehouse, Amazon EC2, Apache Kafka, Linux, Dedicated SQL Pool (formerly SQL DW), AWS Lambda, Google Cloud Platform (GCP), Microsoft Power Automate

Storage

Netezza, IBM Db2, Database Management Systems (DBMS), Data Pipelines, Relational Databases, Databases, PostgreSQL, SQL Stored Procedures, Data Integration, Database Architecture, Oracle PL/SQL, NoSQL, Database Transactions, MySQL, Database Lifecycle Management (DLM), Azure SQL Databases, Azure SQL, JSON, Data Lakes, PL/SQL, Microsoft SQL Server, Redshift, Amazon Aurora, AWS Data Pipeline Service, PostgreSQL 10, Amazon DynamoDB, Database Administration (DBA), Database Migration, Database Structure, OLTP, Apache Hive, SQL Server Integration Services (SSIS), Amazon S3 (AWS S3), Datadog, Elasticsearch

Other

Azure Databricks, Unix Shell Scripting, Informatica, Data Engineering, Pipelines, Data Management, Data Modeling, Data Architecture, Migration, Query Composition, Data Warehouse Design, Data Warehousing, CSV File Processing, CSV, ETL Tools, Scripting Languages, Orchestration, Technical Project Management, ETL Development, Data Transformation, Normalization, Shell Scripting, Architecture, Data Analysis, Data Analytics, Database Optimization, PL/SQL Tuning, Data Build Tool (dbt), DAX, Transactions, Data, ETL Pipelines, Cloud, Data Feeds, Data Extraction, Leadership, Delta Lake, Azure Data Factory, Azure Data Lake, Informatica Cloud, Query Optimization, Performance Tuning, Analytics, XLSX File Processing, Data Migration, APIs, Data Wrangling, Big Data, Amazon RDS, Excel Macros, Lambda Functions, Big Data Architecture, AWS Cloud Architecture, CI/CD Pipelines, Webhooks, BI Reporting, CDC, Data-driven Dashboards, Business Services, Identity & Access Management (IAM), Entity Relationships, Message Queues, SSH, EMR Studio, Azure Data Explorer, Consulting, Machine Learning, Google Analytics 4, Data Visualization, Real Estate, Geospatial Data, AWS Certified Cloud Practitioner, Microsoft Azure, eCommerce, ETL Testing

Education

1998 - 2002

Bachelor's Degree in Technology and Electrical Engineering

Jawaharlal Nehru Technological University - Hyderabad, India

Certifications

JUNE 2023 - JUNE 2026

AWS Certified Cloud Practitioner

AWS

DECEMBER 2021 - DECEMBER 2022

Azure Data Engineer

Microsoft

AUGUST 2021 - PRESENT

Microsoft Azure Fundamentals

Azure
