Satyanarayana Annepogu, Developer in Toronto, ON, Canada

Satyanarayana Annepogu

Verified Expert in Engineering

Bio

Satya is a senior data engineer with over 15 years of IT experience designing and developing data warehouses for banking and insurance clients. He specializes in designing and building modern data pipelines and streams using the AWS and Azure data engineering stacks. Satya is an expert in modernizing enterprise data solutions with AWS and Azure cloud data technologies.

Portfolio

Millicom
Data Engineering, Amazon Web Services (AWS), Big Data, Spark, SQL, Python...
Heimstaden
Azure Data Factory (ADF), Data Engineering, Data Pipelines, SQL...
IBM
PySpark, Azure, Azure Data Factory (ADF), Microsoft Power BI, Spark, Spark SQL...

Experience

  • Databricks - 5 years
  • Spark - 5 years
  • Azure - 4 years
  • PySpark - 4 years
  • Apache Airflow - 3 years
  • Redshift - 3 years
  • Amazon Web Services (AWS) - 3 years
  • Snowflake - 2 years

Availability

Part-time

Preferred Environment

Data Engineering, Amazon Web Services (AWS), Azure, Databricks, Python, PySpark, Hadoop, Snowflake, Data Warehousing, ETL Tools, Relational Databases, Data Extraction, Data Architecture, Data Lakehouse, API Integration, Business Intelligence (BI), REST APIs, Dimensional Modeling, Query Optimization

The most amazing...

...project I've done is designing, developing, and supporting cloud-based and traditional data warehouse applications.

Work Experience

Data Engineer

2023 - 2024
Millicom
  • Led the implementation of AWS Glue for automated ETL processes, reducing data processing time and improving data accuracy for telecom network performance data, customer interactions, and billing information (a simplified Glue job sketch follows this role's technology list).
  • Utilized AWS Lambda functions to develop serverless data pipelines, facilitating seamless integration between CRM systems, network infrastructure, IoT devices, and external sources within the telecom ecosystem.
  • Architected solutions using Amazon S3 (AWS S3) to optimize data storage and retrieval, implementing cost-effective and scalable data lakes to accommodate large volumes of network performance data, customer interactions, and operational metrics.
  • Orchestrated complex workflows using AWS Step Functions, ensuring efficient coordination and execution of multi-step data processing tasks for proactive network health monitoring and dynamic service provisioning.
  • Leveraged Amazon Redshift as a data warehousing solution, enabling high-performance analytical queries to derive actionable insights into network performance, customer behavior, and service usage patterns.
  • Integrated AWS Data Pipeline to automate data movement and transformation, streamlining operational processes and enhancing data availability for real-time decision-making and strategic planning.
  • Implemented robust security measures using AWS Identity and Access Management (IAM) and Amazon VPC, ensuring data privacy and regulatory compliance for sensitive network performance data, customer interactions, and billing information.
Technologies: Data Engineering, Amazon Web Services (AWS), Big Data, Spark, SQL, Python, Scala, AWS Lambda, AWS Glue, Amazon S3 (AWS S3), Data Transformation, Big Data Architecture, Amazon RDS, Message Queues, Redshift, Amazon Athena, Amazon Elastic MapReduce (EMR), ETL Tools, Spark SQL, APIs, Azure Databricks, Data Pipelines, Apache Kafka, Data Integration, ETL, Data Processing, Data, Data Analysis, Data Analytics, Data Visualization, Large-scale Projects, Teamwork, Data Modeling, ELT, Apache Airflow, Terraform, Webhooks, Apache Spark, CI/CD Pipelines, DevOps, MySQL, GitHub Actions, Data Scraping, Web Scraping, Data Extraction, Data Architecture, Data Lakehouse, API Integration, Business Intelligence (BI), REST APIs, Dimensional Modeling, Star Schema, Query Optimization
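
Below is a minimal, illustrative sketch of an AWS Glue PySpark job of the kind described above. The catalog database and table (telecom_raw, network_performance), the column names, and the S3 bucket are hypothetical placeholders, not the actual Millicom resources.

```python
# Minimal AWS Glue job sketch (PySpark). All names below are illustrative placeholders.
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw network-performance records registered in the Glue Data Catalog
raw = glue_context.create_dynamic_frame.from_catalog(
    database="telecom_raw",            # hypothetical catalog database
    table_name="network_performance",  # hypothetical catalog table
)

# Basic cleansing: drop malformed rows and keep only the fields the warehouse needs
cleaned = (raw.toDF()
              .dropna(subset=["cell_id", "event_ts"])
              .select("cell_id", "event_ts", "latency_ms", "dropped_calls"))

# Write the curated output to the data lake as partitioned Parquet
(cleaned.write
        .mode("overwrite")
        .partitionBy("cell_id")
        .parquet("s3://example-curated-bucket/network_performance/"))

job.commit()
```

In practice, a job like this would be scheduled by a Glue trigger or coordinated as one step in an AWS Step Functions workflow, as described in the bullets above.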

Data Analyst

2022 - 2023
Heimstaden
  • Designed and developed data ingestion pipelines using ADF and a processing layer using Databricks notebooks with PySpark.
  • Oversaw the planning, design, development, testing, implementation, documentation, and support of data pipelines.
  • Paused and resumed Azure SQL Data Warehouse using ADF. Developed multiple ADF pipelines with business rules applied as reusable assets.
  • Utilized Azure Key Vault to store connection string details and certificates and incorporated these vaults in Azure Data Factory to create linked services. Orchestrated and automated the pipelines.
  • Implemented slowly changing dimensions type 1 and 2 (SCD1 and SCD2); a simplified SCD2 pattern is sketched after this role's technology list. Processed daily, weekly, and monthly batches. Created POCs with Apache Spark using PySpark and Spark SQL to address diverse, complex data transformation needs.
Technologies: Azure Data Factory (ADF), Data Engineering, Data Pipelines, SQL, Microsoft Dynamics 365, ETL Tools, Spark, Spark SQL, APIs, Azure Databricks, Big Data, Apache Kafka, AWS Lambda, AWS Glue, Amazon S3 (AWS S3), Data Transformation, Big Data Architecture, Amazon RDS, Message Queues, Redshift, Amazon Athena, Data Integration, ETL, Data Processing, Data, Data Analysis, Data Analytics, Data Visualization, Large-scale Projects, Microsoft SQL Server, Teamwork, Data Modeling, Apache Airflow, Data Build Tool (dbt), Azure Blob Storage API, Azure Data Lake, Azure Synapse, Webhooks, Apache Spark, CI/CD Pipelines, DevOps, SQL Server Integration Services (SSIS), MySQL, GitHub Actions, Data Extraction, Data Architecture, Data Lakehouse, API Integration, Business Intelligence (BI), REST APIs, Dimensional Modeling, Star Schema, Query Optimization
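
As a concrete illustration of the SCD2 handling mentioned above, the following is a simplified two-step pattern in Spark SQL against a Delta table on Databricks. The dim_tenant and stg_tenant tables and their columns are assumed placeholders, not Heimstaden's actual schema.

```python
# Illustrative SCD2 pattern on Databricks (Delta Lake + Spark SQL).
# Table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Step 1: expire current dimension rows whose tracked attributes changed in the staged batch
spark.sql("""
    MERGE INTO dim_tenant AS d
    USING stg_tenant AS s
      ON d.tenant_id = s.tenant_id AND d.is_current = true
    WHEN MATCHED AND d.address <> s.address THEN
      UPDATE SET is_current = false, valid_to = current_date()
""")

# Step 2: insert a new current version for changed or brand-new tenants
spark.sql("""
    INSERT INTO dim_tenant (tenant_id, address, valid_from, valid_to, is_current)
    SELECT s.tenant_id, s.address, current_date(), NULL, true
    FROM stg_tenant s
    LEFT JOIN dim_tenant d
      ON d.tenant_id = s.tenant_id AND d.is_current = true
    WHERE d.tenant_id IS NULL OR d.address <> s.address
""")
```

The two-pass approach keeps each statement simple; SCD1 attributes can be handled by updating them in place in the MERGE instead of expiring the row.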

Azure Data Engineer and Tech Lead

2021 - 2022
IBM
  • Built data ingestion pipelines using ADF and a processing layer using Databricks notebooks with PySpark (a simplified notebook cell follows this role's technology list). Planned, designed, developed, tested, implemented, documented, and supported data pipelines.
  • Paused and resumed Azure SQL Data Warehouse using ADF. Developed multiple ADF pipelines with business rules applied as reusable assets. Ingested CSV, fixed-width, and Excel files.
  • Automated pipeline failure email notifications through web activity. Utilized Azure Key Vault to store connection string details and certificates for Azure Data Factory linked services.
  • Orchestrated and automated pipelines and developed POCs with Apache Spark using PySpark and Spark SQL for complex data transformations. Employed PowerShell scripts to automate the pipelines.
  • Collaborated with the client's and IBM's ETL teams, analyzed on-premises Informatica ETL solutions, and formulated an ETL solution leveraging Azure Data Factory pipelines, Azure Databricks, PySpark, and Spark SQL.
  • Optimized performance of pipelines in Azure Data Factory and Azure Databricks.
Technologies: PySpark, Azure, Azure Data Factory (ADF), Microsoft Power BI, Spark, Spark SQL, Azure Databricks, Python, APIs, ETL Tools, Data Engineering, Azure Synapse Analytics, Data Pipelines, SQL, Big Data, Apache Kafka, AWS Lambda, AWS Glue, Amazon S3 (AWS S3), Data Transformation, Big Data Architecture, Redshift, Amazon Athena, Data Integration, Financial Services, Technical Leadership, ETL, Data Processing, Data, Data Analysis, Data Analytics, Data Visualization, Large-scale Projects, Microsoft SQL Server, Teamwork, Database Architecture, Data Modeling, ELT, Terraform, Data Build Tool (dbt), Excel VBA, Azure Blob Storage API, Azure Data Lake, Azure Synapse, Apache Spark, CI/CD Pipelines, DevOps, PostgreSQL, T-SQL (Transact-SQL), MySQL, Shell Scripting, GitHub Actions, Data Migration, Data Scraping, Web Scraping, Data Extraction, Data Architecture, Data Lakehouse, API Integration, Business Intelligence (BI), REST APIs, Dimensional Modeling, Star Schema, Query Optimization, DAX
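
To make the Databricks processing layer concrete, here is a simplified notebook-style cell showing the kind of read-transform-write step described above. The secret scope, storage account, paths, columns, and target table are assumptions for illustration; dbutils is available implicitly inside a Databricks notebook.

```python
# Simplified Databricks notebook cell (PySpark). Names and paths are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Credentials come from a Key Vault-backed Databricks secret scope (assumed scope/key names)
storage_key = dbutils.secrets.get(scope="kv-backed-scope", key="adls-account-key")
spark.conf.set("fs.azure.account.key.exampleaccount.dfs.core.windows.net", storage_key)

# Read the CSV files landed by the ADF ingestion pipeline
raw = (spark.read
            .option("header", True)
            .option("inferSchema", True)
            .csv("abfss://landing@exampleaccount.dfs.core.windows.net/claims/*.csv"))

# Apply a simple business rule and stamp the load date
curated = (raw
           .filter(F.col("amount") > 0)
           .withColumn("load_date", F.current_date()))

# Persist to the curated zone as a Delta table for downstream Power BI models
curated.write.format("delta").mode("append").saveAsTable("curated.claims")
```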

Team Lead and Senior ETL Consultant

2014 - 2018
IBM India
  • Developed solutions in a high-pressure environment and provided hands-on guidance to team members. Headed the design for complex ETL requirements and delivered an Informatica-based solution that met demanding performance standards.
  • Collaborated with development teams and senior designers to establish architectural requirements, ensuring client satisfaction with the product.
  • Assessed requirements for completeness and accuracy, determined their actionability for the ETL team, and conducted impact assessments to estimate the effort size.
  • Developed full software development lifecycle (SDLC) project plans to implement the ETL solution and identify resource requirements. Led the process of shaping and enhancing the overall ETL Informatica architecture.
  • Identified, recommended, and implemented ETL process and architecture improvements. Assisted and verified the design of solutions and production of all design phase deliverables.
  • Managed build phase and QA for code compliance with ETL standards, resolved complex design and development issues, and aligned the team with project goals.
  • Guided team discussions to practical conclusions, fostered positive group dynamics, and ensured adherence to specifications and standards.
  • Ensured the team was familiar with customer needs, specifications, design targets, development processes, design standards, techniques, and tools to support task execution.
Technologies: Informatica, PL/SQL, PL/SQL Tuning, Netezza, Unix Shell Scripting, ETL Tools, Data Engineering, Azure Synapse Analytics, Microsoft Power BI, Spark SQL, APIs, Azure Databricks, SQL, Big Data, Apache Kafka, AWS Lambda, Data Transformation, Data Integration, Financial Services, Technical Leadership, ETL, Data Processing, Data, Data Analysis, Data Analytics, Data Visualization, Microsoft SQL Server, Teamwork, Database Architecture, Data Modeling, ELT, Apache Airflow, Terraform, Excel VBA, Azure Blob Storage API, Azure Data Lake, Azure Synapse, Informatica ETL, Informatica PowerCenter, Apache Spark, CI/CD Pipelines, T-SQL (Transact-SQL), Shell Scripting, GitHub Actions, Data Migration, Healthcare, Web Scraping, Data Extraction, Data Architecture, Data Lakehouse, API Integration, Business Intelligence (BI), REST APIs, Dimensional Modeling, Star Schema, Query Optimization

Senior Informatica Designer

2009 - 2014
IBM Netherlands | IBM India
  • Conducted functional knowledge transfer sessions with modelers. Led technical design meetings focusing on developing layer-specific strategies. Analyzed functional design documents and prepared analysis sheets for individual layers.
  • Created and revised technical design documents extensively to ensure alignment with the current release requirements.
  • Achieved 100% transition sign-off for all four releases, facilitated post-transition ramp-up, delivered projects in a steady state, and drove process enhancements. Cross-trained resources across all four iterations.
  • Identified team training needs, closed knowledge gaps through organized sessions, and earned accolades from clients and IBM, including monetary and certification awards.
Technologies: Informatica, IBM Db2, Oracle, Unix Shell Scripting, Autosys, Cognos TM1, ETL Tools, Unix, SQL, Data Transformation, Data Integration, Financial Services, ETL, Data Processing, Data, Data Analysis, Data Analytics, Teamwork, Database Architecture, ELT, Excel VBA, Informatica ETL, Informatica PowerCenter, CI/CD Pipelines, Shell Scripting, Healthcare, Data Architecture, Business Intelligence (BI), REST APIs, Query Optimization

Senior ETL Developer

2008 - 2009
Genisys Group
  • Developed type 2 dimension mappings for updates and inserts and crafted various Actuate reports, including drill-up, drill-down, series, and parallel reports.
  • Designed Actuate-formatted reports for varied processes and developed dashboards tracking report statuses over multiple time frames.
  • Analyzed and developed reporting on the volume of generated, failed, scheduled, and pending reports.
Technologies: Informatica, Oracle, Unix Shell Scripting, ETL Tools, Unix, SQL, Data Transformation, Data Integration, ETL, Data Processing, Data, ELT, Informatica ETL, Informatica PowerCenter, Shell Scripting

Senior ETL Developer

2007 - 2008
Magna Infotech Pvt
  • Developed type 2 dimension mapping to update existing rows and insert new entries in target databases. Utilized Actuate to format a variety of process-related reports.
  • Created Actuate reports, including drill-up, drill-down, series, and parallel reports, for enhanced data analysis. Analyzed metrics on report generation, failures, scheduling, and queueing to improve reporting systems.
  • Developed dashboards tracking reports generated, failed, pending, and scheduled across varying time frames.
  • Gained hands-on experience in dimensional modeling and ETL design.
Technologies: Informatica, Unix Shell Scripting, Oracle, ETL Tools, Unix, SQL, Data Transformation, Data Integration, ETL, Data Processing, Data, Retail & Wholesale, ELT, Informatica ETL, Informatica PowerCenter, Shell Scripting

ETL Lead Developer

2005 - 2007
TechnoSpine Solutions
  • Created type 2 dimension mapping for data updates and new entries and formatted multi-process reports using Actuate.
  • Developed Actuate reports, including drill-up, drill-down, series, and parallel reports; analyzed report metrics; and tailored development accordingly.
  • Designed dashboards to monitor and display report generation, failures, and scheduling on a time-based spectrum.
  • Acquired practical experience in dimensional modeling and ETL design and achieved relevant certifications and accomplishments.
Technologies: Informatica, Oracle, Autosys, ETL Tools, Unix, SQL, Data Transformation, Data Integration, ETL, Data Processing, Data, Retail & Wholesale, ELT, Informatica ETL, Informatica PowerCenter

Experience

ETL Optimization and Real-time Analytics Implementation

I acted as an Azure data engineer and optimized ETL pipelines, reducing processing time and costs. I implemented real-time analytics in Azure Databricks to surface actionable insights and integrated the solution seamlessly with Azure data services. I also established robust data governance and compliance measures and improved the performance of data processing workflows.
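
A minimal sketch of what such a real-time analytics layer could look like in Azure Databricks, using Structured Streaming with Auto Loader. The landing path, schema and checkpoint locations, column names, and target table are illustrative assumptions, not the project's actual configuration.

```python
# Illustrative Structured Streaming job on Azure Databricks. All names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Incrementally ingest newly landed JSON events with Databricks Auto Loader
events = (spark.readStream
               .format("cloudFiles")
               .option("cloudFiles.format", "json")
               .option("cloudFiles.schemaLocation", "/tmp/schemas/events")  # assumed path
               .load("abfss://landing@exampleaccount.dfs.core.windows.net/events/")
               .withColumn("event_time", F.col("event_time").cast("timestamp")))

# Maintain near-real-time usage counts per customer over 5-minute windows
usage = (events
         .withWatermark("event_time", "10 minutes")
         .groupBy(F.window("event_time", "5 minutes"), "customer_id")
         .agg(F.count("*").alias("event_count")))

# Continuously append the aggregates to a Delta table consumed by dashboards
(usage.writeStream
      .format("delta")
      .outputMode("append")
      .option("checkpointLocation", "/tmp/checkpoints/customer_usage")
      .toTable("analytics.customer_usage_5min"))
```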

Data Pipeline Architecture and Process Improvement

As a data engineer, I created and maintained an optimal data pipeline architecture. I handled level 3 issue resolution, root cause analysis, and production fixes to keep the environment available, working rotational shifts to address data platform problems. I also assembled complex datasets that met functional and non-functional requirements; led and coordinated third-party and internal developers, database architects, data analysts, and data scientists on data initiatives; and identified, designed, and implemented internal process improvements, including automating manual processes, optimizing data delivery, and redesigning infrastructure for scalability.

I built infrastructure for optimal data extraction, transformation, and loading from diverse sources using AWS and similar technologies, developed analytics tools that leveraged the data pipeline to deliver actionable insights, and collaborated with stakeholders to resolve data-related technical issues and support their infrastructure needs. Finally, I ensured data security and compliance across multiple regions, created data tools that helped the analytics and data science teams innovate on product functionality, and worked with data experts to enhance the functionality of data systems.
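
For illustration only, here is a minimal Apache Airflow DAG showing the extract-transform-load orchestration pattern described above; the DAG ID, schedule, and task bodies are placeholders rather than the project's actual pipeline.

```python
# Minimal Airflow 2.x DAG sketch. DAG ID, schedule, and task logic are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    """Pull the daily extract from the source system (placeholder)."""


def transform():
    """Apply business rules and conform the data (placeholder)."""


def load():
    """Load the curated output into the warehouse (placeholder)."""


with DAG(
    dag_id="daily_customer_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency: extract, then transform, then load
    t_extract >> t_transform >> t_load
```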

Client Rate Desk Tool

A web-based tool providing authoritative cash management pricing arrangements and contact information for the mid-market and large corporate client segments. The application is used by business contact centers, relationship managers, and cash management sales personnel. I served as an Azure data engineer and tech lead on this project.

Education

1998 - 2002

Bachelor of Technology Degree in Electrical Engineering

Jawaharlal Nehru Technological University - Hyderabad, India

Certifications

SEPTEMBER 2023 - PRESENT

AWS Certified Cloud Practitioner

AWS

DECEMBER 2021 - PRESENT

Microsoft Certified: Azure Data Engineer

Microsoft

Skills

Libraries/APIs

PySpark, Azure Blob Storage API, REST APIs

Tools

Autosys, Microsoft Power BI, Spark SQL, AWS Glue, Amazon Athena, Apache Airflow, Informatica ETL, Informatica PowerCenter, Amazon Elastic MapReduce (EMR), Terraform

Languages

Python, Snowflake, SQL, Excel VBA, T-SQL (Transact-SQL), Scala

Frameworks

Spark, Apache Spark, Data Lakehouse, Hadoop

Paradigms

ETL, Business Intelligence (BI), Dimensional Modeling, DevOps

Platforms

Amazon Web Services (AWS), Azure, Databricks, Oracle, Unix, Azure Synapse Analytics, AWS Lambda, Azure Synapse, Microsoft Dynamics 365, Apache Kafka

Storage

SQL Stored Procedures, Data Pipelines, Amazon S3 (AWS S3), Redshift, Data Integration, Microsoft SQL Server, Database Architecture, Relational Databases, Azure Cosmos DB, PostgreSQL, SQL Server Integration Services (SSIS), MySQL, IBM Db2, PL/SQL, Netezza

Industry Expertise

Retail & Wholesale, Healthcare

Other

Data Engineering, Data Warehousing, ETL Tools, Informatica, Azure Data Factory (ADF), Azure Databricks, APIs, Big Data, Data Transformation, Big Data Architecture, Amazon RDS, Message Queues, Financial Services, Technical Leadership, Data Processing, Data, Data Analysis, Data Analytics, Data Visualization, Large-scale Projects, Teamwork, Data Modeling, ELT, Data Extraction, Data Architecture, API Integration, Star Schema, Query Optimization, Data Build Tool (dbt), Azure Data Lake, Webhooks, CI/CD Pipelines, Shell Scripting, GitHub Actions, Data Migration, Data Scraping, Web Scraping, DAX, Unix Shell Scripting, Cognos TM1, PL/SQL Tuning, Azure Data Lake Analytics
