Satyanarayana Annepogu
Verified Expert in Engineering
Database Developer
Toronto, ON, Canada
Toptal member since October 25, 2022
Satya is a senior data engineer with over 15 years of IT experience designing and developing data warehouses for banking and insurance clients. He specializes in designing and building modern data pipelines and streams on the AWS and Azure data engineering stacks and in modernizing enterprise data solutions with AWS and Azure cloud data technologies.
Experience
- Databricks - 5 years
- Spark - 5 years
- Azure - 4 years
- PySpark - 4 years
- Apache Airflow - 3 years
- Redshift - 3 years
- Amazon Web Services (AWS) - 3 years
- Snowflake - 2 years
Preferred Environment
Data Engineering, Amazon Web Services (AWS), Azure, Databricks, Python, PySpark, Hadoop, Snowflake, Data Warehousing, ETL Tools, Relational Databases, Data Extraction, Data Architecture, Data Lakehouse, API Integration, Business Intelligence (BI), REST APIs, Dimensional Modeling, Query Optimization
The most amazing...
...project I've done is designing, developing, and supporting cloud-based and traditional data warehouse applications.
Work Experience
Data Engineer
Millicom
- Led the implementation of AWS Glue for automated ETL processes, reducing data processing time and improving data accuracy for telecom network performance data, customer interactions, and billing information (see the Glue job sketch after this list).
- Utilized AWS Lambda functions to develop serverless data pipelines, facilitating seamless integration between CRM systems, network infrastructure, IoT devices, and external sources within the telecom ecosystem.
- Architected solutions using Amazon S3 (AWS S3) to optimize data storage and retrieval, implementing cost-effective and scalable data lakes to accommodate large volumes of network performance data, customer interactions, and operational metrics.
- Orchestrated complex workflows using AWS Step Functions, ensuring efficient coordination and execution of multi-step data processing tasks for proactive network health monitoring and dynamic service provisioning.
- Leveraged Amazon Redshift as a data warehousing solution, enabling high-performance analytical queries to derive actionable insights into network performance, customer behavior, and service usage patterns.
- Integrated AWS Data Pipeline to automate data movement and transformation, streamlining operational processes and enhancing data availability for real-time decision-making and strategic planning.
- Implemented robust security measures using AWS Identity and Access Management (IAM) and Amazon VPC, ensuring data privacy and regulatory compliance for sensitive network performance data, customer interactions, and billing information.
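For illustration, a minimal sketch of a Glue job of the kind described above, written in PySpark. The catalog database, table, column names, and S3 path are hypothetical placeholders, not the actual project assets:

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve the job name and set up contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw call-detail records from a Glue Data Catalog table
# (database and table names are hypothetical).
cdrs = glue_context.create_dynamic_frame.from_catalog(
    database="telecom_raw", table_name="call_detail_records"
)

# Basic cleanup: drop malformed rows and keep only the columns
# downstream consumers need (column names are illustrative).
clean = (
    cdrs.toDF()
    .dropna(subset=["subscriber_id", "call_start"])
    .select("subscriber_id", "cell_id", "call_start", "duration_seconds")
)

# Write curated Parquet back to the data lake, partitioned for query pruning.
clean.write.mode("append").partitionBy("cell_id").parquet(
    "s3://example-telecom-lake/curated/call_detail_records/"
)

job.commit()
```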
Data Analyst
Heimstaden
- Designed and developed data ingestion pipelines using ADF and a processing layer using Databricks notebooks with PySpark.
- Oversaw the planning, design, development, testing, implementation, documentation, and support of data pipelines.
- Paused and resumed Azure SQL Data Warehouse using ADF. Developed multiple ADF pipelines with business rules applied as reusable assets.
- Utilized Azure Key Vault to store connection strings and certificates and referenced those secrets from Azure Data Factory linked services. Orchestrated and automated the pipelines.
- Implemented slowly changing dimensions type 1 and 2 (SCD1 and SCD2). Processed daily, weekly, and monthly batches. Created POCs with Apache Spark using PySpark and Spark SQL to address diverse, complex data transformation needs (see the SCD2 sketch after this list).
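A minimal sketch of the SCD2 pattern on Databricks Delta tables, assuming hypothetical dim_customer and stg_customer tables keyed by a string customer_id and tracking a single address attribute:

```python
from pyspark.sql import functions as F

# Hypothetical tables: dim_customer (customer_id, address, start_date,
# end_date, is_current) is a Delta dimension; stg_customer holds the
# day's staged records. customer_id is assumed to be a string key.
updates = spark.table("stg_customer")
current = spark.table("dim_customer").where("is_current = true")

# Changed customers need two actions: expire the current version and
# insert a new one. A NULL merge key forces the insert branch below.
changed = (
    updates.alias("u")
    .join(current.alias("c"), F.col("u.customer_id") == F.col("c.customer_id"))
    .where(F.col("u.address") != F.col("c.address"))
    .selectExpr("CAST(NULL AS STRING) AS merge_key", "u.*")
)
staged = updates.selectExpr("customer_id AS merge_key", "*").unionByName(changed)
staged.createOrReplaceTempView("staged_updates")

spark.sql("""
    MERGE INTO dim_customer t
    USING staged_updates s
    ON t.customer_id = s.merge_key AND t.is_current = true
    WHEN MATCHED AND t.address <> s.address THEN
      UPDATE SET is_current = false, end_date = current_date()
    WHEN NOT MATCHED THEN
      INSERT (customer_id, address, start_date, end_date, is_current)
      VALUES (s.customer_id, s.address, current_date(), NULL, true)
""")
```

Unchanged staged rows match the current version but fail the update condition, so they produce no action; new customers and the NULL-keyed rows fall through to the insert branch.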
Azure Data Engineer and Tech Lead
IBM
- Designed and built data ingestion pipelines using ADF and a processing layer using Databricks notebooks with PySpark. Planned, developed, tested, implemented, documented, and supported data pipelines.
- Paused and resumed Azure SQL Data Warehouse using ADF. Developed multiple ADF pipelines with business rules applied as reusable assets. Ingested CSV, fixed-width, and Excel files.
- Automated pipeline failure email notifications through web activity. Utilized Azure Key Vault to store connection string details and certificates for Azure Data Factory linked services.
- Orchestrated and automated pipelines and developed POCs with Apache Spark using PySpark and Spark SQL for complex data transformations. Employed PowerShell scripts to automate the pipelines (a Python equivalent is sketched after this list).
- Collaborated with the client's and IBM's ETL teams, analyzed on-premises Informatica ETL solutions, and formulated an ETL solution leveraging Azure Data Factory pipelines, Azure Databricks, PySpark, and Spark SQL.
- Optimized performance of pipelines in Azure Data Factory and Azure Databricks.
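The trigger-and-monitor automation described above can be sketched with the Azure Data Factory management SDK (shown in Python rather than PowerShell to keep one language across these examples). All subscription, resource group, factory, pipeline, and parameter names are placeholders:

```python
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Placeholder identifiers; substitute real values in practice.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "my-rg"
FACTORY = "my-adf"

adf = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Trigger a pipeline run, passing runtime parameters
# (pipeline and parameter names are hypothetical).
run = adf.pipelines.create_run(
    RESOURCE_GROUP, FACTORY, "ingest_daily_files",
    parameters={"load_date": "2024-01-01"},
)

# Poll the run's status until it reaches a terminal state.
while True:
    status = adf.pipeline_runs.get(RESOURCE_GROUP, FACTORY, run.run_id).status
    if status not in ("Queued", "InProgress"):
        break
    time.sleep(30)

print(f"Pipeline finished with status: {status}")
```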
Team Lead and Senior ETL Consultant
IBM India
- Developed solutions in a high-pressure environment and provided hands-on guidance to team members. Led the design for complex ETL requirements and delivered an Informatica-based solution that met demanding performance standards.
- Collaborated with development teams and senior designers to establish architectural requirements, ensuring client satisfaction with the product.
- Assessed requirements for completeness and accuracy, determined their actionability for the ETL team, and conducted impact assessments to estimate the effort size.
- Developed full software development lifecycle (SDLC) project plans to implement the ETL solution and identify resource requirements. Led the process of shaping and enhancing the overall Informatica ETL architecture.
- Identified, recommended, and implemented ETL process and architecture improvements. Assisted with and verified solution designs and the production of all design-phase deliverables.
- Managed build phase and QA for code compliance with ETL standards, resolved complex design and development issues, and aligned the team with project goals.
- Guided team discussions to practical conclusions, fostered positive group dynamics, and ensured adherence to specifications and standards.
- Ensured the team was familiar with customer needs, specifications, design targets, development processes, design standards, techniques, and tools to support task execution.
Senior Informatica Designer
IBM Netherlands | IBM India
- Conducted functional knowledge transfer sessions with modelers. Led technical design meetings focusing on developing layer-specific strategies. Analyzed functional design documents and prepared analysis sheets for individual layers.
- Created and revised technical design documents extensively to ensure alignment with the current release requirements.
- Achieved 100% transition sign-off for all four releases, facilitated post-transition ramp-up, delivered projects in a steady state, and drove process enhancements. Cross-trained resources across all four iterations.
- Identified team training needs, closed knowledge gaps through organized sessions, and earned accolades from clients and IBM, including monetary and certification awards.
Senior ETL Developer
Genisys Group
- Developed type 2 dimension mapping for updates and inserts and crafted various Actuate reports, including drill-up and drill-down, series, and parallel reports.
- Designed Actuate-formatted reports for varied processes and developed dashboards tracking report statuses over multiple time frames.
- Analyzed and developed reporting on the volume of generated, failed, scheduled, and pending reports.
Senior ETL Developer
Magna Infotech Pvt
- Developed type 2 dimension mapping to update existing rows and insert new entries in target databases. Utilized Actuate to format a variety of process-related reports.
- Created Actuate reports, including drill-up and drill-down, series, and parallel reports, for enhanced data analysis. Analyzed metrics on report generation, failures, scheduling, and queueing to improve reporting systems.
- Developed dashboards tracking reports generated, failed, pending, and scheduled across varying time frames.
- Gained hands-on experience in dimensional modeling and ETL design.
ETL Lead Developer
TechnoSpine Solutions
- Created type 2 dimension mapping for data updates and new entries and formatted multi-process reports using Actuate.
- Developed Actuate reports, including drill-up and drill-down, series, and parallel reports; analyzed report metrics and tailored development accordingly.
- Designed dashboards to monitor and display report generation, failures, and scheduling on a time-based spectrum.
- Acquired practical experience in dimensional modeling and ETL design and achieved relevant certifications and accomplishments.
Experience
ETL Optimization and Real-time Analytics Implementation
Data Pipeline Architecture and Process Improvement
I built infrastructure for optimal extraction, transformation, and loading of data from diverse sources using AWS technologies, developed analytics tools that leveraged the data pipeline to deliver actionable insights, and collaborated with stakeholders to resolve data-related technical issues and support their infrastructure needs. I also ensured data security and compliance across multiple regions, created data tools that helped analytics and data science teams innovate product functionality, and worked with data experts to enhance functionality in data systems (an orchestration sketch follows).
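To give a flavor of the orchestration involved, here is a minimal Apache Airflow DAG sketch. The DAG ID, schedule, and task bodies are illustrative placeholders, not the production pipeline:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder task bodies; in a real pipeline these would call the
# extraction, transformation, and load logic against AWS services.
def extract():
    print("pull source data, e.g., from S3 or an upstream API")

def transform():
    print("clean and reshape the extracted data")

def load():
    print("load curated data into the warehouse, e.g., Redshift")

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: extract -> transform -> load.
    t_extract >> t_transform >> t_load
```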
Education
Bachelor of Technology Degree in Electrical Engineering
Jawaharlal Nehru Technological University - Hyderabad, India
Certifications
AWS Certified Cloud Practitioner
AWS
Microsoft Certified: Azure Data Engineer
Microsoft
Skills
Libraries/APIs
PySpark, Azure Blob Storage API, REST APIs
Tools
Autosys, Microsoft Power BI, Spark SQL, AWS Glue, Amazon Athena, Apache Airflow, Informatica ETL, Informatica PowerCenter, Amazon Elastic MapReduce (EMR), Terraform
Languages
Python, Snowflake, SQL, Excel VBA, T-SQL (Transact-SQL), Scala
Frameworks
Apache Spark, Data Lakehouse, Hadoop
Paradigms
ETL, Business Intelligence (BI), Dimensional Modeling, DevOps
Platforms
Amazon Web Services (AWS), Azure, Databricks, Oracle, Unix, Azure Synapse Analytics, AWS Lambda, Microsoft Dynamics 365, Apache Kafka
Storage
SQL Stored Procedures, Data Pipelines, Amazon S3 (AWS S3), Redshift, Data Integration, Microsoft SQL Server, Database Architecture, Relational Databases, Azure Cosmos DB, PostgreSQL, SQL Server Integration Services (SSIS), MySQL, IBM Db2, PL/SQL, Netezza
Industry Expertise
Retail & Wholesale, Healthcare
Other
Data Engineering, Data Warehousing, ETL Tools, Informatica, Azure Data Factory (ADF), Azure Databricks, APIs, Big Data, Data Transformation, Big Data Architecture, Amazon RDS, Message Queues, Financial Services, Technical Leadership, Data Processing, Data, Data Analysis, Data Analytics, Data Visualization, Large-scale Projects, Teamwork, Data Modeling, ELT, Data Extraction, Data Architecture, API Integration, Star Schema, Query Optimization, Data Build Tool (dbt), Azure Data Lake, Webhooks, CI/CD Pipelines, Shell Scripting, GitHub Actions, Data Migration, Data Scraping, Web Scraping, DAX, Unix Shell Scripting, Cognos TM1, PL/SQL Tuning, Azure Data Lake Analytics