Goutham Kumar
Verified Expert in Engineering
Data Engineer and Developer
Ajax, ON, Canada
Toptal member since July 31, 2024
With 10+ years of data engineering experience, Goutham has crafted scalable solutions with AWS and Azure. He has developed Azure-based ETL pipelines for Bell Canada with Power BI and implemented an ETL pipeline with Azure Data Factory, Databricks, Snowflake, and Power BI for Walmart. Goutham also used Azure ML for predictive models for the National Bank of Canada, optimized data with Azure SQL and .NET, and created AWS data lake architectures with RDS, S3, Glue, and Tableau at HSBC.
Portfolio
Experience
- Microsoft Power BI - 10 years
- SQL - 10 years
- Python - 10 years
- Azure Databricks - 10 years
- Azure Data Factory - 10 years
- Data Engineering - 10 years
- Data Analytics - 10 years
- Tableau Desktop - 10 years
Preferred Environment
SQL, Azure Data Factory, Azure Databricks, Python, Snowflake, Microsoft Power BI, Azure SQL, Tableau, Data Warehousing, Data Analytics
The most amazing...
...thing I've implemented is use cases for data-driven product improvement and quality using Python, Spark, SQL, Tableau, Power BI, Snowflake, and cloud tech.
Work Experience
Senior Azure Data Engineer
National Bank of Canada
- Collaborated with product owners to define and translate business requirements into technical specifications. Developed solutions using ADF and Databricks, including migrating ETL pipelines to Databricks and building dashboards in Power BI.
- Used Git for version control and Jira to track issues, resolving bugs related to data and the ETL pipeline on ADF. Gained a basic understanding of large language models and OpenAI.
- Built robust data pipelines, optimizing Spark applications and implementing distributed computing systems in the banking sector.
- Used Power Query, M language, and DAX to create insight-driven dashboards in Power BI.
- Migrated existing data pipelines in SSIS/Alteryx to ADF pipelines using Databricks. Also built new ones from scratch.
Data Engineering Consultant
HSBC Bank Canada
- Saved $10,000 by reconfiguring Azure Blob Storage from hot to cold tier. Diagnosed and fixed failures in ETL pipelines, ensuring accurate data ingestion. Built a data model with Power Query, DAX, and M language, and applied row-level security (RLS) in Power BI dashboards.
- Optimized SQL queries and Spark jobs to reduce processing times, resolving delays in report generation and data processing workflows.
- Reviewed and refactored code regularly to improve readability and efficiency. Utilized autoscaling in Azure Databricks to dynamically adjust worker nodes based on workload demands, optimizing cost and performance.
- Built variables and new measures/columns using time intelligence and conditional DAX to meet business requirements.
Data Platform Engineer
Walmart
- Reduced AWS costs by identifying and eliminating underutilized resources, such as Redshift and Kinesis.
- Resolved issues with AWS Lambda functions and AWS Glue jobs that failed to ingest data from sources such as S3, Kinesis, or external APIs.
- Developed and replaced data sources in a staging environment, then promoted them through user acceptance testing to production as part of the release process.
- Fixed errors in Amazon Athena SQL queries caused by incorrect syntax or data schema mismatch. Improved query performance by optimizing partitioning and using appropriate data formats.
- Created interactive Tableau dashboards for strategic decision-making, backed by data pipelines built on AWS Glue and Snowflake.
- Collaborated with product owners to define and translate business requirements into technical specifications. Published Tableau data sources from Snowflake tables and used them to build interactive dashboards with parameter and filter actions and LOD expressions.
- Used Jira to track issues and bugs, resolving bugs related to data and the user story as part of Agile scrums.
- Added interactivity between the summary and detailed views, which drove high usage within a short time. The dynamic insights helped stakeholders make informed decisions.
- Analyzed existing .unx/.unv universes containing data models and relationships, along with BO reports and Xcelsius dashboards. Created corresponding Tableau dashboards using custom SQL against Oracle and KPIs built with functions such as DATETRUNC, DATEPART, USERNAME, and CONTAINS.
- Used Tableau performance recording for dashboard optimization and custom SQL optimization for smoother data extraction.
Data Engineer
Bell Canada
- Handled big data platform administration and engineering on multiple Hadoop, Kafka, HBase, and Spark clusters. Containerized nodes using Docker and managed deployment through Kubernetes.
- Resolved a major challenge in data ingestion into HDFS, Azure Storage, and Azure Data Lake. Created monitoring and alerting systems and worked with data source teams to improve data quality and availability, fixing inconsistent data in Hive tables.
- Created Hive queries and functions for evaluating, filtering, loading, and storing data. Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, loading final data into HDFS.
- Used Power BI dataflows and datasets to consolidate data sources. Met most business requirements in dataflows and Power Query, and implemented the rest directly in M language.
Experience
Optimizing Transaction Processing with Streamlined ETL Pipelines
Autoscaling for Banking Data Processing Workloads
Data Processing Pipeline for Retail Inventory Management
I used my understanding of legacy SAP BO dashboards and the underlying .unx/.unv files to analyze relationships among fact and dimension tables. I created a Tableau dashboard from scratch using a custom SQL query as the data source, joining tables to bring in the relevant attributes. I used table calculations and string/logical functions alongside date/time functions to create calculated fields, sets for top N/bottom N functionality, and parameters for sheet swapping.
I held discussions with stakeholders to understand their exact requirements and converted the BRD into a technical document. The work included calculated fields, waterfall charts for yearly growth in connections and network increment, dynamic insights based on filter selection, a donut chart for regional percentage contributions, table calculations, sheet swapping, URL/filter actions linking the summary to the detailed view, and vertical/horizontal containers for dashboard structuring.
Big Data Platform Deployment for Telecommunications
Education
Bachelor's Degree in Computer Science
Rayalaseema University - India
Certifications
AWS Certified Developer Associate
Udemy
Microsoft Certified: Power BI Data Analyst Associate
Microsoft
Microsoft Certified: Azure Data Engineer Associate
Microsoft
Skills
Libraries/APIs
PySpark
Tools
Microsoft Power BI, Apache Airflow, BigQuery, Tableau Desktop, Power Query, Tableau, SSAS, Synapse, Talend ETL, Jira, Jenkins, Table Calculations, AWS Glue, Tableau Desktop Pro, Power BI Desktop
Languages
SQL, Python, Scala, Snowflake
Paradigms
ETL, Database Design, Agile
Platforms
Azure Synapse, Azure, Azure Synapse Analytics, Google Cloud Platform (GCP), Amazon Web Services (AWS), Databricks, Azure Event Hubs, AWS Lambda, Apache Kafka, Docker, Microsoft Fabric, Oracle, Amazon EC2
Storage
Azure SQL, Azure Storage, NoSQL, Azure Cosmos DB, PostgreSQL, Data Pipelines, Microsoft SQL Server, Database Architecture, MySQL, Amazon S3 (AWS S3), HBase, MongoDB, HDFS, Apache Hive
Frameworks
Spark, Hadoop, ADF
Other
Azure Data Factory, Azure Databricks, Azure Data Lake, Big Data, CI/CD Pipelines, Microsoft Azure, Data Engineering, Data Warehouse Design, Data Modeling, Data Visualization, Fivetran, Data Build Tool (dbt), Data Analytics, DAX, BI Reporting, Data Warehousing, Query Optimization, Data Analysis, Data Migration, Data Governance, Data Management, MDM, SSIS Custom Components, User Experience (UX), User Interface (UI), SAP, Data Architecture, ETL Development, Azure Blob Storage, Apache Spark Clusters, AWS Certified Developer, SAP BusinessObjects (BO), Tableau Server, M Language, Large Language Models (LLMs), OpenAI, Data Science