Vidyasagara Reddy Thodime
Developer in Swindon, United Kingdom

Verified Expert in Engineering

Bio

Vidya is a senior data engineer and leader with 16+ years of experience driving complex projects across Microsoft Azure, GCP, Palantir Foundry, and AWS. He is skilled in Azure Data Factory (ADF), Databricks, and Data Lake Storage, with data transformation using dbt and Snowflake and orchestration with Airflow. Vidya is proficient in distributed computing with Apache Spark, NiFi, and Hive, as well as in ETL tools like Informatica and Talend, and has solid expertise in Oracle and SQL Server.

Portfolio

Self-employed
Databricks, Palantir, Informatica ETL, Azure, Data Lakes, Data Build Tool (dbt)...
Capgemini
Azure Databricks, Apache Airflow, Azure Data Lake, Cloud Architecture...
Atos Syntel
Informatica ETL, Data Warehouse Design, IBM Db2, Oracle, SQL, ETL

Experience

  • Python - 16 years
  • SQL - 16 years
  • PySpark - 10 years
  • Azure Data Lake - 8 years
  • Azure - 8 years
  • Palantir - 6 years
  • Azure Databricks - 5 years
  • Data Build Tool (dbt) - 3 years

Availability

Full-time

Preferred Environment

PySpark, Azure, Databricks, Azure Data Lake, Azure Databricks, SQL, Informatica ETL, Python, Microsoft SQL Server, Palantir

The most amazing...

...thing I've achieved is streamlining operations and saving €5 million yearly by developing advanced analytics on Palantir Foundry and dynamic dashboards with Slate.

Work Experience

Senior Data Engineer

2023 - PRESENT
Self-employed
  • Built and maintained complex data pipelines in Palantir Foundry, leveraging Foundry’s ontology to create scalable, well-aligned data models and ensuring seamless integration of structured and unstructured data from diverse sources.
  • Contributed, as a senior data engineer on a dbt project, to developing multiple transformation models.
  • Developed Databricks notebooks and dbt models according to the ELT pipeline specifications to load data into the raw zone and then move it to the curated zone by applying various transformations.
  • Created a PySpark script to aggregate and summarize the data before loading it into Delta Lake.
  • Oversaw the end-to-end migration from the source systems through to the presentation layer.
  • Worked with Informatica and Azure Databricks to run PySpark notebooks through ADF pipelines.
  • Tuned PySpark jobs extensively to improve pipeline efficiency.
  • Used Databricks widgets to pass runtime parameters from ADF to Databricks; a minimal notebook sketch follows this entry.
Technologies: Databricks, Palantir, Informatica ETL, Azure, Data Lakes, Data Build Tool (dbt), PySpark, SQL
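
Below is a minimal notebook sketch of the pattern described in this entry, assuming hypothetical dataset paths, column names, and a `run_date` parameter supplied by ADF; `spark` and `dbutils` are the globals Databricks provides inside a notebook.

```python
# Databricks notebook sketch: read an ADF-supplied runtime parameter via a
# widget, aggregate raw-zone data with PySpark, and write the summary to a
# curated Delta table. All dataset paths and column names are hypothetical.
from pyspark.sql import functions as F

# ADF passes "run_date" as a notebook base parameter; the widget reads it at runtime.
dbutils.widgets.text("run_date", "")
run_date = dbutils.widgets.get("run_date")

raw = spark.read.format("delta").load("/mnt/datalake/raw/service_requests")

daily_summary = (
    raw.filter(F.col("event_date") == run_date)
       .groupBy("site_id", "request_type")
       .agg(
           F.count("*").alias("ticket_count"),
           F.avg("resolution_minutes").alias("avg_resolution_minutes"),
       )
       .withColumn("event_date", F.lit(run_date))
)

# Overwrite only this run's slice of the curated table.
(daily_summary.write.format("delta")
    .mode("overwrite")
    .option("replaceWhere", f"event_date = '{run_date}'")
    .save("/mnt/datalake/curated/service_request_daily_summary"))
```

In ADF, the matching value would be set under the Databricks notebook activity's base parameters, which is the standard mechanism for passing runtime arguments from a pipeline to a notebook.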

Senior Consultant

2013 - 2023
Capgemini
  • Delivered end-to-end migration projects across different cloud providers.
  • Designed and developed data pipelines using Palantir Foundry.
  • Contributed to data migration projects, using Azure and GCP to transition data from a data warehouse to a data lake architecture.
  • Implemented key ingestion pipelines using dbt for a cloud data warehouse with dozens of data sources; a minimal orchestration sketch follows this entry.
  • Leveraged expert data warehousing techniques and business intelligence concepts, including ETL processes.
  • Created an end-to-end data pipeline to fetch data from the source and cleanse, transform, and load it in Hadoop and public cloud environments.
  • Defined high-level design documents and transformed them into low-level design documents.
  • Engaged in the end-to-end implementation of a financial crime monitoring and anti-money laundering (AML) solution, which included data, technology, and DevOps architectures.
  • Designed and implemented scalable data pipelines using ADF for seamless data ingestion, transformation, and loading.
Technologies: Azure Databricks, Apache Airflow, Azure Data Lake, Cloud Architecture, Databricks, Data Build Tool (dbt), ETL, Informatica, Hadoop, Apache Hive, Solution Architecture, Google Cloud Platform (GCP), BigQuery, Data Engineering, Palantir, Palantir Foundry
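
The ingestion work above might be orchestrated along these lines: a minimal Airflow DAG sketch that loads source extracts and then builds and tests the dbt models. The project paths, profile location, schedule, and ingest script are hypothetical, and BashOperator-driven dbt runs are just one common pattern, not necessarily the one used on this engagement.

```python
# Minimal Airflow DAG sketch: ingest raw sources into the warehouse, then run
# and test the dbt models. Paths, names, and the schedule are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="warehouse_ingestion",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Land the day's raw extracts in the warehouse staging schema.
    ingest = BashOperator(
        task_id="ingest_sources",
        bash_command="python /opt/pipelines/ingest_sources.py --date {{ ds }}",
    )

    # Build the dbt models on top of the freshly loaded staging data.
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/warehouse --profiles-dir /opt/dbt",
    )

    # Validate the build with dbt's schema and data tests.
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/warehouse --profiles-dir /opt/dbt",
    )

    ingest >> dbt_run >> dbt_test
```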

Senior Data Warehouse Engineer

2011 - 2013
Atos Syntel
  • Contributed to the end-to-end implementation of data warehouse projects.
  • Developed ETL mappings using Informatica PowerCenter.
  • Monitored jobs that were scheduled through the Control-M tool. Placed jobs on hold when a database or server was down, releasing them only once the server was up again.
  • Handled the end-to-end delivery of data warehouse designs, ensuring data was available for the business intelligence layer.
Technologies: Informatica ETL, Data Warehouse Design, IBM Db2, Oracle, SQL, ETL

Project Highlights

€5 Million Annual Savings with Service Request Analytics

I worked on Palantir Foundry to develop and optimize data pipelines for processing critical flight-related data generated by onsite engineers at various airports. These data points, captured before each flight's takeoff, were ingested and processed using PySpark within the Palantir Foundry environment.
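
As a sketch of that ingestion and processing step, the transform below uses Foundry's Python transforms API (transforms.api) to clean and summarize pre-flight check records; the dataset paths, column names, and aggregation are hypothetical stand-ins for the real pipeline.

```python
# Palantir Foundry transform sketch: a PySpark transform that filters
# malformed rows and summarizes pre-flight check outcomes per airport per day.
# Dataset paths and columns are hypothetical.
from pyspark.sql import functions as F
from transforms.api import Input, Output, transform_df


@transform_df(
    Output("/airline/clean/preflight_checks_daily"),
    raw_checks=Input("/airline/raw/preflight_checks"),
)
def compute(raw_checks):
    return (
        raw_checks
        .filter(F.col("check_timestamp").isNotNull())
        .withColumn("check_date", F.to_date("check_timestamp"))
        .groupBy("airport_code", "check_date")
        .agg(
            F.count("*").alias("checks_performed"),
            F.sum(F.when(F.col("status") == "FAIL", 1).otherwise(0)).alias("failed_checks"),
        )
    )
```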

Additionally, I developed the integration with the service request analytics application, which processes and loads logged tickets into Palantir Foundry for further analysis. I also built a dynamic dashboard using Palantir Slate to enhance decision-making and operational efficiency. This dashboard enabled back-end engineers to quickly access historical ticket data, view previous solutions, and respond faster to ongoing issues.

This streamlined process significantly reduced response times and improved issue resolution, saving the company €5 million annually by optimizing resource allocation and minimizing delays.

Health Care Appointment Optimisation

As a data engineer, I developed and managed PySpark data pipelines in Palantir Foundry to streamline patient appointment scheduling and optimize operating theatre slot utilization for the NHS. I was responsible for processing and analyzing large datasets to identify patterns in appointment cancellations and resource availability. By designing efficient data workflows, I improved operational efficiency, reduced patient waiting times, and enhanced the overall healthcare experience. Collaborating with cross-functional teams, I ensured that data was accessible, accurate, and actionable for informed decision-making.
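
A minimal PySpark sketch of the cancellation-pattern analysis described above, with hypothetical column names, source path, and threshold:

```python
# PySpark sketch: surface appointment-cancellation patterns by clinic and
# weekday so that under-used theatre slots can be flagged. Columns, the
# source path, and the 20% threshold are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("appointment-patterns").getOrCreate()

appointments = spark.read.parquet("/data/nhs/appointments")

cancellation_rates = (
    appointments
    .withColumn("weekday", F.date_format("appointment_date", "EEE"))
    .groupBy("clinic_id", "weekday")
    .agg(
        F.count("*").alias("booked"),
        F.sum(F.when(F.col("status") == "CANCELLED", 1).otherwise(0)).alias("cancelled"),
    )
    .withColumn("cancellation_rate", F.col("cancelled") / F.col("booked"))
)

# Slots with persistently high cancellation rates are candidates for
# overbooking or reallocation to cut idle theatre time.
cancellation_rates.filter(F.col("cancellation_rate") > 0.2).show()
```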

Financial Crime Monitoring and AML Solution

I developed a financial crime monitoring and AML solution using Azure and its suite of technologies. The system detects, monitors, and reports suspicious financial activities in real time.

The solution integrates machine learning models for anomaly detection and uses Azure Data Lake to store large financial datasets at scale. Azure Service Bus enables seamless, reliable communication between system components, ensuring data flows smoothly across the solution. Azure Functions provide serverless computation, triggering automated responses to specific events such as suspicious transactions. Meanwhile, Azure Logic Apps automate and orchestrate the process of generating and submitting compliance reports, reducing manual effort.
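
As an illustration of the Service Bus to Functions hand-off, here is a minimal sketch using the Azure Functions Python v2 programming model; the queue name, connection setting name, and the placeholder scoring rule are hypothetical, standing in for the anomaly-detection model described above.

```python
# Azure Functions sketch (Python v2 model): a Service Bus-triggered function
# that flags suspicious transactions. Names and the rule are placeholders.
import json
import logging

import azure.functions as func

app = func.FunctionApp()


@app.service_bus_queue_trigger(
    arg_name="msg",
    queue_name="transactions",
    connection="SERVICE_BUS_CONNECTION",
)
def score_transaction(msg: func.ServiceBusMessage):
    txn = json.loads(msg.get_body().decode("utf-8"))

    # Placeholder rule standing in for the ML anomaly score.
    suspicious = txn.get("amount", 0) > 10_000

    if suspicious:
        # In the full solution this would raise an AML alert, e.g. by writing
        # to an alerts queue consumed by a Logic App that files the report.
        logging.warning("Suspicious transaction flagged: %s", txn.get("transaction_id"))
```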

By combining these technologies, the solution enabled rapid detection of financial crimes, enhanced the accuracy of AML monitoring, and ensured compliance with regulatory requirements, all while reducing operational overhead and improving response times.

Education

2007 - 2009

Master's Degree in Computer Science

Jawaharlal Nehru Technological University - Hyderabad, India

2002 - 2006

Bachelor's Degree in Information Technology

Jawaharlal Nehru Technological University - Hyderabad, India

Certifications

SEPTEMBER 2022 - SEPTEMBER 2024

Google Cloud Certified Professional Data Engineer

Google Cloud

AUGUST 2022 - AUGUST 2024

Google Cloud Certified Professional Cloud DevOps Engineer

Google Cloud

JULY 2022 - JULY 2024

Google Cloud Certified Professional Cloud Architect

Google Cloud

FEBRUARY 2022 - PRESENT

The Open Group Certified: TOGAF 9 Certified

The Open Group

FEBRUARY 2021 - FEBRUARY 2023

Microsoft Certified: Azure Solutions Architect Expert

Microsoft

APRIL 2018 - PRESENT

Palantir Foundry

Palantir

Skills

Libraries/APIs

PySpark

Tools

Informatica ETL, Apache Airflow, BigQuery

Languages

SQL, Python

Platforms

Azure, Databricks, Oracle, Google Cloud Platform (GCP), Microsoft

Storage

Microsoft SQL Server, Data Lakes, Apache Hive, IBM Db2, Google Cloud Storage

Frameworks

Hadoop, TOGAF

Paradigms

Distributed Computing, ETL

Other

Palantir, Azure Data Lake, Azure Databricks, Data Build Tool (dbt), Software Engineering, Cloud Computing, Software, Slate, Enterprise Architecture, DevOps Engineer, Cloud Architecture, Informatica, Solution Architecture, Data Engineering, Data Warehouse Design, Azure Service Bus, Log Analytics, Google, Microsoft Azure, Palantir Foundry
