
Vidyasagara Reddy Thodime
Verified Expert in Engineering
Data Engineer and Developer
Swindon, United Kingdom
Toptal member since December 3, 2024
Vidya is a senior data engineer and leader with 16+ years of experience driving complex projects across Microsoft Azure, GCP, Palantir Foundry, and AWS. He is skilled in Azure Data Factory (ADF), Databricks, and Data Lake Storage, including data transformation using dbt and Snowflake and orchestration with Airflow. Vidya is proficient in distributed computing with Apache Spark, NiFi, and Hive, as well as in ETL tools like Informatica and Talend, and has solid expertise in Oracle and SQL Server.
Experience
- Python - 16 years
- SQL - 16 years
- PySpark - 10 years
- Azure Data Lake - 8 years
- Azure - 8 years
- Palantir - 6 years
- Azure Databricks - 5 years
- Data Build Tool (dbt) - 3 years
Preferred Environment
PySpark, Azure, Databricks, Azure Data Lake, Azure Databricks, SQL, Informatica ETL, Python, Microsoft SQL Server, Palantir
The most amazing...
...thing I've achieved is streamlining operations and saving €5 million yearly by developing advanced analytics on Palantir Foundry and dynamic dashboards with Slate.
Work Experience
Senior Data Engineer
Self-employed
- Built data pipelines using Palantir Foundry and leveraged Foundry’s ontology to create scalable data models, ensuring proper alignment of data structures.
- Developed and maintained complex data pipelines in Palantir Foundry, ensuring seamless integration of structured and unstructured data from diverse sources.
- Contributed to developing multiple transformation models as a senior data engineer for a dbt project.
- Developed Databricks notebooks and dbt models according to the ELT pipeline specifications to load data into the raw zone and then move it to the curated zone by applying various transformations.
- Created a PySpark script to aggregate and summarize the data before loading it into Delta Lake.
- Oversaw the end-to-end migration from the source systems through to the presentation layer.
- Worked with Informatica and Azure Databricks to run Spark-Python notebooks through ADF pipelines.
- Engaged extensively in PySpark performance-tuning to enhance pipeline efficiency.
- Used Databricks widgets to pass runtime parameters from ADF to Databricks.
Senior Consultant
Capgemini
- Delivered end-to-end migration projects across different cloud providers.
- Designed and developed data pipelines using Palantir Foundry.
- Contributed to data migration projects, using Azure and GCP to transition data from a data warehouse to a data lake architecture.
- Implemented key ingestion pipelines using dbt for a cloud data warehouse with dozens of data sources.
- Leveraged expert data warehousing techniques and business intelligence concepts, including ETL processes.
- Created an end-to-end data pipeline to fetch data from the source and cleanse, transform, and load it in Hadoop and public cloud environments.
- Defined high-level design documents and transformed them into low-level design documents.
- Engaged in the end-to-end implementation of a financial crime monitoring and anti-money laundering (AML) solution, which included data, technology, and DevOps architectures.
- Designed and implemented scalable data pipelines using ADF for seamless data ingestion, transformation, and loading.
Senior Data Warehouse Engineer
Atos Syntel
- Contributed to the end-to-end implementation of data warehouse projects.
- Developed ETL mappings using Informatica PowerCenter.
- Monitored jobs that were scheduled through the Control-M tool. Placed jobs on hold when a database or server was down, releasing them only once the server was up again.
- Handled the end-to-end delivery of data warehouse designs, ensuring data was available for the business intelligence layer.
Experience
€5 Million Annual Savings with Service Request Analytics
I developed the integration with the service request analytics application, which processes and loads logged tickets into Palantir Foundry for further analysis. I also built a dynamic dashboard using Palantir Slate to enhance decision-making and operational efficiency. This dashboard enabled back-end engineers to quickly access historical ticket data, view previous solutions, and respond faster to ongoing issues.
This streamlined process significantly reduced response times and improved issue resolution, saving the company €5 million annually by optimizing resource allocation and minimizing delays.
Health Care Appointment Optimisation
Financial Crime Monitoring and AML Solution
The solution integrates machine learning models for anomaly detection and uses Azure Data Lake to store large financial datasets at scale. Azure Service Bus provides reliable communication between system components, ensuring data flows smoothly across the solution. Azure Functions handle serverless computation, triggering automated responses to specific events such as suspicious transactions. Meanwhile, Azure Logic Apps automate and orchestrate the generation and submission of compliance reports, reducing manual effort.
By combining these technologies, the solution enabled rapid detection of financial crimes, enhanced the accuracy of AML monitoring, and ensured compliance with regulatory requirements, all while reducing operational overhead and improving response times.
Education
Master's Degree in Computer Science
Jawaharlal Nehru Technological University - Hyderabad, India
Bachelor's Degree in Information Technology
Jawaharlal Nehru Technological University - Hyderabad, India
Certifications
Google Cloud Certified Professional Data Engineer
Google Cloud
Google Cloud Certified Professional Cloud DevOps Engineer
Google Cloud
Google Cloud Certified Professional Cloud Architect
Google Cloud
The Open Group Certified: TOGAF 9 Certified
The Open Group
Microsoft Certified: Azure Solutions Architect Expert
Microsoft
Palantir Foundry
Palantir
Skills
Libraries/APIs
PySpark
Tools
Informatica ETL, Apache Airflow, BigQuery
Languages
SQL, Python
Platforms
Azure, Databricks, Oracle, Google Cloud Platform (GCP), Microsoft
Storage
Microsoft SQL Server, Data Lakes, Apache Hive, IBM Db2, Google Cloud
Frameworks
Hadoop, TOGAF
Paradigms
Distributed Computing, ETL
Other
Palantir, Azure Data Lake, Azure Databricks, Data Build Tool (dbt), Software Engineering, Cloud Computing, Software, Slate, Enterprise Architecture, DevOps Engineer, Cloud Architecture, Informatica, Solution Architecture, Data Engineering, Data Warehouse Design, Azure Service Bus, Log Analytics, Google, Microsoft Azure, Foundry, Palantir Foundry