Pappu Yadav
Verified Expert in Engineering
Software Engineer and Developer
Gurugram, Haryana, India
Toptal member since June 14, 2022
Pappu is a software engineer skilled in big data, Spark, Hadoop, Hive, and Presto. He focuses on building ETL frameworks that process large amounts of data in batches and in real time with Spark Streaming. Drawing on his expertise in optimizing Spark jobs and his experience writing complex SQL queries for debugging and analysis, Pappu builds platform tools used by various business teams, as well as REST APIs in Java using frameworks like Spring and Hibernate.
Portfolio
Experience
- Back-end Development - 7 years
- Java - 7 years
- Databases - 7 years
- MySQL - 7 years
- Hadoop - 4 years
- Apache Spark - 4 years
- Big Data - 4 years
- Spark - 4 years
Availability
Preferred Environment
macOS, Linux
The most amazing...
...thing I've developed from scratch is an ETL Spark framework, enabling teams in the organization to run their ETL jobs without worrying about complexities.
Work Experience
Technical Lead
Airtel India
- Developed a framework that enabled various business teams to run any ETL job on top of Spark and Spark Streaming.
- Changed the Presto code to remove the data duplication issue on the query engine.
- Developed a real-time reconciliation framework that performed reconciliation on any number of input sources based on configured rules in real time.
Senior Software Engineer
Mobileum
- Created an ETL pipeline from scratch to process real-time data using Spark Streaming.
- Tracked users roaming outside the country in real time and assigned each user a score based on the quality of calls, SMS, and data. Used Spark Streaming to manage user trips.
- Developed a framework to overcome the small-file problem in Spark Streaming jobs using custom compaction of files written to Hadoop.
Senior Software Engineer
MobiKwik
- Built a cab booking API that let users book cabs from within the mobile app without having to install a separate cab app. Integrated it with the cab aggregator Ola in the back end for the actual cab booking and ride tracking.
- Developed the bike rental booking APIs to enable users to rent bikes within the app. Integrated it with a bike rental aggregator in the back end for the actual booking.
- Built a booking cancellation flow in the hotel booking module, enabling the users to cancel a hotel booking, and integrated the APIs provided by the hotel aggregator.
Software Engineer
PayU India
- Developed a credit card app that enabled users to track transaction history, enable or disable the card, and approve transactions through in-app notifications.
- Updated the dashboards that track the health of payment transactions in real time.
- Developed a payment flow that bypassed the payment gateway and integrated directly with the bank to facilitate user payments.
Experience
Rule Engine
Rules can be point-query rules or aggregate rules; aggregate rules aggregate data over the configured time window.
The whole framework is built using the Spark Streaming engine.
Spark Batch and Streaming ETL Framework
Each job is configured using a JSON file in which users define the source and sink, the input format, the location, and other relevant information.
Source and sink configurations are maintained in different Postgres databases.
The framework can be used across the organization.
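To make the JSON-driven job configuration concrete, here is a minimal sketch in Python of how such a file might be parsed and validated before a job is launched. It is not the framework's actual schema: every field name (`job_name`, `source`, `sink`, `type`, `format`, `location`), the sample values, and the `parse_job_config` helper are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical job configuration. All keys and values are illustrative
# assumptions, not the real framework's schema.
JOB_CONFIG = """
{
  "job_name": "daily_usage_load",
  "source": {"type": "hdfs", "format": "parquet", "location": "/data/raw/usage"},
  "sink": {"type": "hive", "format": "orc", "location": "analytics.usage_daily"}
}
"""


@dataclass(frozen=True)
class Endpoint:
    """One side of a job: where data is read from or written to."""
    type: str
    format: str
    location: str


@dataclass(frozen=True)
class JobConfig:
    job_name: str
    source: Endpoint
    sink: Endpoint


def parse_job_config(text: str) -> JobConfig:
    """Parse a JSON job definition and fail early if a required key is absent."""
    raw = json.loads(text)
    missing = {"job_name", "source", "sink"} - raw.keys()
    if missing:
        raise ValueError(f"missing config keys: {sorted(missing)}")
    return JobConfig(
        job_name=raw["job_name"],
        source=Endpoint(**raw["source"]),
        sink=Endpoint(**raw["sink"]),
    )


job = parse_job_config(JOB_CONFIG)
print(job.job_name, job.source.format, job.sink.type)
```

Validating the file up front means a misconfigured job is rejected before any cluster resources are requested.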
Generic Recon Framework
Users can define configurable rules that determine how records are grouped later.
The framework also handles late-arriving events, including state management of records within Spark memory.
Users can also view partially reconciled records, which can later be moved to the reconciled state.
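The grouping idea can be sketched in a few lines of plain Python. This is a simplified batch illustration, not the Spark implementation: the `reconcile` function, the record fields (`source`, `txn_id`, `amount`), and the source names are all hypothetical. A group counts as reconciled once every expected source has contributed a record for the configured key; otherwise it stays partial until a late record arrives.

```python
from collections import defaultdict


def reconcile(records, key_fields, expected_sources):
    """Group records by the configured key fields and split the groups into
    reconciled (all expected sources present) and partial (some missing)."""
    groups = defaultdict(dict)
    for rec in records:
        key = tuple(rec[field] for field in key_fields)
        groups[key][rec["source"]] = rec

    reconciled, partial = {}, {}
    for key, by_source in groups.items():
        complete = expected_sources.issubset(by_source)
        (reconciled if complete else partial)[key] = by_source
    return reconciled, partial


# Hypothetical input: two sources keyed on a shared transaction id.
records = [
    {"source": "billing", "txn_id": "t1", "amount": 100},
    {"source": "gateway", "txn_id": "t1", "amount": 100},
    {"source": "billing", "txn_id": "t2", "amount": 50},
]
sources = {"billing", "gateway"}

reconciled, partial = reconcile(records, ["txn_id"], sources)

# A late-arriving record completes the partial group on the next pass.
late = {"source": "gateway", "txn_id": "t2", "amount": 50}
reconciled_after, partial_after = reconcile(records + [late], ["txn_id"], sources)
```

In a streaming job, the per-key dictionary would live in managed state rather than being rebuilt on every pass; the sketch recomputes it only to keep the example short.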
Education
Bachelor's Degree in Computer Science
Delhi Technological University - New Delhi, India
Skills
Tools
Git, Apache Airflow, Cloudera, RabbitMQ
Languages
Java, SQL
Frameworks
Spark, Spark Structured Streaming, Presto, Hadoop, Apache Spark
Storage
Databases, Apache Hive, MySQL, PostgreSQL, MongoDB
Paradigms
ETL, Database Design
Platforms
Apache Hudi
Other
API Documentation, Back-end Development, OOP Designs, Data Engineering, Big Data, Frameworks, Spring Boot, Data Migration