Ayushi Bhardwaj, Developer in Berlin, Germany

Ayushi Bhardwaj

Verified Expert in Engineering

Bio

Ayushi is a data engineer with six years of professional experience building data solutions on the cloud. She has expertise in data acquisition, streaming and batch data processing, data migration, software development, and quality assurance. Her strong technical background in Python, SQL, Apache Airflow, and AWS, combined with her attention to detail and strategic thinking, makes her a valuable asset to any team.

Portfolio

Dream11
Python, SQL, Software Deployment, Data Engineering, ETL, ELT, Data Warehousing...
Amazon India
Python, SQL, AWS Lambda, Data Engineering, Data Processing, ELT...
General Electric
Data Acquisition (DAQ), Data Modeling, Agile, PostgreSQL, Amazon EC2...

Experience

  • Data Engineering - 6 years
  • Python - 6 years
  • SQL - 6 years
  • ETL - 6 years
  • Amazon Web Services (AWS) - 6 years
  • Redshift - 6 years
  • Apache Airflow - 3 years
  • Apache Iceberg - 2 years

Availability

Part-time

Preferred Environment

macOS, PyCharm, DBeaver, Sublime Text, GitHub, Slack, IntelliJ IDEA, Agile, Scrum, Linux

The most amazing...

...solution I've built is a self-serve data integration platform bringing data from 80+ sources to a central lakehouse for an app with 200 million users.

Work Experience

Senior Data Engineer

2022 - 2024
Dream11
  • Designed and developed a self-serve platform for data integration from different sources into a data lake and warehouse, reducing pipeline creation time by 95% and moving 87% of the weekly workload onto the platform to cut manual effort.
  • Led a team of four to plan and migrate 800+ transactional pipelines and 4,000+ tables from a warehouse to a lakehouse ecosystem.
  • Developed a Python-based count QC system spanning MySQL, Apache Iceberg, and Redshift with retries, alerting, and auto-healing, performing quality assurance checks on 300 tables within eight minutes every hour.
  • Delivered multiple reusable Python utilities for data acquisition and ETL from Aerospike, Cassandra, and MySQL, reducing redundant code by up to 50% (see the illustrative sketch below).
  • Replaced cron scheduling with dynamic DAG-based Apache Airflow solutions, providing more reliable and flexible scheduling.
  • Onboarded 100+ jobs with a rich UI for monitoring and retrying tasks.
Technologies: Python, SQL, Software Deployment, Data Engineering, ETL, ELT, Data Warehousing, Data Integration, Data Processing, Apache Kafka, Spark, Apache Airflow, Datadog, Apache Iceberg, Redshift, Amazon EC2, Amazon S3 (AWS S3), Data Lakes, Data Lakehouse, Amazon Web Services (AWS), Data Migration, Data Analysis, Query Optimization, Snowflake, Back-end Development, Infrastructure as Code (IaC), DevOps, Java
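The reusable acquisition utilities mentioned above could be structured roughly as in the following sketch, assuming a shared base class that every source-specific extractor implements; the class names, method names, and pagination strategy are illustrative assumptions, not the actual Dream11 codebase.

    from abc import ABC, abstractmethod
    from typing import Iterator


    class BaseExtractor(ABC):
        """Shared contract so every source (MySQL, Cassandra, Aerospike, ...) is pulled the same way."""

        def __init__(self, table: str, batch_size: int = 10_000):
            self.table = table
            self.batch_size = batch_size

        @abstractmethod
        def extract(self) -> Iterator[list]:
            """Yield batches of rows as dictionaries."""


    class MySQLExtractor(BaseExtractor):
        def __init__(self, table: str, connection, batch_size: int = 10_000):
            super().__init__(table, batch_size)
            self.connection = connection  # e.g., a PyMySQL or other DB-API connection

        def extract(self) -> Iterator[list]:
            # Paginate through the table so memory use stays bounded.
            offset = 0
            while True:
                cursor = self.connection.cursor()
                cursor.execute(
                    f"SELECT * FROM {self.table} LIMIT %s OFFSET %s",
                    (self.batch_size, offset),
                )
                columns = [col[0] for col in cursor.description]
                rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
                if not rows:
                    break
                yield rows
                offset += self.batch_size

A Cassandra or Aerospike extractor would implement the same extract() contract, which is how a single utility can serve many pipelines with far less duplicated code.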

Business Intelligence Engineer

2021 - 2022
Amazon India
  • Managed end-to-end analytics for last-mile operations of heavy and bulky goods, using complex SQL and visualization tools to empower data-driven decision-making for business leaders.
  • Provided analytical support and created 100+ business-critical metrics for different products, covering payments, compliance, routing, performance, and security for the Mexican and Indian markets.
  • Implemented a portal utility for external partner reports as an event-driven AWS pipeline that delivers reports into inter-account portal buckets (see the sketch below).
Technologies: Python, SQL, AWS Lambda, Data Engineering, Data Processing, ELT, Data Warehousing, Amazon EC2, Amazon S3 (AWS S3), Redshift, Data Modeling, Agile, PostgreSQL, Amazon Web Services (AWS), Data Lakes, ETL, Data Analytics, Data Migration, Data Analysis, Query Optimization, Infrastructure as Code (IaC)
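As a rough illustration of the event-driven report pipeline above: a minimal AWS Lambda handler, assuming the function is subscribed to S3 object-created events; the bucket name and key prefix are hypothetical, not the actual Amazon setup.

    import urllib.parse

    import boto3

    s3 = boto3.client("s3")

    # Hypothetical destination bucket shared with the partner-facing portal account.
    PORTAL_BUCKET = "partner-portal-reports"


    def lambda_handler(event, context):
        """Copy each newly landed partner report into the portal bucket."""
        for record in event["Records"]:
            source_bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 event notifications.
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

            s3.copy_object(
                Bucket=PORTAL_BUCKET,
                Key=f"incoming/{key}",
                CopySource={"Bucket": source_bucket, "Key": key},
            )
        return {"copied": len(event["Records"])}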

Data Engineering Specialist

2018 - 2021
General Electric
  • Developed data ingestion pipelines supporting 75+ products and worked with scalable data design patterns, including batch and change data capture (CDC).
  • Redesigned high-volume replication (HVR) data pipelines for 2,000+ tables from four critical ERP systems, reducing delivery latency by up to 86% and improving data quality across 12 business products.
  • Performed a technology comparison between Delta Lake and Apache Hudi for ACID transactions on Amazon S3 (AWS S3).
  • Built a secure AWS-based data ingestion framework using test-driven development, reaching 90% coverage with Pytest and AWS CodeBuild (see the sketch below).
  • Implemented a data science solution in R that delivered 82% prediction accuracy and increased cash flow by flagging $46 million of potentially past-due invoices in a quarter.
Technologies: Data Acquisition (DAQ), Data Modeling, Agile, PostgreSQL, Amazon EC2, Apache Hudi, Amazon Kinesis, Amazon CloudWatch, PySpark, Python, SQL, Data Engineering, Amazon Web Services (AWS), Data Lakes, ACID, Delta Lake, Amazon S3 (AWS S3), AWS CodeBuild, Pytest, ETL, Data Pipelines, Data Warehousing, Data Migration, Data Science, Data Analysis, Back-end Development, Infrastructure as Code (IaC), DevOps
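A small sketch of the test-driven style used for the ingestion framework, assuming a hypothetical helper that builds the S3 prefix an ingested batch lands under; the function name and path layout are illustrative only, not GE's actual code.

    from datetime import date

    import pytest


    def partition_key(source: str, table: str, load_date: date) -> str:
        """Build the S3 prefix for an ingested batch (illustrative helper)."""
        if not source or not table:
            raise ValueError("source and table must be non-empty")
        return f"raw/{source}/{table}/dt={load_date.isoformat()}/"


    def test_partition_key_builds_expected_prefix():
        assert (
            partition_key("erp_a", "invoices", date(2020, 5, 1))
            == "raw/erp_a/invoices/dt=2020-05-01/"
        )


    def test_partition_key_rejects_empty_table():
        with pytest.raises(ValueError):
            partition_key("erp_a", "", date(2020, 5, 1))

In this style each behavior is pinned by a test first, and a CI service such as AWS CodeBuild would run the suite and enforce the coverage threshold.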

Experience

Self-serve Data Pipelines for Dream Sports

https://tech.dream11.in/blog/data-beam-self-serve-data-pipelines-at-dream-sports
A self-serve data integration platform for which I managed the architecture design and development as a senior developer. I divided pipeline creation into small modules for data acquisition, cleaning, loading, quality checking, alerting, and monitoring. In addition, I developed APIs for the back-end service and Python scripts for setup and quality checks, and I introduced Apache Airflow dynamic DAGs for pipeline orchestration. The platform reduced manual intervention in creating new data pipelines by 90%.
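A minimal sketch of the dynamic DAG pattern mentioned above, assuming each source pipeline is described by a small config dictionary; the config values, task names, and load function are illustrative assumptions rather than the platform's actual code.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Hypothetical per-source configs; in a real platform these would come from a config store.
    SOURCES = [
        {"name": "orders_mysql", "schedule": "@hourly"},
        {"name": "events_cassandra", "schedule": "*/15 * * * *"},
    ]


    def load_source(source_name: str, **_):
        # Placeholder for the acquisition -> cleaning -> loading modules.
        print(f"Loading data for {source_name}")


    for source in SOURCES:
        dag_id = f"ingest_{source['name']}"
        with DAG(
            dag_id=dag_id,
            schedule_interval=source["schedule"],
            start_date=datetime(2023, 1, 1),
            catchup=False,
        ) as dag:
            PythonOperator(
                task_id="load",
                python_callable=load_source,
                op_kwargs={"source_name": source["name"]},
            )
        # Registering each DAG object in globals() lets the Airflow scheduler discover it.
        globals()[dag_id] = dag

Because every pipeline is generated from config rather than hand-written, onboarding a new source becomes a configuration change instead of new DAG code.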

Data Quality Check Framework

A quality check framework for data pipelines. I developed a Python utility for count-based quality checks of relational data. The design is adaptive: it calculates a different sliding window for each table based on its load schedule, load type, and product. I also added Slack alerting so any count mismatch can be resolved proactively.
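A condensed sketch of how such a check might be wired together, assuming source and target counts are supplied by caller-provided query functions and that alerts go to a Slack incoming webhook; the webhook URL, window rules, and config fields are illustrative assumptions.

    from datetime import timedelta
    from typing import Callable

    import requests

    SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # hypothetical


    def sliding_window(load_type: str, load_schedule: str) -> timedelta:
        """Decide how far back to compare counts, based on how the table is loaded."""
        if load_type == "full":
            return timedelta(days=1)
        if load_schedule == "hourly":
            return timedelta(hours=2)  # one extra hour absorbs late-arriving data
        return timedelta(hours=25)


    def check_counts(
        table: str,
        load_type: str,
        load_schedule: str,
        source_count: Callable[[timedelta], int],
        target_count: Callable[[timedelta], int],
    ) -> bool:
        window = sliding_window(load_type, load_schedule)
        src, tgt = source_count(window), target_count(window)
        if src != tgt:
            requests.post(
                SLACK_WEBHOOK_URL,
                json={"text": f"Count mismatch on {table}: source={src}, target={tgt}, window={window}"},
                timeout=10,
            )
            return False
        return True

In practice the same check would run per table on a schedule, with the retries and auto-healing described in the work experience layered on top.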

Education

2014 - 2018

Bachelor's Degree in Instrumentation and Control Engineering

University of Delhi - Delhi, India

Certifications

AUGUST 2019 - AUGUST 2022

AWS Solutions Architect Associate

AWS

FEBRUARY 2018 - PRESENT

Data Structures and Algorithms

Coding Blocks

Skills

Libraries/APIs

PySpark, Liquibase, Datadog API

Tools

Apache Airflow, Apache Iceberg, PyCharm, GitHub, Amazon CloudWatch, Kafka Connect, AWS Glue, Amazon Athena, AWS IAM, Sublime Text, Slack, IntelliJ IDEA, AWS CodeBuild, Pytest

Languages

Python, SQL, C++, Snowflake, Java

Platforms

Amazon EC2, Apache Kafka, Amazon Web Services (AWS), AWS Cloud Computing Services, AWS Lambda, Apache Hudi, Docker, macOS, Linux

Storage

Data Integration, Redshift, Amazon S3 (AWS S3), MySQL, Datadog, Data Lakes, PostgreSQL, DBeaver, Data Pipelines, Amazon DynamoDB

Frameworks

Data Lakehouse, Spark, Spring

Paradigms

ETL, Agile, DevOps, Scrum, ACID

Other

Data Engineering, Software Deployment, Data Processing, ELT, Data Warehousing, Data Acquisition (DAQ), Data Modeling, Amazon RDS, Quality Control (QC), Coding, Complex Problem Solving, Data Migration, Data Analysis, Query Optimization, Back-end Development, Amazon Kinesis, Algorithms, Data Structures, Delta Lake, Control Engineering, Instrumentation, Data Analytics, Data Science, Infrastructure as Code (IaC)
