Priyanshu Bahuguna, Developer in Noida, Uttar Pradesh, India

Priyanshu Bahuguna

Verified Expert in Engineering

Data Engineer and Developer

Noida, Uttar Pradesh, India

Toptal member since April 3, 2024

Bio

Priyanshu has over 18 years of experience in software development, managing projects in multi-cloud environments and leading the delivery of data pipelines on Google Cloud Platform (GCP) and AWS while collaborating with global stakeholders. His accomplishments include a 70% savings in effort after implementing a data pipeline on AWS and 60% faster execution of an end-to-end data pipeline implemented on GCP for data curation and ML modeling.

Portfolio

Cisco
Python 3, Google Cloud Platform (GCP), Amazon Web Services (AWS)...
HCL Technologies
Java, Oracle ATG Commerce, Kubeflow, PySpark, SQL, Python 3...
Infosys
Java, SQL, Databases, APIs

Experience

  • SQL - 10 years
  • Amazon Web Services (AWS) - 5 years
  • Snowflake - 5 years
  • ETL Implementation & Design - 5 years
  • Python 3 - 5 years
  • Google Cloud Platform (GCP) - 5 years
  • Data Quality Governance - 3 years
  • Data Build Tool (dbt) - 3 years

Availability

Full-time

Preferred Environment

Jupyter Notebook

The most amazing...

...project I've worked on involved migrating an on-prem system to the GCP data platform, reducing overall runtime to 60% of its original level while scaling data volumes to 1.5x.

Work Experience

Lead Data Engineer

2021 - PRESENT
Cisco
  • Designed and implemented a medallion data architecture for curating telemetry data and deploying machine learning (ML) models on top of it.
  • Tracked defects in Jira for the US region and became well-versed in Agile methodologies.
  • Reduced runtime of the end-to-end (E2E) data pipeline by 80% and achieved 70% savings in efforts for manual data investigation.
  • Demonstrated expertise in designing and implementing high-performance, reusable, and scalable data models.
  • Leveraged cutting-edge technologies and frameworks with a proven ability to lead teams in tackling even the most challenging problems head-on.
  • Implemented an end-to-end data pipeline using cloud-native technologies on AWS, automating telemetry data ingestion and enrichment, ML model execution, and summarization, which reduced the time required to prepare vulnerability reports by 70% (a hedged sketch of this kind of ingestion step follows the technology list below).
Technologies: Python 3, Google Cloud Platform (GCP), Amazon Web Services (AWS), Data Build Tool (dbt), Snowflake, ETL Implementation & Design, Data Quality Governance, Stakeholder Engagement, Machine Learning Operations (MLOps), CI/CD Pipelines, Cloud Migration, Python, Terraform
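
To illustrate the kind of serverless, cloud-native ingestion and enrichment step described above, here is a minimal Python sketch; the bucket layout, field names, and criticality rule are assumptions for illustration, not the actual Cisco pipeline.

# Hypothetical AWS Lambda handler: reads a telemetry file dropped into S3,
# flags critical records, and rewrites the enriched file under a curated prefix.
import json
import boto3

s3 = boto3.client("s3")
CURATED_PREFIX = "curated/"  # assumed destination prefix

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Read newline-delimited JSON telemetry records from the landing object.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        telemetry = [json.loads(line) for line in body.splitlines() if line.strip()]

        # Assumed enrichment rule: syslog severities 0-2 are treated as critical.
        for item in telemetry:
            item["is_critical"] = item.get("severity", 7) <= 2

        enriched = "\n".join(json.dumps(item) for item in telemetry)
        s3.put_object(Bucket=bucket, Key=CURATED_PREFIX + key.split("/")[-1],
                      Body=enriched.encode("utf-8"))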

Senior Technical Lead

2011 - 2021
HCL Technologies
  • Spearheaded the development and productization of machine learning model pipelines using Kubeflow on Google Kubernetes Engine (GKE), streamlining machine learning (ML) model deployment and management.
  • Tracked and fixed bugs in Jira and utilized Agile methodologies.
  • Leveraged operators within Kubeflow for distributed training and hyperparameter tuning, enhancing model performance and accuracy. Deployed machine learning (ML) models as scalable services within the Kubeflow environment.
  • Managed a team of engineers dedicated to enhancing and supporting the eCommerce checkout/catalog modules. Collaborated closely with L1 and L2 teams to prioritize and resolve production incidents within agreed service-level agreements (SLAs).
  • Designed and implemented an Airflow-based framework for managing directed acyclic graph (DAG) dependencies, enabling efficient manual DAG execution and seamless workflow orchestration (see the illustrative sketch after this role's technology list).
  • Oversaw the operational aspects of a prominent mobility player's production eCommerce platform.
Technologies: Java, Oracle ATG Commerce, Kubeflow, PySpark, SQL, Python 3, Google Kubernetes Engine (GKE), Python
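
The DAG-dependency pattern mentioned above can be sketched with Airflow's TriggerDagRunOperator; the DAG IDs, schedule, and tasks below are hypothetical, not the actual HCL framework.

# Minimal Airflow sketch: an upstream DAG that triggers a downstream DAG once
# its own work succeeds, enabling controlled manual execution of a chain of DAGs.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.operators.trigger_dagrun import TriggerDagRunOperator

def extract(**_):
    print("extracting raw data")  # placeholder for the real extraction step

with DAG(
    dag_id="upstream_extract",            # hypothetical DAG ID
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,               # run manually, as described above
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)

    # Trigger the downstream DAG only after extraction completes successfully.
    trigger_transform = TriggerDagRunOperator(
        task_id="trigger_transform",
        trigger_dag_id="downstream_transform",  # hypothetical downstream DAG
        wait_for_completion=True,
    )

    extract_task >> trigger_transform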

Technical Lead

2006 - 2011
Infosys
  • Led the development and oversight of core banking solutions for a premier financial institution in the United States.
  • Spearheaded the development of innovative features to augment existing banking solutions, enhancing functionality and user experience.
  • Orchestrated the seamless migration of applications from legacy technology frameworks to contemporary Java frameworks, ensuring improved performance, scalability, and maintainability.
  • Oversaw the production support of critical banking services, ensuring key financial systems' continuous operation and stability.
Technologies: Java, SQL, Databases, APIs

Experience

Syslog Data Pipeline

I worked on curating telemetry data from customer devices and enriching it with identity card (IC) signatures to classify syslogs as critical or non-critical. I created a medallion architecture for the data warehouse to ingest and transform the data and deployed machine learning models using Vertex AI to classify the syslogs.

I architected the E2E data pipeline using serverless offerings to curate the data and, after reviewing the use cases, chose a medallion architecture for the data warehouse design. The goal was to provide transformed data to ML models while allowing other consumers to access the data for their own use cases.
In a medallion architecture, the data is segregated into bronze, silver, and gold layers, with each subsequent layer providing progressively more enriched data.
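
As a rough sketch of the bronze-to-gold layering, the PySpark snippet below shows how each layer can be derived from the previous one; the paths, columns, and criticality rule are assumptions, not the production schema.

# Illustrative medallion layering in PySpark.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion_sketch").getOrCreate()

# Bronze: raw syslog records landed as-is (hypothetical landing path).
bronze = spark.read.json("gs://landing-zone/syslog/*.json")

# Silver: de-duplicated, typed, and cleaned records.
silver = (
    bronze
    .dropDuplicates(["device_id", "event_time", "message"])
    .withColumn("event_time", F.to_timestamp("event_time"))
    .filter(F.col("message").isNotNull())
)

# Gold: enriched with a criticality flag for ML models and reporting consumers.
gold = silver.withColumn("is_critical", F.col("severity") <= 2)
gold.write.mode("overwrite").parquet("gs://warehouse/gold/syslog_enriched/")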

Stakeholders can then choose which degree of enrichment they want to consume. Data ingestion was done through files pushed to landing zones, from where they were consumed by Dataflow jobs. I converted Snowflake queries into data build tool (dbt) workflows for better handling of data quality mismatches.
I also created a custom audit logging framework to capture all write operations in the database, enabling efficient alerts on any unwanted writes.
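
A minimal Apache Beam pipeline of the kind that would run as those Dataflow jobs might look like the sketch below; the paths, project, table, and schema are hypothetical.

# Illustrative Dataflow (Apache Beam) ingestion: read files from a landing
# zone and load parsed records into a bronze-layer BigQuery table.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse_line(line):
    record = json.loads(line)  # assumes newline-delimited JSON syslog records
    return {
        "device_id": record.get("device_id"),
        "message": record.get("message"),
        "severity": record.get("severity"),
    }

def run():
    options = PipelineOptions()  # on Dataflow, runner/project flags go here
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadLandingZone" >> beam.io.ReadFromText("gs://landing-zone/syslog/*.json")
            | "Parse" >> beam.Map(parse_line)
            | "WriteBronze" >> beam.io.WriteToBigQuery(
                "my-project:telemetry.syslog_bronze",  # hypothetical table
                schema="device_id:STRING,message:STRING,severity:INTEGER",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )

if __name__ == "__main__":
    run()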

Education

2001 - 2005

Bachelor's Degree in Information Technology

Jaypee Institute of Information Technology - Noida, India

Skills

Libraries/APIs

PySpark

Tools

Cloud Dataflow, Google Kubernetes Engine (GKE), Terraform

Languages

Java, Python 3, Snowflake, SQL, Python

Paradigms

ETL Implementation & Design

Platforms

Google Cloud Platform (GCP), Amazon Web Services (AWS), Jupyter Notebook, Kubeflow

Storage

Databases

Other

Data Build Tool (dbt), Data Quality Governance, Stakeholder Engagement, CI/CD Pipelines, Cloud Migration, Oracle ATG Commerce, Algorithms, Machine Learning Operations (MLOps), APIs
