Siddharth Chabra, Developer in Gurugram, Haryana, India

Siddharth Chabra

Verified Expert in Engineering

Data Engineer and Developer

Location
Gurugram, Haryana, India
Toptal Member Since
July 13, 2022

Siddharth is a seasoned professional with 15 years of experience. He has worked in areas such as image processing, artificial neural networks, and data warehousing. Siddharth specializes in cloud data warehousing, working mainly with BigQuery and Snowflake.

Portfolio

Elysium Health, Inc.
Data Engineering, Google BigQuery, ETL, Data Warehouse Design, Looker...
EVEREST GROWTH PARTNERS LLC
Python, JSON, Scraping, Web Crawlers, NumPy, Amazon RDS, Cloud, Big Data...
US-based Venture Capital Firm
Data Warehousing, Google BigQuery, PostgreSQL, Data Modeling, Data Architecture...

Experience

Availability

Full-time

Preferred Environment

GitHub, Python, SQL, REST APIs, Google BigQuery, Snowflake, Data Build Tool (dbt), Data Engineering, Data Architecture, Data Analytics

The most amazing...

...project I've completed is the single-handed creation of a data warehouse for a D2C eCommerce client in just 60 days.

Work Experience

Data Architect

2023 - 2023
Elysium Health, Inc.
  • Deployed new data models to support new product launches while keeping the legacy data models operational and the new models backward compatible with them.
  • Developed a GA4 data model to support the client's marketing needs, helping the marketing team understand the new GA4 data models and transition to them from the existing Universal Analytics (UA) models.
  • Optimized data pipelines on Fivetran to reduce costs for the client.
Technologies: Data Engineering, Google BigQuery, ETL, Data Warehouse Design, Looker, Data Warehousing, Direct to Consumer (D2C), Data Build Tool (dbt), Fivetran, NumPy, Google Analytics 4, ETL Pipelines, Cloud, Databases

System Architect | Data Engineer

2022 - 2023
EVEREST GROWTH PARTNERS LLC
  • Developed a system to parse large-scale nested JSON files into BigQuery.
  • Developed a proprietary JSON parser to parse terabyte-sized JSON files in minutes.
  • Developed an API to make the system automated and attachable to any UI.
Technologies: Python, JSON, Scraping, Web Crawlers, NumPy, Amazon RDS, Cloud, Big Data, Databases
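
The parser mentioned above is proprietary and its design is not described here. Purely as a generic illustration of preparing nested JSON for a columnar warehouse, the following minimal Python sketch flattens nested objects into flat, column-like rows. It assumes newline-delimited JSON input; the function names and the underscore separator are hypothetical choices, not details of the actual system.

```python
import json


def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into one dict with joined column names.

    Lists are left untouched, since a warehouse may store them as
    repeated fields (an assumption made for this sketch).
    """
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the column prefix along.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


def iter_rows(lines):
    """Yield one flattened row per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield flatten(json.loads(line))
```

Because `iter_rows` is a generator, it processes one record at a time rather than loading a whole file into memory, which matters for large inputs.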

Director of Data Engineering

2019 - 2022
US-based Venture Capital Firm
  • Built the data warehousing business from scratch. Hired a team of data engineers, business analysts, data analysts, and business intelligence analysts.
  • Led the development of 30+ data warehouses over three years with a small team of 20 people.
  • Oversaw the development of 1,000+ visualizations on multiple combinations of databases and BI tools, including Snowflake with Looker, Snowflake with Sigma Computing, Snowflake with Tableau, and BigQuery with Looker.
Technologies: Data Warehousing, Google BigQuery, PostgreSQL, Data Modeling, Data Architecture, Pandas, Data Pipelines, Program Management, Marketing Analytics, Cohort Analysis, Rundeck, GitHub Actions, Amazon MWS, Amazon API, REST APIs, Windows, Visual Studio Code (VS Code), Data Structures, Database Design, Database Schema Design, Integration, Fivetran, JSON, Data Analysis, Redshift, Microsoft Excel, Analytics, APIs, BigQuery, Google Cloud Platform (GCP), Amazon Web Services (AWS), Star Schema, Google Cloud, Google Cloud Functions, Cloud Tasks, Cloud Run, Apache Airflow, AWS Glue, ELT, ETL Tools, Data Aggregation, Data Visualization, Business Intelligence (BI), Google Analytics, Google Analytics API, Data Reporting, Dashboards, Direct to Consumer (D2C), Looker, Database Modeling, Consolidation, Minimum Viable Product (MVP), Scraping, Web Crawlers, NumPy, Amazon RDS, Google Analytics 4, ETL Pipelines, Cloud, Big Data, Database Migration, Databases

Software Engineer

2017 - 2019
Freelance
  • Developed a game theory optimal (GTO) poker trainer for a professional poker player.
  • Expanded the trainer from handling only heads-up Texas hold'em to also cover 6-max Texas hold'em and PLO4.
  • Created and delivered the MVP with relevant documentation to the client in 10 weeks.
Technologies: PokerTracker 4, Poker, Microsoft Excel, APIs, Google Cloud Platform (GCP), Google Cloud, Google Cloud Functions, Cloud Tasks, Cloud Run, ELT, ETL Tools, Data Aggregation, Google Analytics, Google Analytics API, Dashboards, Database Modeling, NumPy, Cloud

Senior Consultant

2012 - 2017
Infosys
  • Acted as the program manager for a team of 50 process mapping experts tasked with mapping 11,000 BAU processes. The project had a twofold goal: meeting regulatory requirements and optimizing processes. Delivered the project on time and 30% under budget.
  • Helped a large US-based hedge fund prevent front running by designing a system for a large-scale data obfuscation project, which transformed 100 TB of production data into 100 TB of obfuscated data that analysts and developers could work with.
  • Earned four promotions in three years and was on the fast track to becoming a partner at Infosys Consulting.
Technologies: Insurance, Banking & Finance, Program Management, Data Architecture, ETL, Data Analytics, Process Design, Data Warehousing, Data Modeling, Pandas, Automation, Data Pipelines, Cohort Analysis, Analytics, GitHub Actions, REST APIs, C, C++, Image Processing, Visual Studio Code (VS Code), Marketing, Finance, Data Structures, JSON, Microsoft Excel, Business Intelligence (BI), Google Analytics, Data Reporting, Database Modeling, Scraping, Web Crawlers, Cloud

Senior Software Engineer

2007 - 2010
Newgen Software Technologies Limited
  • Created an artificial neural network-based image analysis software to automatically identify fraudulent signatures on cheques and documents. The system had a successful identification rate of over 80% and a false positive error rate of less than 2%.
  • Ported the organization's image processing library from C (32-bit) to C++ (64-bit), C# (64-bit), and Python.
  • Filed for eight patents in image processing, artificial neural networks, and document security areas.
Technologies: C, C++, Data Structures, Python, Artificial Neural Networks (ANN), MATLAB, C#, JSON, Microsoft Excel, Data Reporting, Scraping, Web Crawlers

Data Pipelines for a Snowflake Data Warehouse

The project included creating 30 data pipelines for a US-based D2C eCommerce client. I replaced Fivetran.com as their pipeline provider, reducing their monthly data pipeline cost from $10,000 to $1,500. I also created pipelines for data sources not supported by Fivetran.

Shopify Support for DHL

Shopify did not support pulling order status from DHL. Therefore, I created a reverse ETL pipeline that would pull the tracking information from the customer database and push it back to Shopify. It enabled Shopify to send emails about the order status update and show order tracking information to the customer.

Poker Hand Evaluator

The aim was to create software that would apply game theory to optimize play in different situations for poker players. This project was purpose-built for a high-stakes poker player who regularly plays in Las Vegas. The software was integrated with PokerTracker 4.

Product Marketing Dashboard

I created a cross-platform dashboard, allowing the client to see product-specific marketing spends and metrics across marketing platforms such as Google, Facebook, Pinterest, TikTok, and Snapchat. This report enabled the CMO to drive higher efficiency for every marketing dollar spent.

Logistics Auditing Function

I set up a system to identify and report potential overcharges on the freight costs applied to orders. It worked by codifying the rate cards of various providers, calculating the billed cost for each order, and comparing it with the contracted cost.
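
As a rough illustration of the comparison described above, the following Python sketch codifies a rate card as a lookup table and flags orders whose billed cost exceeds the contracted cost. The rate-card structure, carrier names, weight tiers, prices, field names, and tolerance are all hypothetical and are not taken from the actual system.

```python
from dataclasses import dataclass

# Hypothetical rate cards: carrier -> list of (max weight in kg, contracted price).
# Tiers are listed in ascending order of weight.
RATE_CARD = {
    "carrier_a": [(1.0, 5.00), (5.0, 9.50), (20.0, 18.00)],
    "carrier_b": [(2.0, 6.25), (10.0, 12.00)],
}


@dataclass
class Shipment:
    order_id: str
    carrier: str
    weight_kg: float
    billed_cost: float


def contracted_cost(carrier, weight_kg):
    """Return the price of the first weight tier that fits, or None if no tier applies."""
    for max_weight, price in RATE_CARD.get(carrier, []):
        if weight_kg <= max_weight:
            return price
    return None


def find_overcharges(shipments, tolerance=0.01):
    """Return (order_id, billed, contracted) for shipments billed above the contract."""
    flagged = []
    for s in shipments:
        expected = contracted_cost(s.carrier, s.weight_kg)
        # Skip shipments with no matching rate; they would need manual review.
        if expected is not None and s.billed_cost - expected > tolerance:
            flagged.append((s.order_id, s.billed_cost, expected))
    return flagged
```

A real rate card would likely also depend on zones, surcharges, and service levels; the single weight dimension here is only meant to show the shape of the check.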

Languages

SQL, Snowflake, Python, C, C++, C#

Libraries/APIs

Amazon MWS, Amazon API, Google Analytics API, NumPy, REST APIs, Pandas

Tools

BigQuery, Microsoft Excel, Rundeck, MATLAB, GitHub, Apache Airflow, Google Analytics, Looker, AWS Glue

Paradigms

Automation, ETL, Database Design, Business Intelligence (BI)

Platforms

Google Cloud Platform (GCP), Windows, Visual Studio Code (VS Code), Amazon Web Services (AWS), Cloud Run

Storage

Data Pipelines, JSON, Database Modeling, Database Migration, Databases, PostgreSQL, Redshift, Google Cloud

Other

Data Engineering, Google BigQuery, Data Build Tool (dbt), Data Modeling, Data Architecture, Program Management, Data Structures, Data Warehousing, Database Schema Design, Reporting, Integration, Data Analysis, APIs, ELT, ETL Tools, Data Aggregation, Data Visualization, Data Reporting, Dashboards, Direct to Consumer (D2C), Data Warehouse Design, Scraping, Web Crawlers, Amazon RDS, Google Analytics 4, ETL Pipelines, eCommerce, Web Scraping, Data Analytics, Marketing Analytics, Cohort Analysis, GitHub Actions, Image Processing, Analytics, Game Theory, PokerTracker 4, Fivetran, Star Schema, Google Cloud Functions, Cloud Tasks, Consolidation, Cloud, Big Data, Neural Networks, Finance, Structured Finance, Process Design, Artificial Neural Networks (ANN), Poker, Minimum Viable Product (MVP)

Industry Expertise

Marketing, Insurance, Banking & Finance

2010 - 2012

Master's Degree in General Business Administration (MBA)

Indian Institute of Management Calcutta - Calcutta, India

2003 - 2007

Bachelor's Degree in Computer Science

Delhi College of Engineering - Delhi, India
