Harish Chander Ramesh

Verified Expert in Engineering

Data Engineer and Developer

Dubai, United Arab Emirates

Toptal member since April 22, 2022

Expertise

Cloud Engineering Data Migration Dashboard Data Visualization AWS RDS Data Warehouse Data Engineering Serverless Big Data Architecture Data Analysis Software Development Database Amazon S3 GitHub Apache Airflow ETL Streamlit Development

Bio

Harish is a data engineer who has been consuming, engineering, analyzing, exploring, testing, and visualizing data for personal and professional purposes for the last ten years. His passion for data has led him to work with multiple Fortune 50 organizations, including Amazon and Verizon. Harish loves challenges and believes he can learn and deliver best when out of his comfort zone.

Portfolio

Guardian Service Holdings LLC

API Integration, Google BigQuery, Data Lakes, Data Build Tool (dbt), Python...

United Talent Agency - Main

Data Engineering, Azure, Snowflake, Spark, Hadoop, Azure Machine Learning...

MH Alshaya

Apache Airflow, Apache Spark, Google Cloud Platform (GCP), Google Analytics...

Experience

BI Reporting - 10 years
SQL - 9 years
Apache Spark - 8 years
Tableau - 8 years
Python - 7 years
Apache Airflow - 6 years
Google Cloud Platform (GCP) - 5 years
Microsoft Power BI - 4 years

Preferred Environment

Google Cloud Platform (GCP), Tableau, Microsoft Power BI, SQL, ETL, Business Intelligence (BI), Data Visualization, Amazon Web Services (AWS), Google BigQuery, Azure SQL Databases, Data Engineering, AWS Data Pipeline Service, Data Management, Collibra, Informatica Cloud, Informatica ETL, Informatica, Oracle, JavaScript, Data Architecture, Excel 365, CSV File Processing, Excel VBA, Data Extraction, MySQL, Real-time Data

The most amazing...

...data platform I've built from scratch is for a video conferencing app, which managed to have no downtime despite the 600% usage increase during the pandemic.

Work Experience

BigQuery Data Analyst

2025 - 2025

Guardian Service Holdings LLC

Designed and implemented API-driven data pipelines integrating Vertafore, AgencyZoom, AMS 360, and PL Rating into a centralized BigQuery data lake, enabling unified analytics across core insurance systems.
Built and optimized BigQuery data models to transform raw operational data into analytics-ready schemas, improving query performance and enabling faster business reporting.
Implemented end-to-end data ingestion workflows using Python and SQL, ensuring reliable, scalable data flow from multiple third-party systems into Google Cloud.
Enabled marketing and web funnel analytics by tagging, transforming, and dispatching event data into the data lake and Google Analytics for downstream analysis.
Collaborated closely with non-technical stakeholders to translate business requirements into data models, dashboards, and actionable insights.
Created an open-source customer ID generation project end to end.

Technologies: API Integration, Google BigQuery, Data Lakes, Data Build Tool (dbt), Python, SPARQL

Data Engineer and Architect

2023 - 2024

United Talent Agency - Main

Designed and implemented a visualization tool for monitoring queries across all environments, enabling the early identification and resolution of potential issues, which improved system reliability by 30% and optimized query performance by 25%.
Created an automated service that effectively detects and resolves data quality issues throughout the development stages, leading to a 50% decrease in incidents and ensuring high data integrity and trustworthiness in the data lake project.
Established a robust testing platform that identified reliability issues during the pre-production stages, enhancing the overall system stability and reducing downtime by 20% before full-scale deployment.
Led a team of data engineers in identifying and addressing infrastructure gaps through the development of automated solutions, which streamlined operations and increased the team's productivity by 35%.
Contributed significantly to the design, development, and maintenance of existing data warehousing and data lake projects.
Developed and deployed a comprehensive framework for the data engineering team, significantly enhancing feature impact analysis and ensuring thorough testing before deployment, resulting in a 40% reduction in customer disruptions due to releases.
Architected and executed a scalable data lake solution in Azure, integrating Snowflake, dbt, and Spark to support advanced analytics and machine learning projects, which increased data accessibility by 50% and reduced data processing time by 40%.
Pioneered the use of machine learning tools and frameworks to automate data quality checks and anomaly detection, reducing manual data verification efforts by 70% and improving data accuracy for downstream analytics and ML model training.
Implemented a CI/CD pipeline for seamless integration and delivery of data engineering and ML projects, which accelerated deployment cycles by 50% and fostered a culture of continuous improvement and innovation within the data engineering team.

Technologies: Data Engineering, Azure, Snowflake, Spark, Hadoop, Azure Machine Learning, Data Architecture, Data Build Tool (dbt), Python, Orchestration, Data Processing, DevOps, Infrastructure as Code (IaC), Query Optimization, English, Data Cleaning, Cloud Dataflow, Metabase, GitHub Actions, REST APIs, Amazon Athena, Streamlit, Distributed Systems, Looker Studio, Looker, Dashboard Design, Power Query, Business Analysis, Google Analytics 4 (GA4), Marketing, Data Strategy, Azure Data Lake, Performance Tuning, Sharding, Architecture, AWS IoT, Data Transformation, Sales, R, Pandas, Cloud Data Fusion, SPARQL

Data Engineer Manager

2021 - 2022

MH Alshaya

Developed the first-ever data warehouse from scratch, incorporating product analytics at scale, using various GCP services.
Developed the real-time Golden Customer Record, extending the loyalty program across 119 brands in 19 countries.
Developed and maintained a data quality framework with the help of the entire business team in-house, using Great Expectations at scale. This was also used in fraud analytics across 50+ brands in near real-time.
Led a team of six data engineers, the first set of data engineers in the organization, and started up a data-driven culture within the team.

Technologies: Apache Airflow, Apache Spark, Google Cloud Platform (GCP), Google Analytics, Tableau, ETL, Dashboards, Data Visualization, Amazon EC2, Amazon RDS, Databases, Redshift, Apache Flink, Amazon S3 (AWS S3), Data Pipelines, Spark, Apache Kafka, Data Warehouse Design, Data Lake Design, Big Data Architecture, Data Warehousing, Data Lakes, Cloud Native, Data Engineering, Google BigQuery, Data Modeling, Analytics, Google Cloud, Data Analysis, Data Analytics, Data Science, Terraform, Data Governance, Azure, PostgreSQL, Cloud Platforms, Parquet, BigQuery, Database Schema Design, Data Management, Azure Synapse, Collibra, Informatica Cloud, Informatica ETL, Informatica, Ads, User Interface (UI), Excel 2016, Data Architecture, Data Quality, Great Expectations Cloud, AWS Glue, Oracle Cloud, Excel 365, Office 365, CSV File Processing, MongoDB, ETL Implementation & Design, Data Migration, Finance, Mobile Analytics, Firebase, Data Extraction, Amazon Web Services (AWS), ELT, Database Architecture, Database Performance, Database Development, AWS Lambda, Docker, Microservices, Technical Architecture, ETL Tools, Monitoring, Cloud, Databricks, GitHub, NoSQL, Git, Pub/Sub, Warehouses, Machine Learning, BI Reporting, Amazon Aurora, Amazon CloudWatch, Web Analytics, ClickStream, Real-time Data, Kubernetes, Orchestration, Stitch Data, Data Processing, DevOps, Infrastructure as Code (IaC), Azure Kubernetes Service (AKS), Query Optimization, English, Data Cleaning, Cloud Dataflow, Metabase, GitHub Actions, DocumentDB, APIs, Matillion ETL for Redshift, REST APIs, Amazon Athena, Streamlit, Microsoft Power BI, Microsoft Fabric, Distributed Systems, Looker Studio, Looker Modeling Language (LookML), Dashboard Design, Power Query, Business Analysis, Google Analytics 4 (GA4), Marketing, Data Strategy, Performance Tuning, Sharding, Solution Architecture, Architecture, AWS IoT, Data Transformation, Sales, R, Pandas, API Integration, Cloud Data Fusion, SPARQL

Lead Data Engineer

2019 - 2021

Verizon Media

Developed the first streaming analytics platform to handle media stats from videoconferencing solutions using Apache Spark and Storm on AWS-managed services.
Built a self-autoscaling data pipeline that absorbed the 600% increase in daily usage volume during the COVID-19 pandemic, driven by clients' teams moving to remote work, without service impact.
Tested and implemented Apache Hudi at an early stage of its development, which also enabled ACID transactions on historical data.
Led a team of seven data engineers, including three seniors, two juniors, and one intern. Created opportunities to interact with large clients worldwide on technical solution consultation and solution architecting.
Migrated a live 2.2 PB legacy PostgreSQL database to Snowflake in five days, using dbt as part of the process. Designed, implemented, and validated the migration on the fly with the help of an error-reporting framework, finishing with a 0.3% error rate.

Technologies: Apache Airflow, Apache Spark, Python, Tableau, ELK (Elastic Stack), Datadog, Kafka Streams, ETL, Dashboards, Data Visualization, Amazon EC2, Amazon RDS, Databases, Redshift, Storm, Apache Flink, Amazon S3 (AWS S3), Data Pipelines, Amazon Web Services (AWS), Spark, Big Data, Apache Kafka, Data Warehouse Design, Data Lake Design, Spark Streaming, Big Data Architecture, Data Warehousing, PySpark, Data Lakes, Cloud Native, Data Engineering, Google BigQuery, Data Modeling, Looker, Analytics, Google Cloud, Data Analysis, Snowflake, Data Analytics, Data Governance, Azure, PostgreSQL, pgAdmin, Data Build Tool (dbt), Cloud Platforms, Parquet, BigQuery, AWS Data Pipeline Service, Django, Database Schema Design, Data Management, Azure Synapse, Collibra, Informatica Cloud, Informatica ETL, Informatica, Amazon QuickSight, Ads, User Interface (UI), Excel 2016, JavaScript, Data Architecture, Data Quality, Great Expectations Cloud, AWS Glue, Oracle Cloud, Excel 365, Office 365, CSV File Processing, MongoDB, ETL Implementation & Design, Microsoft SQL Server, Data Migration, Finance, Mobile Analytics, Firebase, Data Extraction, MySQL, ELT, Database Architecture, Database Performance, Database Development, AWS Lambda, AWS CloudFormation, Docker, Technical Architecture, ETL Tools, Monitoring, Cloud, Databricks, Delta Lake, GitHub, NoSQL, Linux, Git, Apache Beam, Pub/Sub, Warehouses, Machine Learning, BI Reporting, Amazon Aurora, Amazon CloudWatch, Web Analytics, Google Analytics, ClickStream, Social Media Web Traffic, Kubernetes, Orchestration, Data Processing, DevOps, Infrastructure as Code (IaC), Query Optimization, English, Data Cleaning, Cloud Dataflow, GitHub Actions, DocumentDB, REST APIs, Amazon Athena, Streamlit, Microsoft Power BI, Microsoft Fabric, Distributed Systems, Looker Studio, Looker Modeling Language (LookML), Dashboard Design, Power Query, Business Analysis, Google Analytics 4 (GA4), Marketing, Performance Tuning, Sharding, Solution Architecture, Architecture, Data Transformation, Pandas, API Integration, SPARQL

Data Engineer

2016 - 2018

Amazon

Contributed to the world's largest eCommerce platform, covering 16 marketplaces across the globe in different time zones. I was part of the retail business team that handled worldwide retail business data management and pipelines.
Handled high-pressure environments and met tight deadlines. Worked alongside the best minds in the country and the world, and initiated a data engineer forum within the organization for cross-pollination of ideas.
Built real-time pipelines to stream data from different platforms to the Amazon data warehouse with a service-level agreement (SLA) of a 2-minute time delay using Spark, Flink, and Tableau.
Created a 360-degree dashboard with perspectives on Amazon's customers across different Amazon services. The dashboard was made public on a forum and gained massive popularity for the ease of data understanding by consumers.

Technologies: Apache Airflow, Apache Spark, Tableau, ETL, Dashboards, Data Visualization, Amazon EC2, Databases, Redshift, Storm, Apache Flink, Amazon S3 (AWS S3), Data Pipelines, Amazon Web Services (AWS), Spark, Big Data, Apache Kafka, Data Warehouse Design, Data Lake Design, Spark Streaming, Big Data Architecture, Data Warehousing, PySpark, Data Lakes, Cloud Native, Data Engineering, Google BigQuery, Data Modeling, Looker, Data Analysis, Data Analytics, Cloud Platforms, BigQuery, Azure SQL Databases, AWS Data Pipeline Service, Django, Data Management, Amazon QuickSight, Ads, Oracle, Data Architecture, Data Quality, Great Expectations Cloud, AWS Glue, Excel 365, Office 365, CSV File Processing, MongoDB, ETL Implementation & Design, Amazon Elastic Container Service (ECS), Microsoft SQL Server, Data Migration, Mobile Analytics, Firebase, Data Extraction, MySQL, ELT, Hadoop, Database Performance, Database Development, AWS Lambda, AWS CloudFormation, Technical Architecture, ETL Tools, Monitoring, Cloud, Databricks, Delta Lake, GitHub, NoSQL, Linux, Git, Apache Beam, Pub/Sub, Amazon EMR Studio, Warehouses, BI Reporting, Amazon Aurora, Amazon CloudWatch, Google Analytics, ClickStream, Social Media Web Traffic, Real-time Data, Orchestration, Data Processing, DevOps, Infrastructure as Code (IaC), Query Optimization, English, Data Cleaning, GitHub Actions, DocumentDB, REST APIs, Amazon Athena, Distributed Systems, Looker Studio, Marketing, Sharding, Pandas

Data Engineer

2013 - 2016

NTT Data

Developed, tested, and deployed end-to-end real-time and batch ETL pipelines for a healthcare provider.
Documented every line of code and changes to the existing product from a business standpoint.
Learned new technologies with an open-minded approach and grew as a technology-agnostic developer.
Developed two major data warehouse-related projects that saved 23% of data storage cost and 26.5% of maintenance cost.

Technologies: Abinitio, SQL, Teradata, Amazon RDS, Amazon EC2, Databases, Amazon S3 (AWS S3), Data Pipelines, Amazon Web Services (AWS), Big Data, Data Warehousing, PySpark, Data Engineering, Data Analysis, Snowflake, Microsoft Access, Cloud Platforms, BigQuery, Azure SQL Databases, AWS Data Pipeline Service, Data Management, Azure Synapse, Informatica Cloud, Informatica ETL, Informatica, Amazon QuickSight, Oracle, Excel 2016, Data Architecture, Data Quality, Oracle Cloud, Excel 365, Office 365, CSV File Processing, ETL Implementation & Design, Microsoft SQL Server, Data Migration, Data Extraction, ELT, Hadoop, Database Development, AWS Lambda, ETL Tools, Cloud, Databricks, Delta Lake, GitHub, NoSQL, Linux, Git, Apache Beam, Pub/Sub, Warehouses, BI Reporting, Amazon CloudWatch, Social Media Web Traffic, Orchestration, Data Processing, Query Optimization, English, Data Cleaning, GitHub Actions, REST APIs, Amazon Athena, Distributed Systems, Data Transformation, Pandas

Experience

Competitive Price Monitoring System for eCommerce Business

The data framework scrapes multiple eCommerce websites according to their super-competitiveness, an index that categorizes competitors for each product category and determines whether a competitor's website is scraped one, two, or three times a day. The scraper writes its output to a data warehouse, where prices are compared at the product-to-product level in real time to generate a price competitiveness index (PCI). The PCI measures whether the eCommerce business's products are competitive compared to its super important and important competitors.
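
As a rough illustration of the product-level comparison, the sketch below computes a PCI as the weighted average competitor price relative to our own price. The formula, the tier weights, and every name in it are assumptions made for illustration; the project description does not say how the PCI was actually calculated.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical weights: "super important" competitors count more than "important" ones.
TIER_WEIGHTS = {"super_important": 2.0, "important": 1.0}


@dataclass(frozen=True)
class CompetitorPrice:
    competitor: str
    tier: str     # key into TIER_WEIGHTS
    price: float  # scraped price for the matched product


def price_competitiveness_index(own_price: float, competitor_prices: list[CompetitorPrice]) -> float:
    """Return 100 * weighted mean competitor price / own price.

    Under this assumed definition, a value above 100 means our price is
    below the weighted competitor average, and a value below 100 means
    it is above.
    """
    if own_price <= 0:
        raise ValueError("own_price must be positive")
    total_weight = sum(TIER_WEIGHTS[c.tier] for c in competitor_prices)
    if total_weight == 0:
        raise ValueError("no competitor prices to compare against")
    weighted_mean = sum(TIER_WEIGHTS[c.tier] * c.price for c in competitor_prices) / total_weight
    return 100.0 * weighted_mean / own_price
```

A ratio keeps the index comparable across product categories with very different price levels, which is one reason to prefer it over an absolute price difference.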

Sub-3-Second Fraud Detection Pipeline for a Hyperscale Video Conferencing Platform

Designed and built the streaming fraud-detection layer for a video conferencing platform during its 600% pandemic usage surge. Hijacked meeting IDs (Zoom-bombing-class attacks) were the number-one trust issue; the existing transactional system had no way to catch them in flight.

Shipped an end-to-end streaming pipeline that flags fraudulent join attempts in under three seconds, well inside the window where a meeting host can be alerted and act. Architecture: Apache Kafka for event ingestion, Apache Storm for low-latency stream processing, MemSQL (now SingleStore) as the hot store for sub-second lookups, Python for rule and ML signal evaluation, and Looker for trust-and-safety analyst tooling.
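
A minimal sketch of what one Python rule in such a pipeline might look like, assuming a hypothetical signal: several distinct sources failing to join the same meeting within a short window. The rule, the thresholds, and the event fields are illustrative assumptions, and the Kafka, Storm, and MemSQL layers are left out entirely.

```python
from collections import defaultdict, deque


class JoinAttemptMonitor:
    """Flags a meeting when too many distinct sources fail to join within a short window."""

    def __init__(self, window_seconds: float = 3.0, max_failed_sources: int = 3):
        self.window_seconds = window_seconds
        self.max_failed_sources = max_failed_sources
        # meeting_id -> deque of (timestamp, source_ip) for failed attempts
        self._failures = defaultdict(deque)

    def observe(self, meeting_id: str, source_ip: str, timestamp: float, succeeded: bool) -> bool:
        """Record one join event; return True if the meeting should be flagged."""
        if succeeded:
            return False
        window = self._failures[meeting_id]
        window.append((timestamp, source_ip))
        # Drop failures that have aged out of the window.
        while window and timestamp - window[0][0] > self.window_seconds:
            window.popleft()
        distinct_sources = {ip for _, ip in window}
        return len(distinct_sources) >= self.max_failed_sources
```

Keeping the per-meeting state in a bounded deque means each event is handled in time proportional to the window size, which is the property a sub-three-second budget depends on.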

Deliberately chose an open-source stack to avoid vendor lock-in at the data layer. The same primitives were then reused for other real-time signals across the platform.

The pipeline ran with zero downtime through a 600% increase in traffic, processing millions of join events per day. It delivered a measurable reduction in fraudulent meeting incident reports to the trust-and-safety team.

Real-time Driver Incentives Platform for a Regional Ride Hailing Operator

I architected a real-time computation platform that calculates each driver's target-versus-actual delivery performance, awards instant bonuses, and pushes the live progress view directly into the driver's mobile app. Before this, drivers learned their numbers and incentives days late, which was killing engagement and target attainment.

It was built on Elasticsearch and Logstash from the ELK stack, with Grafana for visualization, running on Google Cloud Platform, with the dashboard embedded directly into the driver app so the driver can see targets, current progress, earned incentives, and what is still possible, all updated in near real time.

I designed the incentive rules engine to let the operations team change target structures without code changes, so promotion experiments could ship in days rather than sprints. This shifted the relationship between the data team and ops from a ticket-based model to a self-service model.
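
One way to let operations change targets without code changes is to keep the rules as data. The sketch below assumes a hypothetical JSON rule format with tiered delivery targets; the field names, tiers, and bonus amounts are invented for illustration and are not taken from the actual platform.

```python
import json

# Hypothetical rule document that an operations team could edit without a code change.
RULES_JSON = """
[
  {"name": "bronze", "min_deliveries": 10, "bonus": 5.0},
  {"name": "silver", "min_deliveries": 20, "bonus": 12.0},
  {"name": "gold",   "min_deliveries": 30, "bonus": 25.0}
]
"""


def load_rules(raw: str) -> list[dict]:
    """Parse rules and sort them by ascending delivery target."""
    return sorted(json.loads(raw), key=lambda rule: rule["min_deliveries"])


def driver_progress(deliveries: int, rules: list[dict]) -> dict:
    """Return the bonus earned so far and the next target still reachable."""
    earned = [r for r in rules if deliveries >= r["min_deliveries"]]
    upcoming = [r for r in rules if deliveries < r["min_deliveries"]]
    return {
        "earned_bonus": sum(r["bonus"] for r in earned),
        "tiers_reached": [r["name"] for r in earned],
        "next_tier": upcoming[0]["name"] if upcoming else None,
        "deliveries_to_next": upcoming[0]["min_deliveries"] - deliveries if upcoming else 0,
    }
```

Because the evaluation code never names a tier, adding or retuning a promotion only means editing the rule document.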

Impact: target attainment and daily active drivers improved measurably across the first two quarters, and the operations team ran far more incentive experiments per quarter than under the old reporting cadence.

2.2 PB Live PostgreSQL to Snowflake Migration

I led the migration of a 2.2 petabyte live PostgreSQL database to Snowflake, using dbt for the transformation layer, and completed it in five days with the source system remaining online throughout. I designed the cutover plan, the dbt-based modeling layer on the target, and an on-the-fly error-reporting framework that validated row-level fidelity during the move.

The final error rate landed at 0.3%, reconciled and resolved before the legacy system was decommissioned. The new Snowflake environment materially reduced downstream query latency and gave analytics and ML teams an ACID-compliant historical store for the first time, layered on top of Apache Hudi, which was still pre-1.0 at the time.

The architectural call: keep PostgreSQL writeable during migration, stream the delta, validate each batch against a hash-checksum, and cut over only once the error report falls below the threshold. That decision is why the business experienced zero downtime on the database underpinning a hyperscale video conferencing platform during its 600 percent surge in pandemic usage.
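
The batch validation and cutover gate described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the framework that was used: batches are modeled as in-memory lists of row tuples, SHA-256 stands in for whatever checksum was chosen, and the function names and threshold semantics (at or below) are hypothetical.

```python
import hashlib


def batch_checksum(rows: list[tuple]) -> str:
    """Order-independent checksum: hash each row, then hash the sorted row digests."""
    row_digests = sorted(
        hashlib.sha256(repr(row).encode("utf-8")).hexdigest() for row in rows
    )
    return hashlib.sha256("".join(row_digests).encode("utf-8")).hexdigest()


def mismatched_batches(source_batches: dict, target_batches: dict) -> list:
    """Return the ids of batches whose target checksum differs from the source."""
    return [
        batch_id
        for batch_id, rows in source_batches.items()
        if batch_checksum(rows) != batch_checksum(target_batches.get(batch_id, []))
    ]


def ready_for_cutover(source_batches: dict, target_batches: dict, max_error_rate: float) -> bool:
    """Allow cutover only when the share of mismatched batches is at or below the threshold."""
    if not source_batches:
        return False
    error_rate = len(mismatched_batches(source_batches, target_batches)) / len(source_batches)
    return error_rate <= max_error_rate
```

Sorting the row digests makes the checksum independent of row order, which matters because the source and target systems are unlikely to return rows in the same order.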

Enterprise Data Reliability Platform at United Talent Agency

As a Data Engineer and Architect at United Talent Agency, I led a team of 6-10 engineers to design and run the data reliability program for the agency's analytics and ML platform. Three connected workstreams: a cross-environment query monitoring and visualization tool, an automated data-quality detection-and-remediation service, and a pre-production testing platform.

Together, they delivered: 50% reduction in data-quality incidents, 30% lift in system reliability, 25% improvement in query performance, 20% reduction in pre-production downtime, and a 40% drop in customer-visible disruptions from releases, all measured against pre-program baselines.

On the platform side, I architected a scalable data lake on Azure, integrating Snowflake, dbt, and Spark to support advanced analytics and ML, increasing data accessibility by 50% and cutting processing time by 40%. Layered on top: ML-based automated data-quality checks and anomaly detection that reduced manual verification effort by 70%, and a CI/CD pipeline for data and ML projects that accelerated deployment cycles by 50%.
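
For a sense of what an automated anomaly check does, here is a deliberately simple statistical stand-in: a z-score test on a metric such as daily row count. It is not the ML-based approach referred to above, whose details are not given here; the function name and threshold are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], latest: float, z_threshold: float = 3.0) -> bool:
    """Flag the latest value when it lies more than z_threshold standard deviations from the historical mean."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    spread = stdev(history)
    if spread == 0:
        # A perfectly constant history: any change at all is unusual.
        return latest != history[0]
    return abs(latest - mean(history)) / spread > z_threshold
```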

The leadership angle: team productivity rose 35% once the automated infrastructure was in place. Engineers stopped firefighting and started shipping.

Sub-2-Minute Real-time Pipelines Across 16 Amazon Retail Marketplaces

I built and operated real-time data pipelines that stream retail business data into the Amazon data warehouse, meeting a 2-minute SLA and covering 16 marketplaces across global time zones. Stack: Apache Spark and Apache Flink for the streaming layer, Amazon's internal warehouse on the storage side, and Tableau for the analytics surface.
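
A minimal sketch of how a 2-minute SLA like this could be monitored, assuming each event carries a source timestamp and a warehouse load timestamp. The field names (`marketplace`, `event_at`, `loaded_at`) are hypothetical, and in practice both timestamps would need to be in the same time zone, typically UTC.

```python
from datetime import datetime, timedelta

SLA = timedelta(minutes=2)  # the 2-minute delay stated above


def sla_breaches(events: list[dict], sla: timedelta = SLA) -> list[dict]:
    """Return events whose warehouse load lagged the source event by more than the SLA."""
    return [e for e in events if e["loaded_at"] - e["event_at"] > sla]


def breach_rate_by_marketplace(events: list[dict], sla: timedelta = SLA) -> dict[str, float]:
    """Share of events per marketplace that missed the SLA."""
    totals, breaches = {}, {}
    for e in events:
        key = e["marketplace"]
        totals[key] = totals.get(key, 0) + 1
        if e["loaded_at"] - e["event_at"] > sla:
            breaches[key] = breaches.get(key, 0) + 1
    return {key: breaches.get(key, 0) / totals[key] for key in totals}
```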

I owned the architecture and SLA enforcement across regions, including the on-call rotation, the schema-evolution path for upstream changes, and the data contracts with retail business teams.

In addition to the pipelines, I designed a 360-degree customer dashboard that gives leaders cross-service visibility into Amazon customers. It was shared on Amazon's internal forum and adopted by teams well outside retail because of how cleanly it surfaced cross-product behavior, one of the first such dashboards inside the organization.

I founded an internal Data Engineer Forum to cross-pollinate ideas across teams. Small thing, but the kind of org-level move that gets noticed at a company that size.

Real-time Golden Customer Record and Fraud Analytics Across 50+ Retail Brands

I was hired as the first data engineering leader at a 119-brand, 19-country retail conglomerate. I built the first enterprise data warehouse from scratch on Google Cloud Platform, including product analytics at retail scale.

On top of the warehouse, I shipped the real-time Golden Customer Record that extended the group's loyalty program across all 119 brands, unifying identity, transaction, and engagement signals into a single record consumed in real time by the loyalty engine, CRM, and merchandising teams.
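
To illustrate the idea of a golden record, the sketch below merges per-brand customer records into a single record per customer. The matching key (normalized email), the survivorship rule (latest non-empty value wins), and the field names are assumptions for illustration; the actual identity resolution logic is not described here.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    return email.strip().lower()


def build_golden_records(records: list[dict]) -> dict[str, dict]:
    """Merge per-brand customer records into one record per normalized email.

    For each attribute, the most recently updated non-empty value wins.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[normalize_email(record["email"])].append(record)

    golden = {}
    for email, group in grouped.items():
        merged = {"email": email, "brands": sorted({r["brand"] for r in group})}
        # Apply oldest first so newer non-empty values overwrite older ones.
        for record in sorted(group, key=lambda r: r["updated_at"]):
            for field in ("name", "phone"):
                if record.get(field):
                    merged[field] = record[field]
        golden[email] = merged
    return golden
```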

I designed and rolled out a data quality framework using Great Expectations at scale, co-built with the business teams, ensuring the rules captured real domain logic rather than engineering assumptions. The same framework powered near-real-time fraud analytics across 50+ brands.
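
The sketch below shows the general shape of declarative, business-readable data quality rules in plain Python. It does not use the Great Expectations API; the rule descriptions, column names, and the `validate` helper are hypothetical examples of the kind of domain logic a business team might supply.

```python
from typing import Callable

# Each expectation is (description, column, predicate). Hypothetical examples only.
Expectation = tuple[str, str, Callable[[object], bool]]

EXPECTATIONS: list[Expectation] = [
    ("order amount is non-negative", "amount", lambda v: isinstance(v, (int, float)) and v >= 0),
    ("country code has two letters", "country", lambda v: isinstance(v, str) and len(v) == 2),
    ("customer id is present", "customer_id", lambda v: v not in (None, "")),
]


def validate(rows: list[dict], expectations: list[Expectation]) -> dict[str, int]:
    """Return the number of failing rows per expectation description."""
    failures = {description: 0 for description, _, _ in expectations}
    for row in rows:
        for description, column, predicate in expectations:
            if not predicate(row.get(column)):
                failures[description] += 1
    return failures
```

Keying the report by a plain-language description is what lets non-engineers review which rule failed without reading code.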

Leadership: built the data engineering team from zero (first six hires) and established the data-driven culture inside an org that had previously been report-driven. The team I built continued to run the platform after I rolled off.

23% Storage and 26.5% Maintenance Cost Reduction on a Healthcare Data Warehouse

I designed and shipped two data-warehouse modernization projects for a US healthcare provider that delivered hard, measured infrastructure savings: 23% reduction in data storage costs and 26.5% reduction in maintenance costs, measured against the prior baseline and sustained over multiple billing cycles.

The work covered both real-time and batch ETL pipelines, with cost reductions driven by schema and partition redesign, implementation of retention-tier policies, replacing redundant pipelines with consolidated ones, and turning off cost centers that were running on autopilot.
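
A retention-tier policy can be as simple as mapping a partition's age to a storage class. The tiers and age limits below are invented for illustration; the actual policy used on the engagement is not specified here.

```python
from datetime import date

# Hypothetical tiers: (maximum age in days, tier name), checked in order.
RETENTION_TIERS = [(90, "hot"), (365, "warm"), (7 * 365, "cold")]


def retention_tier(partition_date: date, today: date) -> str:
    """Return the storage tier for a partition, or "expire" past the last tier."""
    age_days = (today - partition_date).days
    for max_age_days, tier in RETENTION_TIERS:
        if age_days <= max_age_days:
            return tier
    return "expire"
```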

The maintenance side was equally important. Every line of code and product change was documented from a business standpoint, so the platform stayed cheap to operate after I rolled off. The savings stuck.

This was an end-to-end engagement: requirements gathering with the business, technical design, implementation, parallel-run validation, and handover documentation. The combined storage and maintenance savings paid back the engagement in under two quarters.

Enterprise Data Quality, Governance, and Catalog Program

Across multiple enterprise engagements, I designed and rolled out the data quality, governance, and catalog layer that moves an organization from 'no one knows what data exists' to a published, searchable catalog with named owners on every critical dataset.

Technical anchors used repeatedly: Collibra as the catalog and stewardship surface, Informatica for ETL lineage and data quality enforcement, and Great Expectations for in-pipeline validation. The non-technical anchors matter more: defining the ownership model, the steward escalation path, the PII classification rules, and the policy for what gets cataloged versus what stays dark.
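
As an illustration of what PII classification rules might look like when expressed in code, the sketch below labels a column from its name or from sample values. The labels and regular expressions are simplified assumptions, far looser than a production classifier, and are not tied to Collibra or Informatica.

```python
import re
from typing import Optional

# Hypothetical classification rules: (label, column-name pattern, value pattern).
PII_RULES = [
    ("email", re.compile(r"e[-_]?mail", re.I), re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")),
    ("phone", re.compile(r"phone|mobile", re.I), re.compile(r"^\+?[\d\s().-]{7,20}$")),
]


def classify_column(name: str, sample_values: list[str]) -> Optional[str]:
    """Return a PII label if the column name or every non-empty sample value matches a rule."""
    values = [v for v in sample_values if v]
    for label, name_pattern, value_pattern in PII_RULES:
        if name_pattern.search(name):
            return label
        if values and all(value_pattern.match(v) for v in values):
            return label
    return None
```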

Outcome pattern: incident reduction in the 40-50% range (consistent with the 50% I delivered at United Talent Agency), and a measurable drop in the 'I can't find the data I need' friction that paralyzes most enterprise analytics teams.

The strategic value: governance is what separates senior data engineers from principals. Most individual contributors avoid it. Principals own it end-to-end and bring the business along.

Education

2009 - 2013

Bachelor of Engineering Degree in Electronics

Anna University - Chennai, India

Certifications

AUGUST 2024 - PRESENT

AWS Certified Solutions Architect

Amazon Web Services

JANUARY 2023 - PRESENT

Google Cloud Certified - Professional Data Engineer

Google Cloud

Skills

Libraries/APIs

REST APIs, Pandas, PySpark, Spark Streaming

Tools

Apache Airflow, Tableau, Microsoft Power BI, Abinitio, Kafka Streams, Google Analytics, Looker, BigQuery, Collibra, Informatica ETL, Excel 2016, AWS Glue, GitHub, Apache Beam, Amazon CloudWatch, Cloud Dataflow, Amazon Athena, Power Query, ELK (Elastic Stack), Microsoft Access, pgAdmin, Amazon QuickSight, Amazon Elastic Container Service (ECS), Amazon CloudFront CDN, AWS CloudFormation, Git, Stitch Data, Azure Kubernetes Service (AKS), Matillion ETL for Redshift, Apache Storm, Logstash, Grafana, Terraform, Azure Machine Learning

Languages

SQL, Python, Snowflake, Looker Modeling Language (LookML), R, SPARQL, JavaScript, Excel VBA

Frameworks

Apache Spark, Spark, Streamlit, Storm, Hadoop, Django

Paradigms

ETL, Business Intelligence (BI), ETL Implementation & Design, Database Development, DevOps, Application Architecture, Microservices

Platforms

Google Cloud Platform (GCP), Amazon EC2, Amazon Web Services (AWS), Azure, Firebase, AWS Lambda, Databricks, Linux, Kubernetes, Microsoft Fabric, AWS IoT, Apache Flink, Airbyte, Azure Synapse, Oracle, Docker, Apache Kafka, Cloud Native, Apache Hudi

Storage

Teradata, Redshift, Databases, Amazon S3 (AWS S3), Data Pipelines, Data Lake Design, PostgreSQL, Azure SQL Databases, AWS Data Pipeline Service, MongoDB, Microsoft SQL Server, Database Architecture, Database Performance, NoSQL, Amazon Aurora, Datadog, Data Lakes, Google Cloud, Oracle Cloud, MySQL, Cloud Firestore, MemSQL, Elasticsearch

Industry Expertise

Marketing

Other

Software, Dashboards, Data Visualization, Amazon RDS, Big Data, Data Warehouse Design, Data Warehousing, Data Engineering, Google BigQuery, Data Analysis, Data Build Tool (dbt), Cloud Platforms, Data Management, Informatica Cloud, Informatica, Data Architecture, Excel 365, Office 365, CSV File Processing, Data Migration, Data Extraction, ELT, Technical Architecture, ETL Tools, Cloud, Delta Lake, Pub/Sub, Azure Databricks, Warehouses, BI Reporting, Orchestration, Data Processing, Infrastructure as Code (IaC), Query Optimization, English, Data Cleaning, GitHub Actions, APIs, Reports, Distributed Systems, Looker Studio, Dashboard Design, Business Analysis, Google Analytics 4 (GA4), Data Strategy, Performance Tuning, Sharding, Serverless, Data Transformation, API Integration, Big Data Architecture, Data Modeling, Analytics, Data Analytics, Data Science, Data Governance, Parquet, Database Schema Design, Fivetran, TIBCO, Ads, Data Quality, Finance, Mobile Analytics, Monitoring, CI/CD Pipelines, Amazon EMR Studio, Web Analytics, Social Media Web Traffic, Real-time Data, Metabase, DocumentDB, SAP, Azure Data Lake, Solution Architecture, Architecture, Sales, Cloud Data Fusion, User Interface (UI), Great Expectations Cloud, Machine Learning, ClickStream, Amazon MQ
