Harish Chander Ramesh
Verified Expert in Engineering
Data Engineer and Developer
Dubai, United Arab Emirates
Toptal member since April 22, 2022
Harish is a data engineer who has been consuming, engineering, analyzing, exploring, testing, and visualizing data for personal and professional purposes for the last ten years. His passion for data has led him to work with multiple Fortune 50 organizations, including Amazon and Verizon. Harish loves challenges and believes he can learn and deliver best when out of his comfort zone.
Experience
- BI Reporting - 10 years
- SQL - 9 years
- Apache Spark - 8 years
- Tableau - 8 years
- Python - 7 years
- Apache Airflow - 6 years
- Google Cloud Platform (GCP) - 5 years
- Microsoft Power BI - 4 years
Preferred Environment
Google Cloud Platform (GCP), Tableau, Microsoft Power BI, SQL, ETL, Business Intelligence (BI), Data Visualization, Amazon Web Services (AWS), Google BigQuery, Azure SQL Databases, Data Engineering, AWS Data Pipeline Service, Data Management, Collibra, Informatica Cloud, Informatica ETL, Informatica, Oracle, JavaScript, Data Architecture, Excel 365, CSV File Processing, Excel VBA, Data Extraction, MySQL, Real-time Data
The most amazing...
...data platform I've built from scratch is for a video conferencing app, which managed to have no downtime despite the 600% usage increase during the pandemic.
Work Experience
Data Engineer and Architect
United Talent Agency - Main
- Designed and implemented a visualization tool for monitoring queries across all environments, enabling the early identification and resolution of potential issues, which improved system reliability by 30% and optimized query performance by 25%.
- Created an automated service that effectively detects and resolves data quality issues throughout the development stages, leading to a 50% decrease in incidents and ensuring high data integrity and trustworthiness in the data lake project.
- Established a robust testing platform that identified reliability issues during the pre-production stages, enhancing the overall system stability and reducing downtime by 20% before full-scale deployment.
- Led a team of data engineers in identifying and addressing infrastructure gaps through the development of automated solutions, which streamlined operations and increased the team's productivity by 35%.
- Contributed significantly to the design, development, and maintenance of existing data warehousing and data lake projects.
- Developed and deployed a comprehensive framework for the data engineering team, significantly enhancing feature impact analysis and ensuring thorough testing before deployment, resulting in a 40% reduction in customer disruptions due to releases.
- Architected and executed a scalable data lake solution in Azure, integrating Snowflake, DBT, and Spark to support advanced analytics and machine learning projects, which increased data accessibility by 50% and reduced data processing time by 40%.
- Pioneered the use of machine learning tools and frameworks to automate data quality checks and anomaly detection, reducing manual data verification efforts by 70% and improving data accuracy for downstream analytics and ML model training.
- Implemented a CI/CD pipeline for seamless integration and delivery of data engineering and ML projects, which accelerated deployment cycles by 50% and fostered a culture of continuous improvement and innovation within the data engineering team.
Data Engineer Manager
MH Alshaya
- Developed the organization's first data warehouse from scratch, incorporating product analytics at scale, using various GCP services.
- Developed the Golden Customer Record in real time, extending the loyalty program of 119 brands across 19 countries.
- Developed and maintained a data quality framework with the help of the entire business team in-house, using Great Expectations at scale. This was also used in fraud analytics across 50+ brands in near real-time.
- Led a team of six data engineers, the first set of data engineers in the organization, and started up a data-driven culture within the team.
Lead Data Engineer
Verizon Media
- Developed the first streaming analytics platform to handle media stats from videoconferencing solutions using Apache Spark and Storm on AWS-managed services.
- Built a self-autoscaling data pipeline that was not impacted by the COVID-19 pandemic despite a 600% increase in daily usage volume as clients’ teams moved to remote work.
- Tested and implemented Apache Hudi in its early stages of development, enabling ACID transactions on historical data.
- Led a team of seven data engineers, including three seniors, two juniors, and one intern. Created opportunities to interact with large clients worldwide on technical solution consultation and solution architecting.
- Migrated a live 2.2 PB legacy PostgreSQL database to Snowflake in five days, using DBT in the process. Designed, implemented, and validated the migration on the fly with the help of an error reporting framework, with an error rate of 0.3%.
Data Engineer
Amazon
- Contributed to the world's largest eCommerce platform, covering 16 marketplaces across the globe in different time zones. Was part of the retail business team that handled worldwide retail business data management and pipelines.
- Handled high-pressure environments and met tight deadlines. Worked alongside the best minds in the country and the world, initiating a data engineer forum within the organization for cross-pollination of ideas.
- Built real-time pipelines to stream data from different platforms to the Amazon data warehouse, with a service-level agreement (SLA) of a two-minute delay, using Spark, Flink, and Tableau.
- Created a 360-degree dashboard with perspectives on Amazon's customers across different Amazon services. The dashboard was made public on a forum and gained massive popularity for the ease of data understanding by consumers.
Data Engineer
NTT Data
- Developed, tested, and deployed end-to-end real-time and batch ETL pipelines for a healthcare provider.
- Documented every line of code and changes to the existing product from a business standpoint.
- Learned new technologies with an open-minded approach and grew as a technology-agnostic developer.
- Developed two major data warehouse-related projects to save 23% of data storage cost and 26.5% of maintenance cost.
Experience
Competitive Price Monitoring System for eCommerce Business
Real-time Pipelines for Fraud Alerting
Driver's Incentives Framework
Education
Bachelor of Engineering Degree in Electronics
Anna University - Chennai, India
Certifications
AWS Certified Solutions Architect
Amazon Web Services
Google Cloud Certified - Professional Data Engineer
Google Cloud
Skills
Libraries/APIs
REST APIs, PySpark, Spark Streaming
Tools
Apache Airflow, Tableau, Microsoft Power BI, Ab Initio, Kafka Streams, Google Analytics, Looker, BigQuery, Collibra, Informatica ETL, Excel 2016, AWS Glue, GitHub, Apache Beam, Amazon CloudWatch, Cloud Dataflow, Amazon Athena, ELK (Elastic Stack), Microsoft Access, pgAdmin, Amazon QuickSight, Amazon Elastic Container Service (ECS), Amazon CloudFront CDN, AWS CloudFormation, Git, Stitch Data, Azure Kubernetes Service (AKS), Matillion ETL for Redshift, Apache Storm, Logstash, Grafana, Terraform, Azure Machine Learning
Languages
SQL, Python, Snowflake, Looker Modeling Language (LookML), JavaScript, Excel VBA
Frameworks
Apache Spark, Spark, Streamlit, Storm, Hadoop, Django
Paradigms
ETL, Business Intelligence (BI), ETL Implementation & Design, Database Development, DevOps, Microservices
Platforms
Google Cloud Platform (GCP), Amazon EC2, Amazon Web Services (AWS), Firebase, AWS Lambda, Databricks, Linux, Kubernetes, Microsoft Fabric, Apache Flink, Azure, Airbyte, Azure Synapse, Oracle, Docker, Apache Kafka, Cloud Native
Storage
Teradata, Redshift, Databases, Amazon S3 (AWS S3), Data Pipelines, Data Lake Design, PostgreSQL, Azure SQL Databases, AWS Data Pipeline Service, MongoDB, Microsoft SQL Server, Database Architecture, Database Performance, NoSQL, Amazon Aurora, Datadog, Data Lakes, Google Cloud, Oracle Cloud, MySQL, Cloud Firestore, MemSQL, Elasticsearch
Other
Software, Dashboards, Data Visualization, Amazon RDS, Big Data, Data Warehouse Design, Data Warehousing, Data Engineering, Google BigQuery, Data Analysis, Data Build Tool (dbt), Cloud Platforms, Data Management, Informatica Cloud, Informatica, Data Architecture, Excel 365, Office 365, CSV File Processing, Data Migration, Data Extraction, ELT, Technical Architecture, ETL Tools, Cloud, Delta Lake, Pub/Sub, Azure Databricks, Warehouses, BI Reporting, Orchestration, Data Processing, Infrastructure as Code (IaC), Query Optimization, English, Data Cleaning, GitHub Actions, APIs, Reports, Distributed Systems, Looker Studio, Big Data Architecture, Data Modeling, Analytics, Data Analytics, Data Science, Data Governance, Parquet, Database Schema Design, Fivetran, TIBCO, Ads, Data Quality, Finance, Mobile Analytics, Monitoring, CI/CD Pipelines, Amazon EMR Studio, Web Analytics, Social Media Web Traffic, Real-time Data, Metabase, DocumentDB, User Interface (UI), Great Expectations Cloud, Machine Learning, ClickStream, Amazon MQ