Ayan Chakraborty, Developer in Kolkata, West Bengal, India

Ayan Chakraborty

Bio

Ayan is a developer and seasoned technical leader who specializes in leading data analytics projects and designing data architectures. Over the past decade, he has worked hands-on with every part of the data lifecycle in data engineering and analytics, primarily in education, manufacturing, and retail. Drawing on that experience, Ayan understands business priorities and shapes each project around them to reach its goals efficiently.

Portfolio

PepsiCo Global - Main
SQL, Data Analysis, Data Analytics, Datasets, Python, Ad-hoc Analysis, eCommerce
Brightly - Main
Python, Transact-SQL (T-SQL), Data Engineering, AWS Glue, AWS CodeBuild, Linux...
Shega LLC
Data Engineering, Data Architecture, Big Data, ETL, Data Privacy...

Experience

  • Business Intelligence (BI) Platforms - 10 years
  • Stored Procedure - 9 years
  • Data Warehousing - 9 years
  • ETL Development - 9 years
  • SQL - 9 years
  • Tableau - 5 years
  • SQL Server 2010 - 5 years
  • Snowflake - 4 years

Preferred Environment

MacOS, Windows, PyCharm, DataGrip

The most amazing...

...thing I've made for an edtech startup is an end-to-end reporting and warehouse solution that scaled from ten students to 150,000 students without any issues.

Work Experience

Data Analysis Expertise

2025 - 2025
PepsiCo Global - Main
  • Migrated legacy Snowflake SQL views to dbt models for Instacart and DoorDash RSV and unmapped products, keeping business logic consistent.
  • Repointed ThoughtSpot Digital Shelf DQ Monitoring liveboards from views to new dbt tables, matching filters and KPIs, and improving performance.
  • Tuned SQL using EXPLAIN and warehouse diagnostics to reduce cost and runtime while maintaining result parity.
  • Created a safe rollout process using development-clone liveboards, TML backups, and “do not use” flags on deprecated models.
Technologies: SQL, Data Analysis, Data Analytics, Datasets, Python, Ad-hoc Analysis, eCommerce

Data Engineer

2024 - 2025
Brightly - Main
  • Optimized SQL Server query performance by analyzing execution plans, refactoring complex T-SQL queries, and implementing appropriate indexing strategies, resulting in reduced query execution time.
  • Monitored and troubleshot SQL Server performance issues using DMVs and query statistics, identifying bottlenecks related to CPU, memory, and I/O, and applying fixes to ensure stable and scalable data pipelines.
  • Performed performance tuning on large transactional and analytical tables by optimizing joins, partitions, and stored procedures, and resolving data type mismatches to improve overall database efficiency.
Technologies: Python, Transact-SQL (T-SQL), Data Engineering, AWS Glue, AWS CodeBuild, Linux, Bash, PySpark, AWS Lambda, AWS Step Functions

Data Architect Consultant

2024 - 2024
Shega LLC
  • Built and optimized end-to-end ETL pipelines on Azure using Azure Databricks (PySpark and SQL) for scalable data processing.
  • Implemented Delta Lake best practices (Z-Order, Liquid Clustering, Auto-Compaction, CDF, and Time Travel) to improve performance and reliability.
  • Designed curated analytics layers in ADLS Gen2 (Bronze, Silver, and Gold) and served reporting datasets via Azure Synapse Analytics.
  • Managed CI/CD and delivery using Azure DevOps, including version control, automated deployments, and production support.
Technologies: Data Engineering, Data Architecture, Big Data, ETL, Data Privacy, Data Warehousing, Data Lakes, Data Quality Governance, Data Security, Data Auditing, Data Visualization

Senior Data Architect

2023 - 2024
LotLinx, Inc
  • Processed 10+ terabytes of data daily within a four-hour window, using parallel processing with BigQuery, remote functions, and Cloud Functions.
  • Designed the data mesh from scratch and trained the team on best practices for implementing cloud data warehouses and how to scale.
  • Architected the data governance and security implementation process for data covering over 16 million cars, 500+ dealers, and 1+ million customers.
  • Designed and architected Looker as the company's visualization solution and released 35+ dashboards.
  • Led the Looker implementation, building 35+ dashboards with row-level security and a scalable semantic layer, driving real-time insights across 16+ million vehicles and 500+ dealers through seamless BigQuery integration.
Technologies: Data Mesh, Cloud Architecture, ETL Tools, Google Cloud Platform (GCP), Google BigQuery, Looker, BigQuery

Senior Data Warehouse Architect

2021 - 2022
solarisBank
  • Architected a data warehouse (data mesh) from scratch, from S3 as a data lake to Snowflake as a data warehouse.
  • Handled data from multiple data platforms, for example, Samsung Pay, with a data warehouse design that could scale beyond 15 TB of data.
  • Mentored a team of more than seven people. Helped introduce the scrum process to the team for the first time and design its agile processes.
  • Deployed Airflow with dbt on Snowflake and applied Data Vault 2.0, data mesh, and dimensional modeling for data modeling. Also employed data governance, metadata management, and a data catalog with Collibra.
  • Built processes supporting data transformation, data structures, metadata, dependency, and workload management. Used streams, tasks, multi-table inserts, and Snowpipe.
  • Handled workload management, covering frequency, concurrency, scan size, copy, and SLA.
Technologies: Snowflake, Amazon Web Services (AWS), Data Vaults, Data Mesh, APIs, Apache Airflow, Data Build Tool (dbt)

Senior Data Architect

2021 - 2021
Yara International
  • Completed a comparative POC between Snowflake and Redshift, defined the 15 main use cases in the context of Yara's business, and implemented Snowflake as the final data-as-a-service platform. Processed 500 MB geo files per batch from S3.
  • Implemented the scrum process and its ceremonies for the team, guided three people as one of the founders, and hired another two for the company.
  • Deployed Airflow inside Docker with PostgreSQL for parallel processing, loading data into the current Hive data store from four different sources, including JSON nested up to three layers deep.
  • Implemented dbt (data build tool) to process around 1.8 million rows for each small-market state in India, Thailand, and other APAC countries.
Technologies: Apache Airflow, Snowflake, Data Analytics, Data Modeling, Data Architecture, Data Build Tool (dbt), Data Analysis, Microsoft Fabric, DAX, Microsoft Power BI

Business Intelligence Development Lead | Data Architect

2018 - 2021
Alef Education
  • Designed and architected a data warehouse on Snowflake, modeled as a data vault from scratch, to accommodate existing features and changes for new features at up to 10 TB of data.
  • Created an automated reporting platform that cut licensing costs by $12,000 per year, reduced manual effort for custom reports by 60%, and managed multiple data sources.
  • Configured Snowpipe for real-time reports from xAPI data on each click from students. Managed data governance and administration with Snowflake and Collibra.
  • Built meaningful Tableau dashboards for seven different stakeholder teams, including CXOs, reaching over 450 school leaders and impacting over 150,000 students' lives.
Technologies: Data Warehousing, Tableau, Apache Airflow, SQL, Redshift, Amazon Web Services (AWS), Python, Data Architecture, Snowflake, Data Analytics, Microsoft Power BI, DAX

Senior Business Intelligence Analyst

2018 - 2018
Mediabrands
  • Developed new data pipelines and workflows using Python, Apache Airflow, and Redshift, which reduced custom scheduler costs by 15%, cut the cost of managing multiple platforms by 51%, and eliminated the SQL Server maintenance cost.
  • Designed and developed the whole warehouse in Redshift, with a daily incremental data volume of 2.5 GB on an 8-node cluster.
  • Co-led marketing data mining projects and pushed data to the data warehouse in Redshift from multiple sources like Facebook, Google DoubleClick, Google AdWords, and Datapoint.
Technologies: SQL Server Integration Services (SSIS), Python, Data Analytics, SQL Server 2010, Tableau, Data Architecture, Amazon Web Services (AWS)

Senior Software Engineer (Data and Analytics)

2016 - 2018
Nous Infosystems
  • Led four data analytics projects and developed data models and visualizations in Qlik and Tableau.
  • Architected a data warehouse for Liaison International in Boston (a company that analyzed educational data of US university applicants from all over the world) and grew the team from two people to six.
  • Won the star performer award for client satisfaction and for increasing new revenue by 12% on two projects.
  • Performance-tuned SQL procedures for multiple clients such as Deloitte and Everest Reinsurance.
Technologies: SQL Server Integration Services (SSIS), Database Modeling, Tableau, Data Warehousing, Data Architecture, Microsoft Power BI

Senior Database Designer and Programmer

2016 - 2016
NetZoom
  • Designed the database schema, programmed T-SQL procedures, and tuned their performance, which improved response time by 15% to 21%.
  • Created dashboards with Tableau for the CXO audience.
  • Implemented an Agile process to maintain team workflows with members in different geographic locations.
Technologies: SQL Server 2010, Stored Procedure, Data Architecture, Database Modeling, Data Warehousing

Development Team Lead

2014 - 2016
Acronym Solutions
  • Led teams on data warehouse projects analyzing sales and marketing data for Khadim India to restructure costs and give management a solid dashboard for better decisions, which reduced the cost of opening a new showroom by 26%.
  • Developed two data mining projects with a three million row intake per day, using SQL Server and Google Cloud SQL.
  • Created new business opportunities with clients, such as Khadim India and Electrosteel.
Technologies: PostgreSQL, Stored Procedure, SQL Server Integration Services (SSIS), Microsoft Power BI

CRM Consultant

2011 - 2014
Cognizant
  • Designed and implemented critical business requirements for the interface design using Informatica and Oracle's Siebel CRM. Processed both batch and on-request data with the Informatica auto-scheduler.
  • Designed the Informatica data flow for AstraZeneca Australia and AstraZeneca Germany to pull CRM data from the European and Australian markets and store it in the Oracle database for further report generation.
  • Created complex data transformations with Informatica and worked on the migration from DataStage to Informatica.
  • Won the "Best Trainee of the Year" award in 2011 for Oracle and Siebel training.
Technologies: Siebel CRM

Experience

Student Admission Reporting | Enterprise Data Warehouse (EDW) and Analysis

I worked as an on-site coordinator and offshore developer at a US-based education startup. The company is responsible for undergraduate admission processing for students from all over the world at numerous US universities.

I led a team of four to build a data flow pipeline in SSIS. We were also responsible for more than 60 stored procedures that handled data processing. I also built the EDW architecture for reporting purposes and designed the Tableau dashboards; we integrated them with the platform so that end users could obtain insights.

Liaison ETL and Data Warehouse Design and Data Analytics

The client is a major educational software provider based in the US. They automate the entire application process across a large number of US universities and educational organizations.

More than 7,000 programs rely on them for help identifying, engaging, and enrolling prospective students. They needed a data warehouse and dashboards so that university management could understand applicants' details in a more meaningful way and make future admission processes better and smoother.

My Responsibilities:
• Understood the requirements from the users and the existing system.
• Designed and deployed the SSIS package for the data warehouse.
• Created critical T-SQL procedures to build the required facts and dimensions.
• Designed cycle-wise application status and the application number heat map in Tableau Desktop.
• Implemented security-based user access organization-wise.
• Scheduled tasks for incremental data loads.

Transportation Analysis and Reports

The client is a major transportation provider based in the US and responsible for running trains all over the Bay Area. They needed to issue records, calculations, and reports to the MBTA concerning the mean miles between failures (MMBF) of all trains for each route they cover.

The app analyzes the MMBF of all trips by route, service type, peak type, and other parameters, and assesses fleet performance in terms of reliability and availability. Users can analyze the data in detail at various levels through visualizations.

My Responsibilities:
• Thoroughly comprehended the requirements from the users and the existing system.
• Worked in Tableau, Python, and SSRS.
• Created a complex Python script to build the data model for the rolling rank and the expected visualization.
• Designed fleet performance, rank, and locomotive geographical visualizations.
• Implemented security-based user access location-wise.
• Scheduled tasks for incremental data loads.
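
The MMBF and rolling rank mentioned above can be illustrated with a minimal sketch. This is not the project's script. The record layout, the function names, the window size, and the choice to treat a window with zero failures as infinite MMBF are all assumptions made for the illustration. MMBF is taken here as total miles divided by the number of failures over the window.

```python
from collections import defaultdict


def mmbf(miles, failures):
    """Mean miles between failures. Zero failures yields inf (an assumption)."""
    return miles / failures if failures else float("inf")


def rolling_mmbf_rank(records, window=3):
    """Rank routes by rolling MMBF for each period.

    records: iterable of (route, period, miles, failures); periods must be sortable.
    Returns {period: [(rank, route, rolling_mmbf), ...]} with rank 1 = highest MMBF.
    """
    by_route = defaultdict(dict)
    periods = set()
    for route, period, miles, failures in records:
        by_route[route][period] = (miles, failures)
        periods.add(period)

    ordered = sorted(periods)
    result = {}
    for i, period in enumerate(ordered):
        # Periods covered by the rolling window that ends at the current period.
        span = ordered[max(0, i - window + 1): i + 1]
        scores = []
        for route, data in by_route.items():
            miles = sum(data.get(p, (0, 0))[0] for p in span)
            failures = sum(data.get(p, (0, 0))[1] for p in span)
            if miles == 0:
                continue  # route did not run in this window
            scores.append((route, mmbf(miles, failures)))
        # Higher MMBF is better; ties are broken by route name.
        scores.sort(key=lambda s: (-s[1], s[0]))
        result[period] = [
            (rank, route, value)
            for rank, (route, value) in enumerate(scores, start=1)
        ]
    return result


# Hypothetical sample data: (route, period, miles, failures)
sample = [
    ("A", 1, 1000, 2), ("B", 1, 900, 1),
    ("A", 2, 1000, 0), ("B", 2, 900, 2),
]
ranks = rolling_mmbf_rank(sample, window=2)
```

With the sample data, route B leads in period 1, while route A leads in period 2 once its failure-free second period enters the two-period window.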

ERO Data Migration

ERO Data Migration came into the picture when some of the marketing companies under the AZ client agreed to move their business application from Siebel to Oracle CRM on Demand.

Data related to demographics, objectives, activities, and products was migrated from the Siebel database to the respective systems. Once the data was validated and irrelevant data was filtered out, the required data was migrated to the CRMOD system.

My Responsibilities:
• Helped with an in-depth analysis of the business requirements.
• Applied a thorough understanding of the Siebel data model (party model, activity, meeting, product, and samples).
• Drew extracts from the Siebel database in a model that could be loaded into the OCOD database.
• Designed and implemented critical business requirements for the interface design using Informatica.
• Developed Informatica mappings, workflows, and worklets (a group of tasks) for the implementation of critical business requirements of integrating multiple systems.
• Unit-tested the app in various modes and with various types of users.
• Took part in the system-integration testing of the app with the owner teams of the aligned systems.
• Helped with market engagement from off-shore.

Education

2019 - 2019

Post-graduate Work in Data Science and Business Analytics

McCombs School of Business | University of Texas at Austin - Austin, TX, United States

2007 - 2011

Bachelor of Technology Degree in Electronics and Communication Engineering

West Bengal State University - Kolkata, India

Certifications

APRIL 2023 - APRIL 2026

AWS Certified Solutions Architect – Associate

Amazon Web Services

FEBRUARY 2023 - FEBRUARY 2025

SnowPro Core Certification

Snowflake.com

AUGUST 2022 - PRESENT

Modelling Data Warehouse with Data Vault 2.0

Udemy

JUNE 2019 - PRESENT

Apache Airflow Certification

Udemy

Skills

Libraries/APIs

PySpark

Tools

Microsoft Power BI, Tableau, Apache Airflow, Siebel CRM, Looker, BigQuery, AWS Glue, AWS CodeBuild, AWS Step Functions

Languages

SQL, Transact-SQL (T-SQL), Stored Procedure, Snowflake, Python, Python 3, Bash

Paradigms

ETL, HIPAA Compliance

Platforms

Azure, Amazon Web Services (AWS), Amazon EC2, Microsoft Fabric, Google Cloud Platform (GCP), Linux, AWS Lambda

Storage

Data Pipelines, Redshift, SQL Server Integration Services (SSIS), SQL Server 2010, Database Modeling, PostgreSQL, Data Lakes

Other

Data Warehousing, ELT, ETL Development, Data Architecture, Business Intelligence (BI) Platforms, Google BigQuery, Data Mining, Data Analytics, Data Build Tool (dbt), Data Engineering, DAX, Data Modeling, Data Vaults, Data Analysis, Data Mesh, APIs, Cloud Architecture, Cloud Services, AWS Cloud Architecture, ETL Tools, Large Language Models (LLMs), Mobile Apps, Artificial Intelligence (AI), General Data Protection Regulation (GDPR), Big Data, Data Privacy, Data Quality Governance, Data Security, Data Auditing, Data Visualization, Datasets, Ad-hoc Analysis, eCommerce
