Adam is available for hire

Adam Ivansky

Verified Expert in Engineering

Data Engineering Developer

Buffalo, NY, United States

Toptal member since November 6, 2018

Expertise

Data Science Data Engineering Recommendation Systems SQL Python Git Spark AWS EMR Apache Airflow Amazon S3 Machine Learning Django Agile Development SAS

Bio

Adam is a Senior Data Engineering Tech Lead and Architect who builds scalable data platforms, real-time streaming pipelines, and Agentic LLM and Generative AI automation systems. He delivered a high-concurrency live-streaming analytics platform that powers thousands of enterprise dashboards. An expert in Infrastructure as Code (IaC) and full-lifecycle platform engineering, Adam builds internal APIs and cloud infrastructure using Python, Snowflake, AWS, Terraform, and PostgreSQL.

Portfolio

Endeavor

Python 3, Snowflake, Amazon Elastic Container Service (ECS), Terraform, Docker...

Apple

Python 3, Python API, Amazon EKS, Docker, Kubernetes, Amazon S3 (AWS S3)...

BJ's Wholesale Club

Jenkins, AWS Command Line Interface (CLI), Amazon S3 (AWS S3), Redshift...

Experience

SQL - 12 years
Python 3 - 10 years
Data Engineering - 10 years
Amazon Web Services (AWS) - 10 years
Data Architecture - 6 years
AI Agents - 3 years
Terraform - 3 years
Artificial Intelligence (AI) - 3 years

Preferred Environment

Amazon Web Services (AWS), Python, Terraform, Snowflake, ETL, Streaming Data, AI Agents, REST APIs, Artificial Intelligence (AI), Cloud Infrastructure

The most amazing...

...project I've worked on is developing a live-streaming data analytics platform on Snowflake and AWS, which was viewed by hundreds of users.

Work Experience

Full-stack Data Engineering Tech Lead

2021 - PRESENT

Endeavor

Defined engineering excellence by establishing CI/CD gates, rigorous code standards, and formalized Agile SDLC processes. Pioneered robust unit testing frameworks to optimize code quality, deployment velocity, and pipeline reliability.
Architected and productionalized Generative AI pipelines integrating Mixtral 8x7B and Snowflake Cortex LLMs into an enterprise data platform; engineered high-performance RAG vector databases to optimize proprietary marketing intelligence systems.
Designed unit and integration tests to be used by Claude Code, allowing for full prompt-based agentic coding development.
Built and designed both batch and streaming ETL pipelines to facilitate the movement of data in and out of the Snowflake data warehouse.
Made strategic decisions regarding the selection of technology and architectural design.
Administered multiple company Snowflake accounts and architected live and streaming data models using dbt. Managed permissions for hundreds of users, including PII access and custom roles.
Provisioned resources such as EKS, S3, Lambda, and Transfer Family in AWS accounts using Infrastructure as Code (IaC) with Terraform.
Designed and built REST APIs and internal websites to expose marketing and pricing models across the organization, integrating them with Amazon Cognito, Okta, and Microsoft Active Directory.

Technologies: Python 3, Snowflake, Amazon Elastic Container Service (ECS), Terraform, Docker, Apache Airflow, Data Build Tool (dbt), AWS Cloud Architecture, Database Design, Large Language Models (LLMs), REST APIs, FastAPI, Data Engineering, Technical Leadership, Data Architecture, Python 2, Amazon Elastic MapReduce (EMR), Spark, Spark SQL, Git, Machine Learning, Agile, Linux, Amazon EC2, Unit Testing, Data Lakes, Web Applications, Data Analytics

Data Engineering Tech Lead

2020 - 2021

Apple

Served as a data engineer in charge of two projects end-to-end. The projects involved collecting data from 3rd-party cloud vendors.
Developed scheduled ETLs based on Python and Spark that collected data from various APIs and loaded the data to Amazon S3 and PostgreSQL databases. The ETLs were deployed to Airflow and Kubernetes.
Built several APIs that exposed data from the data warehouse to downstream consumers.
Created and modified ETLs based on AWS Glue. Created a serverless ETL based on Amazon SQS and AWS Lambda.

Technologies: Python 3, Python API, Amazon EKS, Docker, Kubernetes, Amazon S3 (AWS S3), Amazon Simple Queue Service (SQS), Amazon Elastic MapReduce (EMR), Redshift, PostgreSQL, SQL, Spark, Data Engineering, Data Architecture, Python 2, Spark SQL, Git, Machine Learning, Agile, Linux, Amazon EC2, Unit Testing, Snowflake, Data Lakes, Web Applications, Data Analytics, Terraform

Data Engineer

2019 - 2020

BJ's Wholesale Club

Developed an ETL pipeline based on PySpark running on Amazon EMR for the extraction of data from Redshift to S3.
Contributed to a product recommendation engine based on Spark machine learning.
Developed a data quality assessment tool in PySpark.
Owned cloud cost reporting. Managed EMR cluster creation and termination in AWS CLI and AWS console.
Automated the entire ETL and marketing pipeline using Jenkins.
Contributed to the algorithm for identifying new prospective members based on 3rd-party data.

Technologies: Jenkins, AWS Command Line Interface (CLI), Amazon S3 (AWS S3), Redshift, Python 3, Spark, Amazon Elastic MapReduce (EMR), SQL, Data Engineering, Python 2, Spark SQL, Git, Machine Learning, Agile, Linux, Amazon EC2, Unit Testing, Bitbucket, Data Lakes, Web Applications, Data Analytics

Senior Database Marketing Analyst

2017 - 2018

eBay

Developed targeting scripts for flagship marketing campaigns with an emphasis on email, mobile push notification, social, and on-site channels. The campaigns often targeted over 50 million users and sometimes resulted in over $100,000 in iGMB annually.
Designed, developed, implemented, and maintained multi-armed bandit algorithms written in Python while adhering to marketing standards and processes within eBay. The algorithm was measured to generate $5 million annually.
Trained an algorithm for send-time optimization, which resulted in a 15% increase in click-through rate in the campaigns where it was implemented.
Assessed existing email, social, and mobile marketing campaigns in terms of KPIs such as iGMB, OR, and CTR.
Created dashboards in Tableau that reported on the performance of different marketing algorithms I developed.
Created scripts that moved data between Hive and Teradata servers.
Worked with the largest Teradata DWH in the world and often queried tables with 100+ billion rows.
Communicated with stakeholders across multiple time zones.

Technologies: SQL, TensorFlow, Scikit-learn, Tableau, PySpark, Apache Hive, Python, Teradata, Python 3, Spark, Data Engineering, Python 2, Git, Machine Learning, Agile, Linux, Amazon EC2, Unit Testing, Data Lakes, Data Analytics

Machine Learning SW Developer

2016 - 2017

Valeo

Developed and trained a machine vision algorithm for recognizing pedestrians in front of vehicles, which has been implemented in several vehicle models, including the GM 2019 Chevy.
Trained an algorithm to detect dirt on camera lenses. This algorithm had a crucial role in supporting other more complex self-driving functionalities.
Assessed the quality of unstructured annotated video data used for algorithm training.
Created a script for synchronizing both structured and unstructured data between the multiple teams that participated in the project.
Attended computer science conferences and studied scientific literature to keep up with new machine learning and computer science trends. Engaged in knowledge exchange with other team members.
Communicated and networked with teammates and stakeholders from France and Ireland.

Technologies: Protocol Buffers, OpenCV, SQL, MATLAB, Python, Python 3, Data Engineering, Git, Machine Learning, Agile, Linux, Unit Testing, Data Analytics

Credit Risk Analyst

2014 - 2015

Erste Group

Calculated risk parameters CCF, LGD, and PD according to Basel II.
Reduced the overall reserve requirements of Erste Bank subsidiaries by over 7% through improvements I introduced to the statistical engine for calculating the risk parameters CCF, LGD, and PD.
Designed and trained a mathematical model in SAS for prediction of the overall loss in the event of a client default. This helped Erste improve the repossession process and reduce expenses.
Performed ad-hoc stress tests for Erste subsidiaries. The results were later submitted directly to the European National Bank.
Assessed risk portfolio stability via bootstrapping and Monte Carlo methods.
Created interactive dashboards for risk parameter reporting in Microsoft SQL Server and Excel.
Developed a data quality testing system in SAS and SQL.

Technologies: Microsoft Excel, MATLAB, Microsoft SQL Server, SAS, SQL, Git, Machine Learning, Agile, Linux, Data Analytics

Teaching and Research Assistant

2012 - 2014

University of Rochester

Conducted teaching and lab lectures for undergraduate students.
Developed software for the automation of experiments and analyzed data produced by the experiments.
Authored several scientific papers that are available online.

Technologies: MATLAB, Image Recognition, Pattern Recognition

Experience

Self-hosted Streamlit Dashboarding Web App

I developed and managed an enterprise-grade web platform on AWS ECS that served Streamlit business performance dashboards backed by robust Snowflake data models. To ensure strict security compliance, the architecture integrated Okta and Microsoft Entra ID (formerly Azure AD) for federated authentication and fine-grained permissions management. Deployed via a modern GitOps framework, the platform empowered developers to contribute via a unified Python GitHub repository, utilizing GitHub Actions pipelines to orchestrate automated deployments across fully isolated development, UAT, and production environments.

Snowflake to White-label Ticketing Platform Data Integration

I architected and engineered an end-to-end real-time data integration between a white-label ticketing platform and Snowflake. The core architecture features a high-throughput, serverless ingestion pipeline using AWS API Gateway and AWS Lambda to process live webhook events, streaming them directly into Snowflake via Snowpipe. To complement the real-time stream, a custom API scraping solution was implemented to ingest static product catalog data. Inside Snowflake, data normalization and flattening were automated using Dynamic Tables, delivering low-latency, high-performance data models. This robust data foundation directly powers critical operations dashboards used for real-time seating inventory management and dynamic ticket pricing.
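The webhook ingestion step described above can be illustrated with a minimal Python sketch. It is not the production code: the handler shape follows the standard API Gateway proxy event (`body`, `isBase64Encoded`), while `make_handler`, `sink`, and the sample field names are hypothetical. How records actually reach Snowpipe (for example, staged files or a streaming client) is not stated here, so the sketch hands each record to an injected `sink` callable instead of assuming one.

```python
import base64
import json
from datetime import datetime, timezone


def parse_webhook(event):
    """Extract the JSON payload from an API Gateway proxy event."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    return json.loads(body)


def make_handler(sink):
    """Build a Lambda-style handler that forwards each event to `sink`.

    `sink` is a hypothetical callable taking one JSON string; in a real
    deployment it would write to whatever location Snowpipe loads from.
    """

    def handler(event, context):
        try:
            payload = parse_webhook(event)
        except (ValueError, TypeError):
            # Malformed body: reject instead of loading bad rows downstream.
            return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

        record = {
            "received_at": datetime.now(timezone.utc).isoformat(),
            "payload": payload,  # kept nested; flattening happens in the warehouse
        }
        sink(json.dumps(record))
        # Acknowledge quickly so the sender does not retry the webhook.
        return {"statusCode": 202, "body": json.dumps({"status": "accepted"})}

    return handler
```

Keeping the payload nested in the handler mirrors the description above, where normalization and flattening are left to Dynamic Tables inside Snowflake rather than done at ingestion time.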

AI Agent-driven Sales Agent Enablement

I spearheaded the design and implementation of an enterprise-grade agentic AI platform leveraging a self-hosted Mixtral Mixture of Experts (MoE) LLM to automate advanced lead intelligence. The system utilizes an event-driven function-calling framework to execute semantic searches across Snowflake data warehouses and distributed APIs. By orchestrating a secure retrieval-augmented generation (RAG) pipeline, the autonomous agent ingests multi-source data payloads and performs contextual text synthesis to generate structured, intent-driven conversation starters. This private LLM deployment mitigates data leakage while serving low-latency, production-ready GenAI insights directly into Salesforce, optimizing the outbound sales funnel through automated, cognitive pre-call research.

Model for Dynamic Content Optimization and Customization

The aim of the project was to increase the click-through rate of eBay coupon campaigns through machine learning. The algorithm was developed successfully and was measured to generate a 20% lift in click-through rate and iGMB.

The early version of the algorithm was based on a multi-armed bandit. Later versions used a contextual, NLP-based multi-armed bandit. The algorithm was developed using a combination of Teradata SQL and Python. I also developed an interactive Tableau dashboard to monitor the algorithm's behavior and measure the KPI lift it delivered.
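For readers unfamiliar with the technique, a multi-armed bandit can be sketched in a few lines of Python. The variant used in the project is not specified above, so this example assumes Thompson sampling with a Beta prior over binary click feedback. The class name, the `simulate` helper, and the click rates are hypothetical and purely illustrative.

```python
import random


class BetaBandit:
    """Thompson sampling over content variants with click / no-click feedback."""

    def __init__(self, n_arms, seed=None):
        self.successes = [0] * n_arms
        self.failures = [0] * n_arms
        self.rng = random.Random(seed)

    def choose(self):
        # Draw a plausible click rate for each arm from its Beta posterior
        # (uniform Beta(1, 1) prior) and play the arm with the highest draw.
        draws = [
            self.rng.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, clicked):
        if clicked:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(true_ctrs, rounds, seed=0):
    """Run the bandit against simulated users and count pulls per arm."""
    bandit = BetaBandit(len(true_ctrs), seed=seed)
    env = random.Random(seed + 1)  # separate stream for simulated clicks
    pulls = [0] * len(true_ctrs)
    for _ in range(rounds):
        arm = bandit.choose()
        pulls[arm] += 1
        bandit.update(arm, env.random() < true_ctrs[arm])
    return pulls
```

Calling `simulate([0.05, 0.10, 0.30], rounds=3000)` returns the number of times each variant was shown. Traffic should concentrate on the best-performing variant as evidence accumulates, which is the property that makes bandits attractive for campaign content selection.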

Model for Pedestrian Detection Intended for Self-driving Vehicles

The project aimed to develop a machine vision algorithm capable of detecting pedestrians in front of a vehicle by analyzing the input from the vehicle camera. The algorithm is now fully functional and embedded into several newer vehicle models, including the GM 2019 Chevy.

The machine learning algorithm we decided to use was the AdaBoost cascade classifier combined with a deep neural network. We wrote the training application from scratch in C++. Training had to be multithreaded in order to be efficient. Testing and validation were done in Python. A large database of annotated video data was used for algorithm training.

Prediction Model

Precise prediction of the total final loss after a client's default is key to reducing the risk associated with different loan products.

I developed a model that relied on the loan-to-value ratio and the value of the collateral. It was done using a combination of SAS and Microsoft SQL Server. The development of the model required extensive data cleaning and data quality testing.

Product Recommendation Algorithm

I was involved in the development of a recommendation engine based on a collaborative filtering model. The engine could recommend products that a given customer had not bought in the past. The solution was implemented in PySpark and was based on MLlib, Spark's machine learning (ML) library.

ETL for Recommendation Algorithm

I developed an ETL in PySpark to transfer data from Amazon Redshift into an Amazon S3 data lake. I also developed code for customer-level data aggregation and historicization. Finally, I assessed data quality and investigated and remediated data quality issues.

Education

2012 - 2014

Master of Science Degree in Physics

University of Rochester - New York, USA

2008 - 2012

Bachelor's Degree in Physics

National University of Ireland, Galway - Galway, Ireland

Certifications

JANUARY 2023 - JANUARY 2026

AWS Certified Developer

AWS

JANUARY 2023 - JANUARY 2026

AWS Certified Cloud Practitioner

AWS

Skills

Libraries/APIs

PySpark, Django ORM, Scikit-learn, TensorFlow, OpenCV, Amazon EC2 API, Python API, PyTorch, REST APIs, Pandas, Salesforce API

Tools

Amazon Elastic MapReduce (EMR), Apache Airflow, Git, Spark SQL, AWS Glue, Bitbucket, Tableau, MATLAB, Microsoft Excel, Jenkins, AWS Command Line Interface (CLI), Amazon EKS, Amazon Simple Queue Service (SQS), Terraform, Amazon Elastic Container Service (ECS), GitHub, Prefect

Languages

SQL, Python 3, Python 2, Python, SAS, Snowflake

Frameworks

Spark, Django, Streamlit

Paradigms

Unit Testing, Agile, Continuous Integration (CI), ETL, Database Design

Storage

Amazon S3 (AWS S3), Teradata, Redshift, Microsoft SQL Server, Apache Hive, PostgreSQL, Data Lakes

Industry Expertise

Marketing

Platforms

Windows, Linux, Amazon EC2, Spark Core, Docker, Kubernetes, Amazon Web Services (AWS), Visual Studio Code (VS Code), AWS Lambda, Salesforce

Other

Data Analytics, Data Engineering, Recommendation Systems, Machine Learning, Data Quality Analysis, Web Applications, Protocol Buffers, ETL Tools, Physics, FastAPI, Streaming Data, Data Build Tool (dbt), AWS Cloud Architecture, Large Language Models (LLMs), Image Recognition, Pattern Recognition, Object Detection, Neural Networks, AI Agents, Artificial Intelligence (AI), Cloud Infrastructure, Technical Leadership, AWS ECS Fargate, Dashboards, APIs, API Gateways, Data Architecture, Solution Architecture, Mathematics, University Teaching
