
Adam Ivansky
Verified Expert in Engineering
Machine Learning Developer
Buffalo, NY, United States
Toptal member since November 6, 2018
As a highly experienced senior data engineer and tech lead, Adam specializes in designing and building large-scale data pipelines, data warehouses, and robust data infrastructure. With a strong technical foundation in Python, Snowflake, PostgreSQL, and Terraform, he has primarily worked with AWS to deliver scalable and efficient solutions. In addition to his data engineering expertise, Adam has developed several web back-end solutions leveraging Django and FastAPI, further expanding his skill set.
Experience
- SQL - 9 years
- Python 3 - 6 years
- Spark - 4 years
- Marketing - 4 years
- Machine Learning - 3 years
- Amazon Elastic MapReduce (EMR) - 3 years
- Data Engineering - 3 years
- Recommendation Systems - 2 years
Preferred Environment
Amazon Web Services (AWS), Python, Terraform, Snowflake, PySpark, Amazon Elastic Container Service (ECS), ETL, Django, FastAPI, Streaming Data
The most amazing...
...project I've worked on is developing a Lambda-based ETL pipeline to process 1TB of streamed telemetry data every day.
Work Experience
Senior Data Engineer
Endeavor
- Built and designed both batch and streaming ETL pipelines to facilitate the movement of data in and out of the Snowflake data warehouse.
- Made strategic decisions regarding the selection of technology and architecture design.
- Administered the Snowflake account and managed objects using dbt.
- Provisioned resources such as EKS, S3, Lambda, and Transfer Family in AWS accounts using Terraform.
- Designed and built REST APIs and internal websites to expose marketing models across the organization, integrating them with Amazon Cognito and Microsoft Active Directory.
- Integrated Mixtral 8x7b and Snowflake Cortex LLMs into the company data architecture and designed and developed RAG databases for company-specific marketing.
- Developed standards and formalized development processes.
Data Engineering Tech Lead
Apple
- Served as a data engineer in charge of two projects end-to-end. The projects involved collecting data from 3rd-party cloud vendors.
- Developed scheduled ETLs based on Python and Spark that collected data from various APIs and loaded the data to Amazon S3 and PostgreSQL databases. The ETLs were deployed to Airflow and Kubernetes.
- Built several APIs that exposed data from the data warehouse to downstream consumers.
- Created and modified ETLs based on AWS Glue. Created a serverless ETL based on Amazon SQS and AWS Lambda.
Data Engineer
BJ's Wholesale Club
- Developed an ETL pipeline based on PySpark running on Amazon EMR for the extraction of data from Redshift to S3.
- Contributed to a product recommendation engine based on Spark machine learning.
- Developed a data quality assessment tool in PySpark.
- Owned cloud cost reporting. Managed EMR cluster creation/termination in AWS CLI and AWS console.
- Automated the entire ETL/marketing pipeline using Jenkins.
- Contributed to the algorithm for identifying new prospective members based on 3rd-party data.
Senior Database Marketing Analyst
eBay
- Developed targeting scripts for flagship marketing campaigns with an emphasis on email, mobile push notification, social, and on-site channels. The campaigns often targeted over 50 million users and sometimes resulted in over $100,000 in iGMB annually.
- Designed, developed, implemented, and maintained multi-armed bandit algorithms written in Python while adhering to marketing standards and processes within eBay. The algorithms were measured to generate $5 million annually.
- Trained an algorithm for send-time optimization, which resulted in a 15% increase in click-through rate in the campaigns where it was implemented.
- Assessed existing email, social, and mobile marketing campaigns in terms of KPIs such as iGMB, OR, and CTR.
- Created dashboards in Tableau that reported on the performance of different marketing algorithms I developed.
- Created scripts that moved data between HIVE and Teradata servers.
- Worked with the largest Teradata DWH in the world and often queried tables with 100+ billion rows.
- Communicated with stakeholders across multiple time zones.
Machine Learning SW Developer
Valeo
- Developed and trained a machine vision algorithm for recognizing pedestrians in front of vehicles, which has been implemented in several vehicle models, including the GM 2019 Chevy.
- Trained an algorithm to detect dirt on camera lenses. This algorithm had a crucial role in supporting other more complex self-driving functionalities.
- Assessed the quality of unstructured annotated video data used for algorithm training.
- Created a script for synchronizing both structured and unstructured data between the multiple teams that participated in the project.
- Attended computer science conferences and studied scientific literature to keep up with new machine learning and computer science trends. Engaged in knowledge exchange with other team members.
- Communicated and networked with teammates and stakeholders from France and Ireland.
Credit Risk Analyst
Erste Group
- Calculated the risk parameters CCF, LGD, and PD according to Basel II.
- Reduced the overall reserve requirements of Erste Bank subsidiaries by over 7% through improvements I introduced in the statistical engine that calculates the risk parameters CCF, LGD, and PD.
- Designed and trained a mathematical model in SAS for prediction of the overall loss in the event of a client default. This helped Erste improve the repossession process and reduce expenses.
- Performed ad-hoc stress tests for Erste subsidiaries. The results were later submitted directly to the European National Bank.
- Assessed risk portfolio stability via bootstrapping and Monte Carlo methods.
- Created interactive dashboards for risk parameter reporting in Microsoft SQL and Excel.
- Developed a data quality testing system in SAS and SQL.
Teaching and Research Assistant
University of Rochester
- Conducted teaching and lab lectures for undergraduate students.
- Developed software for the automation of experiments and analyzed data produced by the experiments.
- Authored several scientific papers that are available online.
Experience
Model for Dynamic Content Optimization and Customization
The early version of the algorithm was based on a multi-armed bandit. Later versions used a contextual, NLP-based multi-armed bandit. The algorithm was developed using a combination of Teradata SQL and Python. I also developed an interactive Tableau dashboard to monitor the algorithm's behavior and to measure the KPI lift it delivered.
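For readers unfamiliar with the technique, the following is a minimal sketch of a non-contextual multi-armed bandit choosing among content variants. It is an illustration only, not the production algorithm: the profile does not say which bandit strategy was used, so Thompson sampling with Beta priors is an assumption, and the class name `BernoulliThompsonBandit`, the `simulate` helper, and the click-through rates are all hypothetical.

```python
import random


class BernoulliThompsonBandit:
    """Thompson sampling over arms with click / no-click rewards (illustrative)."""

    def __init__(self, n_arms, seed=None):
        # One success and one failure counter per content variant.
        self.successes = [0] * n_arms
        self.failures = [0] * n_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Draw a plausible click rate for each arm from its Beta posterior
        # (uniform Beta(1, 1) prior) and play the arm with the highest draw.
        samples = [
            self.rng.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, clicked):
        # Record the observed outcome for the arm that was shown.
        if clicked:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(true_ctrs, rounds, seed=0):
    """Run the bandit against made-up click rates; return pulls per arm."""
    bandit = BernoulliThompsonBandit(len(true_ctrs), seed=seed)
    env = random.Random(seed + 1)  # separate RNG for the simulated users
    pulls = [0] * len(true_ctrs)
    for _ in range(rounds):
        arm = bandit.select_arm()
        clicked = env.random() < true_ctrs[arm]
        bandit.update(arm, clicked)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Hypothetical click-through rates for three content variants.
    print(simulate([0.05, 0.10, 0.30], rounds=3000))
```

Over time, a bandit of this kind sends most traffic to the best-performing variant while still occasionally testing the others. A contextual version would additionally condition the choice on user or content features.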
Model for Pedestrian Detection Intended for Self-driving Vehicles
The machine learning algorithm we decided to use was the AdaBoost cascade classifier combined with a deep neural network. We wrote the training application from scratch in C++. Training had to be multithreaded in order to be efficient. Testing and validation were done in Python. A large database of annotated video data was used for algorithm training.
Prediction Model
I developed a model that relied on the loan-to-value ratio and the value of the collateral. It was done using a combination of SAS and Microsoft SQL Server. The development of the model required extensive data cleaning and data quality testing.
Product Recommendation Algorithm
ETL for Recommendation Algorithm
Education
Master of Science Degree in Physics
University of Rochester - New York, USA
Bachelor's Degree in Physics
National University of Ireland, Galway - Galway, Ireland
Certifications
AWS Certified Developer
AWS
AWS Certified Cloud Practitioner
AWS
Skills
Libraries/APIs
PySpark, Scikit-learn, TensorFlow, OpenCV, Intel TBB, Amazon EC2 API, Python API, PyTorch
Tools
Amazon Elastic MapReduce (EMR), Apache Airflow, Git, Spark SQL, AWS Glue, Bitbucket, Tableau, MATLAB, Microsoft Excel, Jenkins, AWS CLI, Amazon EKS, Amazon Simple Queue Service (SQS), Terraform, Amazon Elastic Container Service (ECS), GitHub, Prefect
Languages
SQL, Python 3, Python 2, C++14, Python, C++, SAS, Snowflake
Frameworks
Spark, Hadoop, Django
Paradigms
Unit Testing, Agile, Continuous Integration (CI), ETL, Database Design
Storage
Amazon S3 (AWS S3), Teradata, Redshift, Microsoft SQL Server, Apache Hive, PostgreSQL, Data Lakes
Industry Expertise
Marketing
Platforms
iOS, Windows, Linux, Amazon EC2, Spark Core, Docker, Kubernetes, Amazon Web Services (AWS), Visual Studio Code (VS Code)
Other
Data Analytics, Data Engineering, Recommendation Systems, Machine Learning, Data Quality Analysis, Deep Learning, Protocol Buffers, ETL Tools, Physics, FastAPI, Streaming Data, Data Build Tool (dbt), AWS Cloud Architecture, Large Language Models (LLMs), Image Recognition, Pattern Recognition, Object Detection, Neural Networks