Adam Ivansky, Data Engineering Developer in Toronto, ON, Canada

Member since September 15, 2018
Adam has six years of experience in data engineering and data science. His tools of choice include Python 3, Spark, and SQL. His main focus areas include ETLs and machine learning marketing pipelines. Adam communicates effectively with both highly technical and non-technical specialists.

Portfolio

  • Fortune 500 Company
    Jenkins, AWS CLI, AWS S3, Redshift, Python 3, Spark, AWS EMR
  • eBay
    TensorFlow, Scikit-learn, Tableau, PySpark, Apache Hive, Python, Teradata SQL
  • Valeo
    Protocol Buffers, Intel TBB, C++, OpenCV, SQL, MATLAB, Python

Location

Toronto, ON, Canada

Availability

Part-time

Preferred Environment

VS Code, Linux, Windows, iOS

The most amazing...

...project I have worked on was the development of an image recognition algorithm for self-driving vehicles.

Employment

  • Data Engineer

    2019 - PRESENT
    Fortune 500 Company
    • Developed an ETL pipeline based on PySpark running on AWS EMR for the extraction of data from Redshift to S3.
    • Contributed to a product recommendation engine based on Spark ML.
    • Developed a data quality assessment tool.
    • Managed EMR cluster creation/termination in AWS CLI and AWS console.
    • Automated a marketing pipeline in Jenkins.
    • Contributed to the algorithm for identifying new prospective members based on third-party data.
    Technologies: Jenkins, AWS CLI, AWS S3, Redshift, Python 3, Spark, AWS EMR
  • Senior Database Marketing Analyst

    2017 - 2018
    eBay
    • Developed targeting scripts for flagship marketing campaigns with an emphasis on email, mobile push notification, social, and on-site channels. The campaigns often targeted over 50 million users and sometimes resulted in over $100,000 in iGMB annually.
    • Designed, developed, implemented, and maintained multi-armed bandit algorithms written in Python while adhering to marketing standards and processes within eBay. These algorithms were measured to generate $5 million annually.
    • Trained an algorithm for send-time optimization. This resulted in a 15% increase in click-through rate in campaigns where it was implemented.
    • Assessed existing email, social, and mobile marketing campaigns in terms of KPIs such as iGMB, OR, and CTR.
    • Created dashboards in Tableau that reported on the performance of the different marketing algorithms I created.
    • Created scripts that moved data between Hive and Teradata servers.
    • Worked with the largest Teradata DWH in the world and often queried tables with 100+ billion rows.
    • Communicated with stakeholders across multiple time zones.
    Technologies: TensorFlow, Scikit-learn, Tableau, PySpark, Apache Hive, Python, Teradata SQL
  • Machine Learning SW Developer

    2016 - 2017
    Valeo
    • Developed and trained a machine vision algorithm for recognition of pedestrians in front of a vehicle. The algorithm has since been implemented in a number of vehicle models including the GM 2019 Chevy.
    • Trained an algorithm for detection of dirt on the camera lens. This algorithm had a crucial role in supporting other, more complex self-driving functionalities.
    • Assessed the quality of unstructured annotated video data used for algorithm training.
    • Created a script for synchronization of both structured and unstructured data between multiple teams that participated in the project.
    • Attended computer science conferences and studied scientific literature to keep up to date with new trends in machine learning and computer science. Exchanged knowledge with other team members.
    • Communicated and networked with teammates and stakeholders from France and Ireland.
    Technologies: Protocol Buffers, Intel TBB, C++, OpenCV, SQL, MATLAB, Python
  • Credit Risk Analyst

    2014 - 2015
    Erste Group
    • Calculated risk parameters CCF, LGD, and PD according to Basel II.
    • Reduced the overall reserve requirements of Erste Bank subsidiaries by over 7% by improving the statistical engine used to calculate the risk parameters CCF, LGD, and PD.
    • Designed and trained a mathematical model in SAS for prediction of the overall loss in the event of a client default. This helped Erste improve the repossession process and reduce expenses.
    • Performed ad-hoc stress tests for Erste subsidiaries. The results were later submitted directly to the European National Bank.
    • Assessed risk portfolio stability via bootstrapping and Monte Carlo methods.
    • Created interactive dashboards for risk parameter reporting in MS SQL and Excel.
    • Developed a data quality testing system.
    Technologies: Microsoft Excel, MATLAB, Microsoft SQL Server, SAS
  • Teaching and Research Assistant

    2012 - 2014
    University of Rochester
    • Led lab lectures for undergraduate students.
    • Developed software for automation of experiments and analyzed data produced by the experiments.
    • Wrote several scientific papers that are available online.
    Technologies: MATLAB

Experience

  • eBay App Push Notification Send Time Optimization Project (Development)

    - The aim of the project was to improve the click-through rates of mobile push notifications
    - The introduction of the algorithm resulted in a 15% improvement in the mobile push notification click-through rate
    - I decided to achieve this by developing a machine learning algorithm that predicted the optimum contact time for every user
    - The algorithm was developed in Python and was trained using scikit-learn
    - Obtaining training data required the use of Hive and PySpark
    - I implemented the algorithm in the marketing production environment and instructed marketing analysts on how to use it

  • Model for Dynamic Content Optimization and Customization (Development)

    - The aim of the project was to increase the click-through rate of eBay coupon campaigns via the use of machine learning
    - The development of the algorithm was successful, and it was measured to generate a 20% lift in click-through rate and iGMB
    - The early version of the algorithm was based on a multi-armed bandit. Later versions used a contextual, NLP-based multi-armed bandit
    - The algorithm was developed using a combination of Teradata SQL and Python
    - I also developed an interactive Tableau dashboard to monitor the algorithm and to measure the KPI lift it delivered
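
As an illustration of the non-contextual technique named above, a multi-armed bandit can be sketched in a few lines of Python. This is a minimal epsilon-greedy sketch, not the algorithm used at eBay; the class name, the `simulate` helper, the coupon variant names, and the click rates are all hypothetical.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over a fixed set of arms (hypothetical helper)."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in self.arms}
        self.mean_reward = {arm: 0.0 for arm in self.arms}

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.mean_reward[arm])

    def update(self, arm, reward):
        # Incremental update of the running mean reward for this arm.
        self.pulls[arm] += 1
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / self.pulls[arm]


def simulate(true_click_rates, rounds=5000, epsilon=0.1, seed=0):
    """Run the bandit against simulated click rates (made-up numbers)."""
    bandit = EpsilonGreedyBandit(true_click_rates, epsilon=epsilon, seed=seed)
    rng = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.select_arm()
        clicked = 1.0 if rng.random() < true_click_rates[arm] else 0.0
        bandit.update(arm, clicked)
    return bandit


if __name__ == "__main__":
    # Hypothetical coupon variants with assumed click-through rates.
    result = simulate({"coupon_a": 0.02, "coupon_b": 0.05, "coupon_c": 0.03})
    print(result.pulls)
```

A contextual variant would additionally condition the arm choice on user or message features, which this sketch leaves out.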

  • Model for Pedestrian Detection Intended for Self-driving Vehicles (Development)

    - The aim of the project was to develop a machine vision algorithm capable of detecting pedestrians in front of a vehicle by analyzing the input from the vehicle camera
    - The algorithm is now fully functional and is embedded into a number of newer vehicle models including the GM 2019 Chevy
    - The machine learning algorithm we decided to use was an AdaBoost cascade classifier combined with a deep neural network
    - We wrote the training application from scratch in C++. Training had to be multithreaded in order to be efficient.
    - Testing and validation were done in Python
    - A large database of annotated video data was used for algorithm training

  • Model for Prediction of Loss Given Default (Development)

    - Precise prediction of the total final loss after the default of a client is key to reducing the risk associated with different loan products
    - I developed a model that relied on the loan-to-value ratio and the value of the collateral
    - Development was done using a combination of SAS and MS SQL
    - Development of the model required extensive cleaning of the data and data quality testing

  • Product Recommendation Algorithm (Development)

    - Involved development of a recommendation engine based on a collaborative filtering model
    - The engine was capable of recommending products that a given customer had not bought before
    - The solution was implemented in PySpark and was based on Spark ML
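
To show how collaborative filtering can suggest products a customer has never bought, here is a minimal plain-Python sketch. It swaps in a simple item-based cosine-similarity approach and does not use Spark ML or reproduce the engine described above; the function names and the sample purchase data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(purchases):
    """Cosine similarity between items, from a {customer: set(items)} mapping."""
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)
    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            overlap = len(buyers[a] & buyers[b])
            if overlap:
                sims[a][b] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims


def recommend(purchases, customer, top_n=3):
    """Rank items the customer has not bought by similarity to items they own."""
    sims = item_similarities(purchases)
    owned = purchases[customer]
    scores = defaultdict(float)
    for item in owned:
        for other, sim in sims[item].items():
            if other not in owned:
                scores[other] += sim
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


if __name__ == "__main__":
    # Made-up purchase history for illustration only.
    history = {
        "u1": {"tent", "stove"},
        "u2": {"tent", "stove", "lamp"},
        "u3": {"tent", "lamp"},
        "u4": {"book"},
    }
    print(recommend(history, "u1"))
```

Here customer `u1` never bought a lamp, but other customers who bought the same items did, so the lamp is the item that gets recommended.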

  • ETL for Recommendation Algorithm (Development)

    - Developed an ETL in PySpark for transfer of data from Redshift into an S3 data lake
    - Developed code for customer-level data aggregation and historization
    - Assessed data quality, investigated and remediated data quality issues

Skills

  • Languages

    SQL, Python 3, Python 2, C++14, Python, C++, SAS
  • Frameworks

    AWS EMR, Spark, Hadoop
  • Libraries/APIs

    PySpark, Sklearn, Scikit-learn, TensorFlow, OpenCV, Intel TBB, AWS EC2 API
  • Tools

    Apache Airflow, Git, Spark SQL, AWS Glue, Bitbucket, VS Code, Tableau, MATLAB, Microsoft Excel, Jenkins, AWS CLI
  • Paradigms

    Unit Testing, Agile, Continuous Integration (CI)
  • Storage

    AWS S3, Teradata, Redshift, Microsoft SQL Server, Apache Hive
  • Industry Expertise

    Marketing
  • Other

    Data Analytics, Data Engineering, Recommendation Systems, Machine Learning, Data Quality Analysis, Deep Learning, Teradata SQL, Protocol Buffers, ETL Tools
  • Platforms

    iOS, Windows, Linux, AWS EC2, Spark Core

Education

  • Master of Science degree in Physics
    2012 - 2014
    University of Rochester - New York, USA
  • Bachelor's degree in Physics
    2008 - 2012
    National University of Ireland, Galway - Galway, Ireland
