Karen Danielyan, Developer in Tallinn, Estonia

Karen Danielyan

Verified Expert  in Engineering

Data Scientist and Developer

Location
Tallinn, Estonia
Toptal Member Since
July 6, 2022

Karen is an experienced data scientist and machine learning engineer with a solid background in mathematics and statistics and a proven track record in the fintech and ad-tech industries. Throughout his career, he has taken on a wide range of projects, including building and deploying models for risk scoring and advertisement-exchange optimization pipelines. Karen is keen on handling big data and machine learning pipelines from data integration through final model deployment.

Portfolio

NIC MAP Vision LLC
Python, Data Science, SQL, Data Analysis, Machine Learning, NumPy, TensorFlow...
First Community Bank - Main
Machine Learning, Python, SQL, Microsoft SQL Server, Credit Risk
Tadpull, Inc.
Python, Azure, Real-time Data, Time Series Analysis

Experience

Availability

Part-time

Preferred Environment

PyCharm, SQL Server Management Studio, Linux, Windows, macOS

The most amazing...

...solution I've developed is a credit-risk machine learning pipeline that significantly reduced default rates.

Work Experience

Data Scientist | ML Engineer

2022 - 2024
NIC MAP Vision LLC
  • Created a regression model to predict senior housing trends, then packaged this into an API.
  • Developed a custom KNN model with tailored distance metrics, including geographic calculations, implemented on top of NumPy to process large datasets efficiently and accurately.
  • Enhanced a TensorFlow-based NLP model and built integrations with OpenAI GPT and Embedding APIs for new use cases.
Technologies: Python, Data Science, SQL, Data Analysis, Machine Learning, NumPy, TensorFlow, PyTorch, Pandas, Jupyter, Docker, Git, OpenAI GPT-4 API, OpenAI GPT-3 API, LangChain, Embeddings from Language Models (ELMo)
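The custom geographic KNN mentioned above can be sketched in NumPy roughly as follows. This is a minimal illustration, not the production model: the haversine distance, the sample coordinates, and the averaging of neighbor values are all assumptions for the sake of the example.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_matrix(query, points):
    """Pairwise great-circle distances (km) between query and reference points.

    Both arguments are arrays of [latitude, longitude] rows in degrees.
    """
    q = np.radians(np.atleast_2d(query))[:, None, :]   # shape (m, 1, 2)
    p = np.radians(np.atleast_2d(points))[None, :, :]  # shape (1, n, 2)
    dlat = p[..., 0] - q[..., 0]
    dlon = p[..., 1] - q[..., 1]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(q[..., 0]) * np.cos(p[..., 0]) * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))  # shape (m, n)

def knn_predict(query, points, values, k=3):
    """Average the values of the k geographically nearest reference points."""
    dist = haversine_matrix(query, points)
    nearest = np.argsort(dist, axis=1)[:, :k]           # indices of k closest
    return np.asarray(values)[nearest].mean(axis=1)

# Toy example: three reference sites with known metric values.
sites = np.array([[59.437, 24.754],   # Tallinn
                  [58.378, 26.729],   # Tartu
                  [56.949, 24.106]])  # Riga
vals = np.array([100.0, 80.0, 60.0])
pred = knn_predict(np.array([[59.0, 25.0]]), sites, vals, k=2)
```

Broadcasting the query and reference arrays against each other keeps the distance computation vectorized, which is what makes a NumPy-level implementation practical on large datasets.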

Machine Learning Developer

2023 - 2023
First Community Bank - Main
  • Developed a machine learning pipeline for credit risk scoring.
  • Segmented customers by default risk and profitability.
  • Performed ad-hoc analysis of credit risk features and profitability data.
Technologies: Machine Learning, Python, SQL, Microsoft SQL Server, Credit Risk

Data Scientist

2022 - 2023
Tadpull, Inc.
  • Developed and implemented anomaly detection models, accounting for historical data and seasonal traffic changes, resulting in real-time alerts for ad spend, return on ad spend, product sell-through rate, and high-traffic pages with low transaction rates.
  • Created propensity models to predict customer purchases, repurchases, and churn using individual attributes/actions (e.g., past purchases, location, website behavior), enhancing targeted marketing efforts.
  • Created forecasting models to accurately predict website traffic, revenue, and projected sellout dates, considering seasonality and enabling better inventory management and marketing strategies.
Technologies: Python, Azure, Real-time Data, Time Series Analysis
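A seasonality-aware anomaly detector of the kind described above can be sketched with a per-phase baseline: each observation is compared against the mean and spread of all observations sharing its position in the seasonal cycle. The weekly period, z-score threshold, and toy data are illustrative assumptions; the production models and alerting are not shown here.

```python
import numpy as np
import pandas as pd

def seasonal_anomalies(series, period=7, z_threshold=3.0):
    """Flag points that deviate strongly from their seasonal baseline.

    series: pd.Series of observations; period: season length in observations.
    Returns a boolean Series marking anomalies.
    """
    phase = np.arange(len(series)) % period
    baseline = series.groupby(phase).transform("mean")
    spread = series.groupby(phase).transform("std").replace(0, np.nan)
    z = (series - baseline) / spread
    return z.abs() > z_threshold

# Toy example: 20 weeks of daily traffic with a weekly pattern and one spike.
rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=140, freq="D")
weekly = np.tile([10, 12, 11, 13, 15, 30, 28], 20).astype(float)
traffic = pd.Series(weekly + rng.normal(0, 0.5, 140), index=idx)
traffic.iloc[20] += 25                    # simulate an ad-spend spike
flags = seasonal_anomalies(traffic, period=7)
```

Without the per-phase grouping, the naturally high weekend values would be flagged as anomalies; grouping by phase lets only deviations from the expected seasonal level trigger an alert.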

Machine Learning Engineer

2022 - 2022
Verve Group
  • Maintained a framework for processing daily data from 120 billion auctions across partner companies and their associated demand and supply partners.
  • Built a framework to train traffic-shaping and bid-optimizer models and deploy them on the ad-server hourly.
  • Conducted ad-hoc performance analysis, gathering insights from a vast dataset.
Technologies: Python, Spark ML, Flask, GitHub, Apache Airflow, Docker, Kubernetes, Machine Learning, Big Data Architecture, Redis, Amazon Web Services (AWS), PyCharm, Linux, Statistical Modeling, Data Science, SQL, Git, Algorithms, Jupyter Notebook, Data Analysis, Data Visualization, Machine Learning Operations (MLOps), Data Mining, Jupyter

Data Scientist

2019 - 2022
Bondora
  • Built, deployed, and maintained a machine learning pipeline from data integration to a Docker image and API service in production.
  • Performed ad-hoc data mining, hypothesis testing, and insight gathering for management decision-making.
  • Analyzed A/B testing results using statistical methods.
Technologies: Python, SQL, Spark ML, Dask, BentoML, GitLab CI/CD, Docker, Kubernetes, Git, Machine Learning, Data Science, Statistical Modeling, Artificial Intelligence (AI), Natural Language Toolkit (NLTK), GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Amazon Web Services (AWS), Spark, Financial Modeling, Mathematics, PyCharm, Neural Networks, Deep Learning, Statistics, Algorithms, Jupyter Notebook, Data Analysis, A/B Testing, Data Visualization, Probability Theory, Machine Learning Operations (MLOps), Time Series, Time Series Analysis, Data Mining, Data Modeling, Data Reporting, Databricks, Jupyter
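Analyzing A/B test results with statistical methods, as in the last bullet, typically comes down to a hypothesis test on the difference between variants. A minimal sketch, assuming a conversion-rate experiment and a two-sided two-proportion z-test (the concrete metrics and sample sizes are illustrative):

```python
from math import erf, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between variants."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)            # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Toy example: variant B converts 120/1000 versus A's 100/1000.
z, p = two_proportion_z_test(100, 1000, 120, 1000)
```

Here the observed lift (12% vs. 10%) gives a p-value well above 0.05, so with these sample sizes the test would not reject the null hypothesis of equal conversion rates.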

Data Scientist

2020 - 2020
Deloitte Audit Analytics
  • Assisted in developing ranking algorithms for companies based on random forest and clustering.
  • Performed data migration from Azure to GCP, created a sustainable database structure, data processing pipelines, and ETL, and developed analytical tools.
  • Gathered ad-hoc analytical insights from collected data.
Technologies: Azure, Python, Google Cloud, SQL, Machine Learning, Data Science, Data Analytics, Google BigQuery, Financial Modeling, PyCharm, Statistical Modeling, Statistics, Stochastic Modeling, Derivative Pricing, Git, GitHub, Jupyter Notebook, Data Analysis, Data Visualization, Data Mining, Data Modeling, Data Reporting, Jupyter
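A company-ranking approach combining random forests and clustering, as in the first bullet, could look roughly like the sketch below: cluster companies into peer groups, then rank within each group by a random-forest-predicted score. The features, the target, and the within-cluster ranking scheme are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)

# Toy financial features for 200 companies; the target "health score" is
# an assumed function of the features, for illustration only.
X = rng.normal(size=(200, 4))
score = X[:, 0] * 2 - X[:, 1] + rng.normal(0, 0.1, 200)

# Cluster companies into peer groups, then rank within each group by the
# random forest's predicted score.
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, score)
predicted = model.predict(X)

ranking = {}
for c in np.unique(clusters):
    members = np.flatnonzero(clusters == c)
    ranking[int(c)] = members[np.argsort(-predicted[members])]  # best first
```

Ranking within clusters rather than globally keeps the comparison among structurally similar companies, which is usually the point of combining a clustering step with the scoring model.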

Data Specialist

2017 - 2019
One Market Data
  • Performed tick data collection, processing, and storing for equity and futures markets and created a final sellable product with the Quant Data team.
  • Covered the data processing for more than 200 exchanges.
  • Developed various SQL stored procedures and functions for managing a constant data flow.
  • Supported the normalization of highly granular exchange data and its processing to fit clients' needs, such as backtesting trading strategies, validating against composite exchange prices, and building order books.
Technologies: MySQL, Linux, Shell, Python, Perl, Financial Modeling, Big Data Architecture, SQL, Git, GitHub, Algorithms, Trading, Jupyter Notebook, Data Analysis, Data Visualization, Data Mining, Jupyter

Projects

Credit Risk Modeling Pipeline

An API solution that scores loan applicants' creditworthiness based on several data sources, including user input variables, credit bureau data, and bank statement transactions. The bank statement component includes an NLP model that predicts expense and income categories from plain-text transaction descriptions. The project covers the initial data ETL, feature engineering, modeling, evaluation, and deployment.
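Categorizing transactions from raw descriptions is a standard text-classification task. A minimal sketch using scikit-learn (which appears in the skills list); the training rows, category labels, and the TF-IDF-plus-logistic-regression setup are assumptions, not the actual model:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative training set; a production model would use
# thousands of labeled transactions.
descriptions = [
    "SALARY PAYMENT ACME OU", "WAGE TRANSFER MARCH",
    "RIMI SUPERMARKET TALLINN", "GROCERY STORE PURCHASE",
    "RENT PAYMENT APARTMENT", "MONTHLY RENT LANDLORD",
]
categories = ["income", "income", "groceries", "groceries", "housing", "housing"]

# Character n-grams cope well with merchant-name fragments and typos
# common in bank statement text.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(descriptions, categories)

pred = model.predict(["SALARY ACME"])[0]
```

Character-level n-grams are a deliberate choice here: bank descriptions are full of truncated merchant names and reference codes that word-level tokenization handles poorly.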

Traffic Shaping Framework

The project shapes ad traffic at the auction level. The demand and supply sides trade ads in online auctions that take place around 120 billion times a day. Using machine learning algorithms and data processing tools, the framework determines the most probable bidders for each ad impression, so that cloud traffic is not wasted on bid requests to bidders unlikely to bid given the ad characteristics. The project is developed mainly in Python.
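The core filtering idea can be sketched as follows. This toy version estimates per-(bidder, ad-characteristics) bid rates from historical outcomes; the production system trains ML models for this, and the feature keys, thresholds, and bidder names below are illustrative assumptions.

```python
from collections import defaultdict

class TrafficShaper:
    """Skip bid requests to bidders unlikely to bid on a given ad slot."""

    def __init__(self, min_bid_rate=0.05, min_observations=100):
        self.min_bid_rate = min_bid_rate
        self.min_observations = min_observations
        self.counts = defaultdict(lambda: [0, 0])   # key -> [bids, requests]

    @staticmethod
    def _key(bidder, ad):
        # Bucket auctions by the ad features assumed relevant to the bidder.
        return (bidder, ad["country"], ad["format"])

    def record(self, bidder, ad, did_bid):
        stats = self.counts[self._key(bidder, ad)]
        stats[0] += int(did_bid)
        stats[1] += 1

    def should_send(self, bidder, ad):
        bids, requests = self.counts[self._key(bidder, ad)]
        if requests < self.min_observations:
            return True                             # explore while data is sparse
        return bids / requests >= self.min_bid_rate

shaper = TrafficShaper(min_bid_rate=0.05, min_observations=10)
ad = {"country": "EE", "format": "banner"}
for _ in range(20):
    shaper.record("dsp_a", ad, did_bid=False)   # dsp_a never bids on this slot
    shaper.record("dsp_b", ad, did_bid=True)    # dsp_b always bids
```

The explore-while-sparse branch matters: without it, a new bidder or a new ad segment would never accumulate the observations needed to prove it is worth the traffic.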

Sustainable Finance Project

The project aims to find a connection between global companies' carbon footprint and market risk. This Climate Finance (Sustainable Finance) project includes a compound data pipeline that draws on several partner-company databases, extracts the relevant data, transforms it, and loads it into Google BigQuery. The second part of the project provides a decision-making platform that leverages machine learning models to find relationships, and potential causal links, between several ESG parameters of the companies and their balance sheets and cash flow statements, ultimately estimating the market risk for those companies.

Customer Segmentation for Targeted Marketing Strategies

Developed a tree-based segmentation pipeline to optimize marketing efforts by identifying and targeting high-value customer segments. This project involved extensive feature engineering, implementing a grid search, and creating a reusable pipeline. The result was a highly effective tree traversal system that pinpointed the best and worst customer segments for retargeting, enabling the company to allocate resources more efficiently and drive higher returns on marketing investments.
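The tree-traversal step described above can be sketched with scikit-learn: fit a shallow decision tree on customer value, then walk its internal structure to turn each leaf into a human-readable segment rule ranked by mean value. The customer features, target, and tree parameters are assumptions for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)

# Toy customer data: [recency_days, past_orders]; value is assumed higher
# for recent, frequent customers, purely for illustration.
X = np.column_stack([rng.integers(1, 365, 500), rng.integers(0, 20, 500)])
y = 100.0 / (1 + X[:, 0] / 30) + 5.0 * X[:, 1] + rng.normal(0, 2, 500)

tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=30, random_state=0)
tree.fit(X, y)

def leaf_segments(tree, feature_names):
    """Traverse the fitted tree and return (rule, mean_value, size) per leaf."""
    t = tree.tree_
    segments = []

    def walk(node, rule):
        if t.children_left[node] == -1:              # leaf node
            segments.append((" and ".join(rule) or "all customers",
                             float(t.value[node][0][0]),
                             int(t.n_node_samples[node])))
            return
        name, thr = feature_names[t.feature[node]], t.threshold[node]
        walk(t.children_left[node], rule + [f"{name} <= {thr:.1f}"])
        walk(t.children_right[node], rule + [f"{name} > {thr:.1f}"])

    walk(0, [])
    return sorted(segments, key=lambda s: s[1], reverse=True)

segments = leaf_segments(tree, ["recency_days", "past_orders"])
best, worst = segments[0], segments[-1]
```

Because each leaf carries both a rule and a sample count, the best and worst segments come out directly actionable: a marketing team can read the rule, check the segment is large enough, and retarget accordingly.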
Education

2018 - 2020

Master's Degree in Mathematics and Statistics

University of Tartu - Tartu, Estonia

2012 - 2018

Bachelor's Degree in Mathematics and Statistics

Yerevan State University - Yerevan, Armenia

Languages

Python, SQL, R, Perl

Libraries/APIs

Scikit-learn, Pandas, NumPy, Keras, Spark ML, Dask, Natural Language Toolkit (NLTK), TensorFlow, PyTorch

Paradigms

Data Science, ETL

Other

Machine Learning, Statistical Modeling, Statistics, Probability Theory, Predictive Modeling, Regression, Mathematics, Deep Learning, Stochastic Modeling, Derivative Pricing, Calculus, Linear Algebra, Analytical Geometry, Differential Equations, Time Series Analysis, Stochastic Differential Equations, Data Analytics, Artificial Intelligence (AI), Financial Modeling, Data Analysis, A/B Testing, Machine Learning Operations (MLOps), Time Series, Data Mining, Data Modeling, Data Reporting, OpenAI GPT-3 API, Neural Networks, Big Data Architecture, Cloud Computing, Option Pricing, BentoML, Natural Language Processing (NLP), Optimization, Google BigQuery, Algorithms, Trading, Data Visualization, GPT, Generative Pre-trained Transformers (GPT), Credit Risk, Real-time Data, OpenAI GPT-4 API, LangChain, Embeddings from Language Models (ELMo)

Tools

PyCharm, SQL Server Management Studio, Jupyter, MATLAB, GitLab CI/CD, Git, GitHub, Apache Airflow, Shell

Platforms

Jupyter Notebook, Linux, Docker, Kubernetes, Azure, Windows, macOS, Google Cloud Platform (GCP), Amazon Web Services (AWS), Databricks

Storage

MySQL, Redis, Google Cloud, Microsoft SQL Server

Frameworks

Flask, Spark
