Artem Ryzhikov, Developer in Batumi, Adjara, Georgia

Artem Ryzhikov

Verified Expert in Engineering

Machine Learning Developer

Location
Batumi, Adjara, Georgia
Toptal Member Since
February 22, 2022

Artem holds a PhD in machine learning (ML) and has seven years of experience in data structures as well as six years of ML research, two years of working at tech startups, and four years of team management. As a senior ML engineer, he has built a CV algorithm for a mobile app that reached second place in the App Store in a week with 1.5 million downloads. Artem has also worked on various other projects, including recommendation and search systems, text processing, and conventional data science.

Portfolio

National Research University Higher School of Economics
Python, Computer Vision, Generative Adversarial Networks (GANs)...
Snapchat
Computer Vision, OpenCV, TensorFlow, Python, C++
Double Data
Scala, Python, Spark, PySpark, Machine Learning, Search, Social Networks...

Experience

Availability

Part-time

Preferred Environment

Linux, Vim Text Editor, Jupyter Notebook, Git

The most amazing...

...project I've developed is a mobile app that reached second place in the App Store within a week. It was sold to Snapchat along with the team.

Work Experience

Research Fellow

2017 - PRESENT
National Research University Higher School of Economics
  • Conducted machine learning (ML) research in anomaly detection, time series, and domain adaptation. Designed and implemented new algorithms and compared them with existing methods.
  • Wrote scientific articles about my new ML algorithms. Became the primary author of six impactful ML articles in top ML journals and a member of the Large Hadron Collider beauty (LHCb) collaboration.
  • Conducted ML lectures and seminars. Co-authored several ML, deep learning (DL), and generative adversarial network (GAN) courses at Coursera and university. Developed communication skills by giving public speeches to a broad audience.
  • Designed and implemented a novel model-agnostic anomaly augmentation technique for tabular and image data that improved ROC AUC by up to 0.08 over state-of-the-art methods on four out of six datasets. It was published in a top ML journal.
  • Designed a novel anomaly detection algorithm for tabular and image data that improved ROC AUC by up to 0.1 over state-of-the-art methods on five out of six datasets. It was published in the Journal of Machine Learning Research.
  • Designed and implemented a novel DL change point detection algorithm for multivariate data. The algorithm boosted the change point score up to eight times compared to the existing state-of-the-art models in six out of six benchmarks.
  • Applied Bayesian sparsification of classification models, making them 16 times faster with no quality decrease. Deployed the C++ model to the LHCb pipeline. The paper was published in conference proceedings.
  • Designed and implemented a domain adaptation technique for effectively fitting DL models on synthetic data without overfitting. The results were published in conference proceedings.
  • Designed and implemented a domain adaptation technique for effectively fitting DL models on a small fraction of domains that existed in the training dataset. The results were published in conference proceedings.
  • Managed three research projects with up to six people in a team. The research included the design of a BERT-based algorithm for semi-supervised topic modeling.
Technologies: Python, Computer Vision, Generative Adversarial Networks (GANs), Bayesian Inference & Modeling, Anomaly Detection, Machine Learning, Deep Learning, PyTorch

Senior Machine Learning Engineer

2017 - 2017
Snapchat
  • Designed and tested semantic segmentation algorithms to separate a person from the background in selfies. Conducted experiments with postprocessing. The segmentation quality, measured as mean intersection-over-union (IoU), improved from 0.93 to 0.98.
  • Worked on neural network speedup and quantization. The existing segmentation models were sped up nine times.
  • Implemented a real-time version of the algorithm to perform the segmentation on video at a speed of over 30 frames per second (FPS).
  • Applied hair coloring using generative models, along with other filters and masks implemented in C++.
  • Helped with the deployment of a neural network to mobile devices. The app reached second place in the App Store in a week with 1.5 million downloads. It was sold to Snapchat along with the team.
Technologies: Computer Vision, OpenCV, TensorFlow, Python, C++
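
For readers unfamiliar with the segmentation metric mentioned above, here is a minimal sketch of mean IoU for binary person/background masks. It assumes masks are nested lists of 0/1 and that the mean is taken per image; the original evaluation may have averaged differently (for example, per class), and the function names are illustrative only.

```python
def iou(pred, target):
    """IoU between two binary masks given as equally shaped nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so define their IoU as 1.0.
    return intersection / union if union else 1.0


def mean_iou(preds, targets):
    """Average the per-image IoU over a collection of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

A value of 0.98 on this scale means the predicted and reference masks overlap almost entirely relative to their combined area.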

Data Scientist | Scala Developer

2016 - 2017
Double Data
  • Implemented the first people search engine in three months working in a small team of three developers. Reached search quality with 54.5% recall at 99.9% precision and gained a lot of experience with test-driven development using JUnit and Mockito.
  • Implemented and optimized Spark jobs for over 200TB data processing. Configured Spark for efficient batch processing.
  • Conducted data analytics tasks for the business. Learned to present and visualize the results.
  • Trained and evaluated ML models for credit scoring, reaching a 40% bad rate.
Technologies: Scala, Python, Spark, PySpark, Machine Learning, Search, Social Networks, Scikit-learn, Elasticsearch, PostgreSQL, Amazon Web Services (AWS), Flask
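
A figure such as "54.5% recall at 99.9% precision" is usually obtained by sweeping a score threshold and keeping the best recall among thresholds that still meet the precision target. The sketch below illustrates that idea under stated assumptions: each candidate match has a confidence score and a 0/1 label, scores are distinct, and the function name is hypothetical rather than taken from the original system.

```python
def recall_at_precision(scores, labels, min_precision):
    """Best recall over all score cutoffs whose precision is >= min_precision."""
    total_positives = sum(labels)
    if total_positives == 0:
        return 0.0
    # Rank candidates from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    best_recall = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        # Precision and recall if everything up to this rank is accepted.
        precision = true_positives / rank
        if precision >= min_precision:
            best_recall = max(best_recall, true_positives / total_positives)
    return best_recall
```

Tightening `min_precision` generally lowers the achievable recall, which is the trade-off a search engine with a very high precision requirement has to accept.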

Data Scientist | Data Engineer

2015 - 2016
VK Group
  • Optimized the existing collaborative filtering (CF) recommendation algorithms, speeding them up by up to five times, which led to a 45% click-through rate (CTR) increase. The product was sold to VK Group along with the team.
  • Designed and implemented an effective user segmentation algorithm for the cold-start problem, which led to a CTR increase of over 70%, making it just 0.09% smaller than CF.
  • Implemented new effective recommendations, sentiment analysis, and topic modeling algorithms.
  • Made almost all the recommendation algorithms personalized and real-time, updating after every user interaction.
  • Refactored the entire legacy Scala code of the recommendation engine, which was around 4,000 lines long.
  • Applied data science analytics for the business and management.
  • Designed a distributed data storage architecture and data flow. Sped up data loading three times.
  • Managed a machine learning research team. Built a sandbox, A/B testing, and CI for testing recommendation algorithms.
Technologies: Python, Scala, Spark, Elasticsearch, Redis, Cassandra, PostgreSQL, Scikit-learn, Amazon Web Services (AWS), DataStax, RabbitMQ

PyTorch ARD

https://github.com/HolyBayes/pytorch_ard
PyTorch Conv2d and linear layers with built-in Bayesian sparsification that can make a neural network up to 300 times faster. The implementation automatically reduces the number of parameters during the training process.

The project is based on the paper "Variational Dropout Sparsifies Deep Neural Networks."
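
As a rough illustration of how this kind of sparsification removes parameters, the sketch below applies the pruning rule described in the referenced paper: each weight has a learned noise variance, and weights whose log(alpha) = log(sigma^2) - log(w^2) exceeds a threshold (3 in the paper) are dropped. This is plain Python for readability and does not use the repository's API; the function names and the small epsilon are assumptions.

```python
import math


def log_alpha(weight, log_sigma2, eps=1e-8):
    """log(alpha) = log(sigma^2) - log(weight^2) for a single weight."""
    return log_sigma2 - math.log(weight * weight + eps)


def prune(weights, log_sigma2s, threshold=3.0):
    """Zero out weights whose log(alpha) exceeds the threshold."""
    return [
        0.0 if log_alpha(w, s) > threshold else w
        for w, s in zip(weights, log_sigma2s)
    ]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)
```

A weight with large noise relative to its magnitude carries little information, so removing it is expected to have little effect on predictions while shrinking the model.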

PyTorch Implementation of TIRE

https://github.com/HolyBayes/TIRE_pytorch
This is an unofficial PyTorch implementation of TIRE. TIRE is an autoencoder-based change point detection algorithm for time series data that uses a time-invariant representation.

More information can be found in the 2020 preprint "Change Point Detection in Time Series Data Using Autoencoders with a Time-invariant Representation."

PyTorch Implementation of KL-CPD

https://github.com/HolyBayes/klcpd
This is an unofficial PyTorch implementation of KL-CPD, an algorithm for time-series change point and anomaly detection.

More information can be found in the 2019 paper "Kernel Change-point Detection with Auxiliary Deep Generative Models."

EM Algorithm with Automatic Relevance Determination

https://github.com/HolyBayes/ard-em
This is a Bayesian modification of the expectation-maximization (EM) algorithm that automatically determines the number of components. The conventional EM algorithm for fitting a mixture of normal distributions does not determine the number of mixture components. ARD EM determines it automatically, using an approach based on the relevance vector method. The idea is to start with a deliberately excessive number of mixture components and then identify the relevant ones by maximizing the evidence.

Languages

Python, Scala, SQL, C++

Libraries/APIs

PyTorch, Pandas, NumPy, Scikit-learn, OpenCV, TensorFlow, PySpark

Tools

Git, RabbitMQ, DataStax

Paradigms

Data Science, Anomaly Detection

Other

Machine Learning, Deep Learning, Computer Vision, Experimental Design, Algorithms, Generative Adversarial Networks (GANs), Bayesian Inference & Modeling, Time Series Analysis, Data Structures, Recommendation Systems, Natural Language Processing (NLP), Domain Adaptation, Search, Social Networks, Generative Pre-trained Transformers (GPT)

Frameworks

Spark, Flask

Platforms

Linux, Amazon Web Services (AWS)

Storage

Elasticsearch, PostgreSQL, Redis, Cassandra

Education

2017 - 2021

PhD Degree in Computer Science

National Research University HSE - Moscow, Russia

2015 - 2017

Master's Degree in Computer Science

National Research University HSE - Moscow, Russia

2011 - 2015

Bachelor's Degree in Physics

Novosibirsk State University - Novosibirsk, Russia

Certifications

MAY 2017 - PRESENT

Yandex School of Data Analysis

Yandex
