Ovchinnikov Grigoriy, Developer in Tbilisi, Georgia

Ovchinnikov Grigoriy

Verified Expert in Engineering

Data Scientist and Developer

Tbilisi, Georgia

Toptal member since July 14, 2022

Bio

Ovchinnikov is a data scientist with over three years of data analysis and machine learning experience. He has strong knowledge of classical ML algorithms and statistics and is proficient with the ML frameworks and databases used to solve problems. Ovchinnikov is also well-versed in handling big data for modeling and data analysis.

Portfolio

Platforma
Hadoop, Supervised Machine Learning, Deep Neural Networks (DNNs), PyTorch...
S7 Airlines
Apache Airflow, PySpark, PostgreSQL, Machine Learning, Unsupervised Learning...

Experience

  • Data Science - 5 years
  • Python - 5 years
  • Unsupervised Learning - 4 years
  • Machine Learning - 4 years
  • Supervised Machine Learning - 4 years
  • PySpark - 3 years
  • Advertising Technology (Adtech) - 2 years

Availability

Part-time

Preferred Environment

PyCharm, Jupyter Notebook, GitLab, Ubuntu

The most amazing...

...projects I've worked on include solutions to several problems that required thinking outside the box and had a significant positive impact on product development.

Work Experience

Middle Data Scientist

2020 - PRESENT
Platforma
  • Implemented state-of-the-art methods for a look-alike modeling task with positive-unlabeled (PU) learning using PySpark and PyTorch, yielding a significant uplift in offline and online metrics.
  • Built an ML model for labeling clickstream data with target device information using a weak supervision paradigm, which increased accuracy and lowered inference time on telecom data covering hundreds of millions of devices, using PySpark and Hadoop.
  • Suggested and implemented an NLP-inspired approach to website (host) representation learning based on clickstream data, which offered several advantages over the previous approach.
  • Developed a method for generating user interest vectors based on clickstream data embeddings.
Technologies: Hadoop, Supervised Machine Learning, Deep Neural Networks (DNNs), PyTorch, Positive-unlabeled (PU) Learning, Analytics, Personalization, Advertising Technology (Adtech), Scikit-learn, MongoDB, SQL, Interpretation, Gradient Boosting, Unsupervised Learning, PyCharm, Jupyter Notebook, PySpark, Python, Data Science

Junior Data Scientist

2018 - 2020
S7 Airlines
  • Built a decision support system based on clustering and semi-supervised multiclass classification using a traditional "tabular" machine learning stack.
  • Created a pipeline for scraping external open data with Python using mainly Scrapy, Airflow, and PostgreSQL.
  • Worked on a predictive maintenance solution for remaining useful life estimation.
  • Contributed to a big data tech stack by performing an exploratory data analysis and building MVP projects.
  • Built supervised ML regression models for predicting the number of sold tickets for flights.
Technologies: Apache Airflow, PySpark, PostgreSQL, Machine Learning, Unsupervised Learning, Supervised Machine Learning, Amazon S3 (AWS S3), NumPy, Pandas, Scikit-learn, Git, Docker, Web Scraping, Scrapy, PyCharm, Jupyter Notebook, MongoDB, Data Science, Python

Experience

ID R&D Facial Antispoofing Challenge

The competition's goal was to build a model that predicts which class an image of a person belongs to: original frames (real) or frames captured by filming screens with different mobile devices (spoof).

I built a solution based on Siamese neural networks that achieved a ROC AUC score greater than 0.99. My approach ranked among the top six solutions.

Catch Blogger

Acted as a developer for a YouTube influencer marketing platform. I was responsible for the whole data management pipeline, including extraction, transformation, and loading.
Pipeline stack: Python, Puppeteer, Google Cloud Platform (GCP) including Cloud Functions, Cloud Scheduler, and Cloud Storage.

Education

2014 - 2018

Bachelor's Degree in Computer Science

Volgograd State Technical University - Volgograd, Russia

Certifications

OCTOBER 2020 - PRESENT

Drawing Inferences from Data

Moscow Institute of Physics and Technology | Yandex | E-learning Development Fund

Skills

Libraries/APIs

Scikit-learn, NumPy, Pandas, PySpark, PyTorch

Tools

Apache Airflow, PyCharm, Git, GitLab

Languages

Python, SQL

Platforms

Jupyter Notebook, Docker, Ubuntu

Frameworks

Hadoop, Scrapy

Storage

PostgreSQL, Amazon S3 (AWS S3), MongoDB, Google Cloud

Paradigms

Agile Software Development, ETL

Other

Machine Learning, Gradient Boosting, Data Science, Unsupervised Learning, Supervised Machine Learning, Deep Neural Networks (DNNs), Positive-unlabeled (PU) Learning, Advertising Technology (Adtech), Interpretation, Web Scraping, Programming, Informatics, Applied Mathematics, Analytics, Statistics, Data Analysis, Hypothesis Testing, Personalization, Deep Learning, Computer Vision Algorithms, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT)
