Mikhail Gurevich, Developer in Rostov-on-Don, Rostov Oblast, Russia

Mikhail Gurevich

Verified Expert in Engineering

Machine Learning Developer

Location
Rostov-on-Don, Rostov Oblast, Russia
Toptal Member Since
February 9, 2021

Mikhail has a degree in computer science and received a certification from MADE: Academy of Big Data. He has over 10 years of experience handling complex financial data across different industries. Mikhail also has two years of machine learning and data science experience with a focus on neural networks, particularly CV and NLP.

Availability

Part-time

Preferred Environment

MacOS, Visual Studio Code (VS Code), Git, PyTorch, Pandas, Deep Learning, Machine Learning, NumPy, Seaborn

The most amazing...

...model I've developed is an end-to-end solution for car plate recognition in real-life pictures of cars. It incorporates advanced CV and NLP techniques.

Work Experience

Data Science Engineer

2019 - PRESENT
Gremion
  • Developed an MVP of analytical service and took part in the initial hypothesis testing process.
  • Developed a cloud-based real-time data analysis system for data gathered from sensors installed on agricultural equipment.
  • Ensured that models used in the system provide a good basis for the data-driven decision-making process of our clients.
  • Took part in the launch of the whole system with real customers (each customer is an agricultural business).
Technologies: Python, Pandas, PostgreSQL, Amazon Simple Queue Service (SQS), Datadog, Machine Learning, Complex Data Analysis

Chief Finance Officer

2017 - 2020
uKit Group
  • Served as the head of the financial department of a medium-sized IT company.
  • Improved the entire financial reporting process in the group with subsidiaries in three different countries.
  • Communicated with external auditors in the jurisdictions where the audit is obligatory.
Technologies: Complex Data Analysis, Finance, Management

Gremion

A Python-based agrotech project which provides agriculture companies management with information on the quality of soil tillage. I was responsible for the development of an analytical service that:
- Gathered information from a number of sensors installed on the agriculture equipment (plows, cultivators, tractors): GNSS data, accelerometers, gyroscopes, etc.
- Worked with this data as a time series. This step involved filtering out noise and normalizing the data along the time axis (different sensors have different sampling frequencies).
- Analyzed the data using heuristics alongside decision trees and linear classifiers
- Prepared reports for management using Seaborn and Plotly
- Used PostgreSQL as storage for raw, normalized, and processed data
- Used pandas, NumPy, and GeoPy to process the data
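
The time-axis normalization and noise filtering described above can be illustrated with a minimal sketch. It is not the project's code: the function names, the integer-millisecond timestamps, the linear interpolation, and the centered moving average are all assumptions chosen for illustration, and it uses only the Python standard library rather than pandas.

```python
from bisect import bisect_left


def resample_linear(timestamps, values, step):
    """Interpolate a series onto a regular grid starting at timestamps[0].

    Assumes integer timestamps (e.g. milliseconds) in strictly increasing order.
    Returns (grid, resampled_values).
    """
    if len(timestamps) != len(values) or len(timestamps) < 2:
        raise ValueError("need at least two samples with matching lengths")
    n_steps = (timestamps[-1] - timestamps[0]) // step
    grid, resampled = [], []
    for k in range(n_steps + 1):
        t = timestamps[0] + k * step
        # Index of the right-hand neighbour, clamped so that i - 1 is valid.
        i = min(max(bisect_left(timestamps, t), 1), len(timestamps) - 1)
        t0, t1 = timestamps[i - 1], timestamps[i]
        weight = (t - t0) / (t1 - t0)
        grid.append(t)
        resampled.append(values[i - 1] + weight * (values[i] - values[i - 1]))
    return grid, resampled


def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical readings: a fast accelerometer and a slow GNSS channel.
accel_grid, accel = resample_linear([0, 100, 200, 300, 400], [0.0, 1.0, 2.0, 3.0, 4.0], 200)
gnss_grid, gnss = resample_linear([0, 400], [10.0, 18.0], 200)
accel_smooth = moving_average(accel, 3)
```

After resampling, both channels share the same grid, so they can be joined row by row before any classifier sees them.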

PovarGAN

A Python-based application that generates images of food based on a recipe and its list of ingredients. I was responsible for choosing the neural network architecture (GANs) for our model and took part in the experiments during the model training process.

I also proposed and implemented a novel technique based on a few papers from https://arxiv.org/. We used multimodal learning to improve the quality of the generated images. In particular, we built representations of texts and images in one shared feature space and then trained the model using triplet loss to classify (text, image) pairs.

Later on, this classifier was used as an additional term in the generator loss.
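
As an illustration of the triplet loss mentioned above, here is a minimal sketch in plain Python. The choice of Euclidean distance, the margin value, and the roles of the vectors (text embedding as anchor, matching image as positive, non-matching image as negative) are assumptions, not details from the project. PyTorch offers a comparable criterion as `torch.nn.TripletMarginLoss`.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings in the shared text/image space.
text_embedding = [0.0, 0.0]
matching_image = [0.0, 1.0]
other_image = [3.0, 4.0]

well_separated = triplet_loss(text_embedding, matching_image, other_image)
badly_separated = triplet_loss(text_embedding, other_image, matching_image)
```

The loss is zero once the matching pair is closer than the non-matching pair by at least the margin, and positive otherwise.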

Car Plate Recognizer

A PyTorch-based end-to-end model that finds all the car plates in a photo and recognizes the text on them. The dataset contained 25,000 real-life photos of cars in different cities. Each photo could contain several cars.

I made a pipeline which consisted of the following parts:

1) Mask R-CNN (https://arxiv.org/abs/1703.06870), a model that detects all car plates present in the photo.
2) Preprocessing of car plate images with the OpenCV library: adjusting the blur and contrast of the images to normalize them before passing them to OCR.
3) Char-RNN (https://arxiv.org/abs/1706.01069), a model specifically designed for OCR.
4) An additional step to increase OCR quality using knowledge of the car plate's text structure: I implemented and trained a language model based on beam search.
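
To make step 4 concrete, the sketch below shows beam search over per-position character probabilities, with a structure prior that rejects characters violating a plate format. It is only an illustration: the project used a trained language model, while this sketch swaps in a fixed letter/digit template. The names `template_prior` and `beam_search`, the `"L"`/`"D"` template notation, and the sample probabilities are hypothetical.

```python
import math


def template_prior(template):
    """Log-prior: 0 if the character fits the template slot, -inf otherwise.

    The template is a string of 'L' (letter) and 'D' (digit), one per position.
    """
    def logprob(prefix, ch):
        kind = template[len(prefix)]
        fits = ch.isalpha() if kind == "L" else ch.isdigit()
        return 0.0 if fits else -math.inf
    return logprob


def beam_search(step_probs, prior, beam_width=3):
    """Keep the `beam_width` best prefixes after every character position.

    step_probs is a list of dicts mapping character -> OCR probability.
    Returns a list of (text, log_score) pairs, best first.
    """
    beams = [("", 0.0)]
    for probs in step_probs:
        candidates = []
        for prefix, score in beams:
            for ch, p in probs.items():
                if p <= 0:
                    continue
                total = score + math.log(p) + prior(prefix, ch)
                if total > -math.inf:
                    candidates.append((prefix + ch, total))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Hypothetical OCR output where lookalike characters compete.
ocr_probs = [
    {"A": 0.6, "4": 0.4},
    {"I": 0.7, "1": 0.3},
    {"O": 0.55, "0": 0.45},
]
unconstrained = beam_search(ocr_probs, lambda prefix, ch: 0.0)
constrained = beam_search(ocr_probs, template_prior("LDD"))
```

Without the prior, the top hypothesis follows the raw OCR scores. With the template, hypotheses that put a letter in a digit slot are dropped before they can occupy the beam.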

This model achieved a mean Levenshtein distance of 1.05 on the test dataset. As the company disclosed afterward, the test dataset contained car plates from different regions, so the test data had a different distribution from the training data.
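
The reported metric is read here as the average Levenshtein (edit) distance between the predicted and the true plate text, which is an assumption about how it was computed. A minimal sketch of that calculation, with hypothetical function names and sample plates:

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


def mean_levenshtein(predictions, targets):
    """Average edit distance over paired predictions and ground-truth strings."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("need equally sized, non-empty lists")
    total = sum(levenshtein(p, t) for p, t in zip(predictions, targets))
    return total / len(predictions)


# Hypothetical plates: one exact match, one substitution, one missing character.
score = mean_levenshtein(["A123BC", "A128BC", "A12BC"], ["A123BC"] * 3)
```

Under this reading, a value of 1.05 corresponds to roughly one wrong, missing, or extra character per plate on average.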

Languages

Python, C++

Paradigms

Management, Data Science, RESTful Development

Other

Computer Vision, Natural Language Processing (NLP), EDA, Complex Data Analysis, Finance, Generative Pre-trained Transformers (GPT), Deep Learning, Machine Learning, Deep Reinforcement Learning, Statistics, Bayesian Statistics, Applied Mathematics, Artificial Intelligence (AI), Discrete Mathematics, Calculus, Probability Theory, Generative Adversarial Networks (GANs), OCR

Libraries/APIs

PyTorch, Pandas, NumPy, REST APIs

Tools

Git, Seaborn, Amazon Simple Queue Service (SQS)

Platforms

MacOS, Visual Studio Code (VS Code), Amazon Web Services (AWS)

Storage

PostgreSQL, Datadog

2003 - 2008

Bachelor's Degree in Informatics and Applied Mathematics

Southern Federal University - Rostov-on-Don, Russia

DECEMBER 2020 - PRESENT

Data Scientist

MADE: Big Data Academy
