Leon Kozinkin, Developer in Novosibirsk, Russia

Leon Kozinkin

Verified Expert in Engineering

Machine Learning Developer

Novosibirsk, Russia

Toptal member since February 1, 2019

Bio

Leon is a skilled specialist with more than five years of scientific software development experience, a strong mathematical background, and solid knowledge of fundamental CS algorithms. He's passionate about data science, deep learning, image processing, NLP, and big data, and he can develop structured, production-ready solutions from scratch. As a data science expert, Leon is ranked in the top 1% of competitors on Kaggle.

Portfolio

Freelance Work
Amazon Web Services (AWS), MySQL, PostgreSQL, PyTorch, TensorFlow, Keras, Git...
Sobolev Institute of Mathematics
Git, Subversion (SVN), MPI, C++, Pandas, Scikit-learn, Python, MATLAB
Sigma-Pro
Git, Subversion (SVN), Intel MKL, OpenCV, Python, C++, MATLAB

Experience

  • Python - 5 years
  • Image Processing - 4 years
  • Pandas - 4 years
  • Machine Learning - 4 years
  • Data Science - 4 years
  • Scikit-learn - 4 years
  • Generative Pre-trained Transformers (GPT) - 3 years
  • Deep Learning - 3 years

Availability

Part-time

Preferred Environment

PyCharm, Jupyter Notebook, Python, Git, Linux

The most amazing...

...solution I've worked on was a neural network model for salt deposit identification that placed 27th out of 3,234 on Kaggle.

Work Experience

Data Scientist

2015 - PRESENT
Freelance Work
  • Developed real-time computer vision algorithms to track and count customers via surveillance cameras.
  • Implemented back-end RESTful web services utilizing machine learning pipelines.
  • Created deep learning models for image segmentation and classification problems.
  • Designed SQL database architectures and built ORM layers for machine learning pipelines.
  • Implemented Warp-CTC neural networks and applied them to a real-time speech recognition problem.
Technologies: Amazon Web Services (AWS), MySQL, PostgreSQL, PyTorch, TensorFlow, Keras, Git, Natural Language Toolkit (NLTK), Scikit-learn, Pandas, OpenCV, Bottle.py, Flask, Python

Research Assistant

2014 - 2017
Sobolev Institute of Mathematics
  • Researched the self-organization problem for charged particles on a unit sphere and its applications.
  • Studied synchronization processes in chaotic dynamic systems.
  • Created statistical and machine learning models and numerical simulations of the phenomena under study.
  • Optimized and scaled the developed algorithms.
  • Presented results at conferences and wrote articles and reports.
Technologies: Git, Subversion (SVN), MPI, C++, Pandas, Scikit-learn, Python, MATLAB

Research Engineer

2012 - 2016
Sigma-Pro
  • Researched and developed high-performance 3D tomographic particle image velocimetry algorithms.
  • Integrated the algorithms into the company's scientific framework.
  • Processed experimental data and analyzed results.
  • Published novel results in scientific journals.
  • Supported, improved, and optimized the existing algorithmic framework.
Technologies: Git, Subversion (SVN), Intel MKL, OpenCV, Python, C++, MATLAB

Experience

TGS Salt Identification Challenge

https://arxiv.org/abs/1812.01429
Competition and Publication (2018)
Rank: 27/3234 (top 1%)

The aim of the challenge was to identify salt deposits using reflection seismology data. Several novel deep learning techniques were combined in the final solution. The problem review and the proposed solution are discussed in the published article.

N+1 fish, N+2 fish

https://www.drivendata.org/competitions/48/identify-fish-challenge/
Competition (2017)
Rank: 11/463 (top 3%)

The competition required participants to detect fish in the provided video feed, classify the type of each detected fish, and measure some of its characteristics. The difficulty came from the large amount of data, easily confused fish types, and the requirement to detect every single fish exactly once.

The solution consisted of two steps. First, the region of interest (ROI) where fish appear was detected for each video, which dramatically reduced the computational cost. Then R-CNN models were applied.
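A minimal sketch of the ROI step, assuming the region is found by simple frame differencing (the description above does not say how the ROI was actually detected). Frames are plain nested lists of grayscale values, and `motion_roi` and `crop` are hypothetical helper names used only for illustration. A detector would then run on the cropped frames instead of the full image.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]          # grayscale frame as rows of pixel intensities
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def motion_roi(frames: List[Frame], threshold: int = 10) -> Optional[Box]:
    """Bounding box of every pixel that changes by more than `threshold`
    between two consecutive frames. Returns None if nothing moves.
    Assumption: frame differencing is a stand-in for the real ROI detector."""
    top = left = None
    bottom = right = -1
    for prev, curr in zip(frames, frames[1:]):
        for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
            for x, (a, b) in enumerate(zip(row_prev, row_curr)):
                if abs(a - b) > threshold:
                    top = y if top is None else min(top, y)
                    left = x if left is None else min(left, x)
                    bottom = max(bottom, y)
                    right = max(right, x)
    if top is None:
        return None
    return top, left, bottom + 1, right + 1


def crop(frame: Frame, box: Box) -> Frame:
    """Keep only the pixels inside the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


# Toy video: a bright pixel appears at (row 1, col 2), then moves to (row 2, col 3).
f1 = [[0] * 5 for _ in range(4)]
f2 = [row[:] for row in f1]
f2[1][2] = 255
f3 = [row[:] for row in f1]
f3[2][3] = 255

box = motion_roi([f1, f2, f3])   # (1, 2, 3, 4)
small = crop(f3, box)            # 2 x 2 patch instead of the full 4 x 5 frame
```

Cropping every frame to one fixed box per video means the expensive detector processes far fewer pixels, which is the kind of saving the two-step design is after.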

2018 Data Science Bowl

https://www.kaggle.com/c/data-science-bowl-2018
Competition (2018)
Rank: 130/3634 (top 4%)

In this segmentation problem, participants had to identify cell nuclei. Two approaches were considered: U-Net models with pre-trained encoders (a TernausNet-like architecture) and Mask R-CNN models.

Toxic Comment Classification Challenge

https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge
Competition (2017-2018)
Rank: 406/4551 (top 9%)

The aim of this competition was to classify Wikipedia comments according to the level of toxicity. The main issues were poorly balanced classes, some non-English comments, and mislabeled data. We implemented standard NLP models, trained LSTM neural networks, and used word embeddings. The final solution stacked the individual models.
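As a rough illustration of stacking (not the competition code), the sketch below trains a meta-learner on out-of-fold predictions of several base models. Everything in it is an assumption made for the example: the base models are toy logistic regressions that each see one feature, the folds are assigned by index modulo `n_folds`, and the names `fit_logreg`, `on_feature`, and `fit_stack` are hypothetical. In the competition the base models would have been the NLP and LSTM models mentioned above.

```python
import math
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]                # feature vector -> P(y = 1)
Fitter = Callable[[List[List[float]], List[int]], Model]  # training data -> fitted model


def sigmoid(z: float) -> float:
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logreg(X, y, lr=0.5, epochs=500) -> Model:
    """Toy logistic regression trained with per-sample gradient steps."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return lambda x: sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def on_feature(j: int) -> Fitter:
    """Base learner that only sees feature j (stands in for a distinct model)."""
    def fit(X, y):
        model = fit_logreg([[x[j]] for x in X], y)
        return lambda x: model([x[j]])
    return fit


def fit_stack(X, y, base_fitters, n_folds=3) -> Model:
    n = len(X)
    # Out-of-fold predictions: each row is predicted by models that never saw it.
    oof = [[0.0] * len(base_fitters) for _ in range(n)]
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        held = [i for i in range(n) if i % n_folds == k]
        for j, fit in enumerate(base_fitters):
            model = fit([X[i] for i in train], [y[i] for i in train])
            for i in held:
                oof[i][j] = model(X[i])
    meta = fit_logreg(oof, y)                    # meta-learner on base outputs
    bases = [fit(X, y) for fit in base_fitters]  # refit base models on all data
    return lambda x: meta([m(x) for m in bases])


# Toy data: the label is 1 when both features are large.
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.9, 0.8], [1.0, 0.7],
     [0.8, 1.0], [0.3, 0.2], [0.7, 0.9], [0.1, 0.1]]
y = [0, 0, 0, 1, 1, 1, 0, 1, 0]
stacked = fit_stack(X, y, [on_feature(0), on_feature(1)])
```

Training the meta-learner on out-of-fold predictions rather than in-sample ones keeps it from simply trusting whichever base model overfits the training set most.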

Helical Modes in Low- and High-swirl Jets Measured by Tomographic PIV

Publication (2016)
Dmitriy M. Markovich, Vladimir M. Dulin, Sergey S. Abdurakipov, Leonid A. Kozinkin, Mikhail P. Tokarev, Kemal Hanjalić

This is a report on a parallel study of the properties of large-scale vortical structures in low- and high-swirl turbulent jets by means of the time-resolved tomographic particle image velocimetry technique.

Methods for Chaotic Dynamics in Studies of Synchrony in Complex Natural Systems

Publication (2017)
A.N. Bondarenko, M.A. Bondarenko, T.V. Bugueva, L.A. Kozinkin

The wavelet-transform-modulus-maxima (WTMM) method was applied to study pairwise synchrony of irregular fluctuations in insect population size in several localities throughout the United Kingdom.

Education

2015 - 2017

Master's Degree in Computer Science

Yandex School of Data Analysis - Novosibirsk, Russia

2009 - 2015

Master's Degree in Mathematics

Novosibirsk State University - Novosibirsk, Russia

Skills

Libraries/APIs

Scikit-learn, Matplotlib, Pandas, PyTorch, OpenCV, Intel MKL, MPI, Natural Language Toolkit (NLTK), Bottle.py, TensorFlow, Keras, XGBoost, Beautiful Soup

Tools

PyCharm, Git, MATLAB, Microsoft Visual Studio, Scikit-image, Subversion (SVN)

Languages

Python, SQL, C++, Java, JavaScript

Platforms

Jupyter Notebook, Linux, Amazon Web Services (AWS), Windows

Paradigms

Parallel Programming, Concurrent Programming, Object-oriented Programming (OOP)

Frameworks

Flask, LightGBM, Django

Storage

Redshift, PostgreSQL, MySQL, SQLite

Other

Machine Learning, Data Science, Image Processing, Deep Learning, Neural Networks, Algorithms, Mathematics, Statistics, Computer Vision, Back-end, Natural Language Processing (NLP), Big Data, Generative Pre-trained Transformers (GPT)
