
Leon Kozinkin
Verified Expert in Engineering
Machine Learning Developer
Novosibirsk, Russia
Toptal member since February 1, 2019
Leon is a skilled specialist with more than five years of scientific software development experience, a strong mathematical background, and knowledge of fundamental CS algorithms. He's passionate about data science, deep learning, image processing, NLP, and big data. He's also capable of developing structured, production-ready solutions from scratch. As a data science expert, Leon is ranked in the top 1% of competitors on Kaggle.
Experience
- Python - 5 years
- Image Processing - 4 years
- Pandas - 4 years
- Machine Learning - 4 years
- Data Science - 4 years
- Scikit-learn - 4 years
- Generative Pre-trained Transformers (GPT) - 3 years
- Deep Learning - 3 years
Preferred Environment
PyCharm, Jupyter Notebook, Python, Git, Linux
The most amazing...
...solution I've worked on was a neural network model for salt deposit identification that placed 27th out of 3,234 on Kaggle.
Work Experience
Data Scientist
Freelance Work
- Developed computer vision real-time algorithms to track and count customers via surveillance cameras.
- Implemented back-end RESTful web services utilizing machine learning pipelines.
- Created deep learning models for image segmentation and classification problems.
- Designed SQL database architectures and created and provided ORMs to machine learning pipelines.
- Implemented Warp-CTC neural networks and applied them to a real-time speech recognition problem.
Research Assistant
Sobolev Institute of Mathematics
- Researched the self-organization of charged particles on a unit sphere and its applications.
- Studied synchronization processes in chaotic dynamic systems.
- Created statistical and machine learning models and numerical simulations of the studied phenomena.
- Optimized and scaled developed algorithms.
- Presented results at conferences and wrote articles and reports.
Research Engineer
Sigma-Pro
- Researched and developed high-performance 3D tomographic particle image velocimetry algorithms.
- Integrated the algorithms into the company's scientific framework.
- Processed experimental data and analyzed results.
- Published novel results in scientific journals.
- Supported, improved, and optimized the existing algorithmic framework.
Experience
TGS Salt Identification Challenge
https://arxiv.org/abs/1812.01429
Rank: 27/3234 (top 1%)
The aim of the challenge was to identify salt deposits using reflection seismology data. Several novel deep learning techniques were merged in the final solution. The problem review and proposed solution are discussed in the published article.
N+1 fish, N+2 fish
https://www.drivendata.org/competitions/48/identify-fish-challenge/
Rank: 11/463 (top 3%)
The competition required participants to detect fish in the provided video feed, classify the type of each detected fish, and measure some of its characteristics. The difficulty stemmed from the large amount of data, easily confused fish types, and the requirement to detect every fish exactly once.
The solution consisted of two steps. First, the region of interest (ROI) where fish appear was detected for each video, which dramatically reduced the computational complexity. Then R-CNN models were applied.
2018 Data Science Bowl
https://www.kaggle.com/c/data-science-bowl-2018
Rank: 130/3634 (top 4%)
In this segmentation problem, participants had to identify cell nuclei. Two approaches were considered: U-Net models with pre-trained encoders (a TernausNet-like architecture) and Mask R-CNN models.
Toxic Comment Classification Challenge
https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge
Rank: 406/4551 (top 9%)
The aim of this competition was to classify Wikipedia comments according to their level of toxicity. The main issues were poorly balanced classes, some non-English comments, and mislabeled data. We implemented standard NLP models, trained LSTM neural networks, and used word embeddings. The final solution was a stack of the individual models.
Helical Modes in Low- and High-swirl Jets Measured by Tomographic PIV
Dmitriy M. Markovich, Vladimir M. Dulin, Sergey S. Abdurakipov, Leonid A. Kozinkin, Mikhail P. Tokarev, Kemal Hanjalić
This is a report on a parallel study on properties of large-scale vortical structures in low- and high-swirl turbulent jets by means of the time-resolved tomographic particle image velocimetry technique.
Methods for Chaotic Dynamics in Studies of Synchrony in Complex Natural Systems
A.N. Bondarenko, M.A. Bondarenko, T.V. Bugueva, L.A. Kozinkin
The wavelet-transform-modulus-maxima (WTMM) method was applied to study pairwise synchrony of irregular fluctuations in insect population size in several localities throughout the United Kingdom.
Education
Master's Degree in Computer Science
Yandex School of Data Analysis - Novosibirsk, Russia
Master's Degree in Mathematics
Novosibirsk State University - Novosibirsk, Russia
Skills
Libraries/APIs
Scikit-learn, Matplotlib, Pandas, PyTorch, OpenCV, Intel MKL, MPI, Natural Language Toolkit (NLTK), Bottle.py, TensorFlow, Keras, XGBoost, Beautiful Soup
Tools
PyCharm, Git, MATLAB, Microsoft Visual Studio, Scikit-image, Subversion (SVN)
Languages
Python, SQL, C++, Java, JavaScript
Platforms
Jupyter Notebook, Linux, Amazon Web Services (AWS), Windows
Paradigms
Parallel Programming, Concurrent Programming, Object-oriented Programming (OOP)
Frameworks
Flask, LightGBM, Django
Storage
Redshift, PostgreSQL, MySQL, SQLite
Other
Machine Learning, Data Science, Image Processing, Deep Learning, Neural Networks, Algorithms, Mathematics, Statistics, Computer Vision, Back-end, Natural Language Processing (NLP), Big Data, Generative Pre-trained Transformers (GPT)