Sergei Markochev, Developer in London, United Kingdom

Sergei Markochev

Verified Expert in Engineering

Data Scientist and Developer

London, United Kingdom
Toptal Member Since
May 24, 2021

Sergei is a lead data scientist with more than 15 years of experience in applied modeling and in developing highly complex enterprise products. He has also successfully managed data science teams and contributed as a senior data scientist and solutions architect. Sergei has published six academic papers and one international patent in his field and recently won a data science competition.






Preferred Environment

Jupyter Notebook, Windows, Linux, Git, Python, Amazon Web Services (AWS)

The most amazing...

...algorithm that I've developed was ranked #1 at an aircraft localization data science competition hosted by AIcrowd.

Work Experience

Data Science and Analytics Manager

2022 - PRESENT
  • Led data analytics projects with a CI&T international client.
  • Enabled A/B testing on the client side by fixing code logic and deep-diving into the tooling.
  • Investigated data quality and anomalies using SQL and Snowflake.
  • Applied Amazon Personalize as a custom recommendation system for the client.
Technologies: Amplitude, LaunchDarkly, mParticle, Jupyter Notebook, Snowflake, Data Analytics, A/B Testing, SQL, Jira, Management

Senior Data Scientist

2022 - 2022
  • Developed a system that uses natural language processing (NLP) techniques to extract, manipulate, and search helpful information from employees' resumes.
  • Led data investigation and prototype model development for the client (a construction company).
  • Presented advanced topics on deploying applications to AWS at an internal deep-dive session.
Technologies: Python, Python 3, Azure, Azure SQL, Jupyter Notebook, Machine Learning, Statistics, Dash, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Software, SpaCy

Machine Learning (ML) Engineer

2021 - 2022
Bowen & Associates Ltd.
  • Developed a state-of-the-art ML model to predict commercial property prices.
  • Deployed the ML model on AWS to test its predictions.
  • Advised the client on the advantages and limitations of the model, data quality, and deployment for testing.
Technologies: Machine Learning, Classification Algorithms, Regression Modeling

Lead Data Scientist

2018 - 2022
  • Productionized three apps related to the investigation and optimization of global TV ad schedules.
  • Developed a cross-media data fusion model with an external deduplication data set.
  • Predicted digital behavior for target audiences defined by TV show viewership and vice versa using ML techniques.
  • Created Looker dashboards to present POCs and data insights.
  • Developed deep learning models of reach curves for individual TV channels and other combinations.
  • Carried out a bespoke analysis for multibillion-dollar stakeholders.
  • Communicated results to stakeholders and product managers. Managed and hired data scientists.
Technologies: Machine Learning, Data Analysis, SQL, Data Cleaning, Cython, Software Development, Agile, R, Looker, Deep Learning, Nonlinear Optimization, Clustering, Git, Pandas, Keras, Artificial Intelligence (AI), Data Science, Quantitative Research, Amazon Web Services (AWS), Data Visualization, ETL, MySQL, Python, Data Analytics, Scikit-learn, Dashboard Development, Time Series, NumPy, SciPy, Jupyter Notebook, Statistics, Unsupervised Learning

Data Scientist (Python)

2021 - 2021
Applied AI LLC
  • Developed an ML model to classify the content of industry-specific PDF documents.
  • Investigated different approaches (ML, NLP, statistical) to modeling document content.
  • Advised the client on best practices and models during the project.
Technologies: Python, PDF Scraping, Data Science, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Data Analytics, Machine Learning, Deep Learning

Battery Analytics Scientist

2015 - 2018
  • Invented and deployed a patented state-of-the-art algorithm for remote capacity estimation of lead-acid batteries from their telemetry.
  • Produced insights on battery performance and customer usage patterns to reduce maintenance caused by battery failures.
  • Developed advanced alerting and anomaly detection systems to monitor the performance of over 100,000 solar panels (broken sensors, tampering, heavy usage, and so on).
  • Developed a Bayesian survival model to predict future battery failure rates.
Technologies: Digital Signal Processing, Data Analysis, Machine Learning, Nonlinear Optimization, Data Cleaning, Cython, Software Development, Agile, Linux, Monte Carlo Simulations, SQL, Bayesian Inference & Modeling, Clustering, Git, Pandas, Object-oriented Programming (OOP), Mathematics, Data Science, Amazon Web Services (AWS), Data Visualization, MySQL, Python, Data Analytics, Scikit-learn, Dashboard Development, Time Series, NumPy, SciPy, PostgreSQL, Jupyter Notebook, Unsupervised Learning


2009 - 2014
Moscow Institute of Physics and Technology
  • Supported and organized the educational process, conducted courses, and supervised bachelor's degree tracks.
  • Organized and ran the department’s section at the annual university conference.
  • Led laboratory courses and seminars on atomic physics and optics.
Technologies: University Teaching, LaTeX, Applied Physics

Senior Research Associate

2007 - 2014
Central Institute of Chemistry and Mechanics
  • Led the experimental research on rare nuclear decays (published in five academic papers and presented at four international conferences).
  • Developed a fully automated digital spectroscopic system for the investigation of rare nuclear decays (Ph.D. thesis).
  • Carried out data analyses and Monte Carlo simulations.
Technologies: Data Analysis, Monte Carlo Simulations, University Teaching, C++, Digital Signal Processing, Software Development, Data Cleaning, Applied Mathematics, Applied Physics, MATLAB, Object-oriented Programming (OOP), Mathematics, Statistics

Aircraft Localization Competition
In this competition, participants determine the aircraft positions based on time of arrival and signal strength measurements reported by many low-cost crowdsourced sensors. Only some receivers provide GPS-synchronized timestamps while others experience strong clock drifts or provide fully broken timestamps.

The competition was organized by the Swiss Cyber-Defence Campus of Armasuisse Science and Technology. The data was collected by the OpenSky Network, a large-scale ADS-B sensor network for research.
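
To illustrate the underlying idea of locating a transmitter from arrival times, here is a minimal sketch of time-difference-of-arrival multilateration. It is not the competition-winning algorithm: it assumes a flat 2D geometry, perfectly synchronized sensor clocks, and noise-free timestamps, and the function name `locate_tdoa` and its parameters are hypothetical. Differencing against the first sensor removes the unknown transmission time, and a few Gauss-Newton steps refine the position.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def locate_tdoa(sensors, arrival_times, guess, iterations=20):
    """Estimate a 2D transmitter position from time differences of arrival.

    sensors: list of (x, y) sensor positions in metres (clocks assumed in sync).
    arrival_times: reception time at each sensor, in seconds.
    guess: starting (x, y) for the Gauss-Newton iterations.
    """
    x, y = guess
    x0, y0 = sensors[0]
    t0 = arrival_times[0]
    for _ in range(iterations):
        d0 = math.hypot(x - x0, y - y0)
        # Accumulate the 2x2 normal equations (J^T J) delta = -J^T r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (sx, sy), t in zip(sensors[1:], arrival_times[1:]):
            di = math.hypot(x - sx, y - sy)
            # Residual: predicted minus measured range difference to sensor 0.
            r = (di - d0) - C * (t - t0)
            jx = (x - sx) / di - (x - x0) / d0
            jy = (y - sy) / di - (y - y0) / d0
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 -= jx * r
            b2 -= jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-18:
            break  # degenerate geometry; keep the current estimate
        # Solve the 2x2 system by Cramer's rule and take the step.
        x += (b1 * a22 - a12 * b2) / det
        y += (a11 * b2 - a12 * b1) / det
    return x, y
```

Handling sensors with drifting or broken clocks, as described above, would require extra steps such as estimating per-sensor clock offsets or discarding unreliable receivers, which this sketch leaves out.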


Prediction of Customer Spending
A data analysis of customer purchase history and a prediction of customers' future total spending using Bayesian modeling and Monte Carlo simulation.

• Prediction of customer spending.ipynb
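
As a rough illustration of how Bayesian modeling and Monte Carlo simulation can combine for this kind of forecast, the sketch below uses a conjugate Gamma-Poisson model for the monthly purchase rate and resamples observed purchase amounts. This is an assumed toy model, not the approach used in the notebook; the function names, prior values, and input format are all hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw from Poisson(lam) with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_future_spend(monthly_counts, amounts, horizon_months,
                          prior_shape=1.0, prior_rate=1.0,
                          draws=5000, seed=0):
    """Monte Carlo forecast of one customer's total spend over a horizon.

    monthly_counts: purchases observed in each past month.
    amounts: observed purchase amounts, resampled with replacement.
    The Gamma(prior_shape, prior_rate) prior on the monthly purchase rate
    is an illustrative assumption.
    """
    rng = random.Random(seed)
    # Conjugate update: Gamma prior + Poisson counts -> Gamma posterior.
    post_shape = prior_shape + sum(monthly_counts)
    post_rate = prior_rate + len(monthly_counts)
    totals = []
    for _ in range(draws):
        # gammavariate takes (shape, scale), so scale = 1 / rate.
        rate = rng.gammavariate(post_shape, 1.0 / post_rate)
        n_purchases = sample_poisson(rng, rate * horizon_months)
        totals.append(sum(rng.choice(amounts) for _ in range(n_purchases)))
    totals.sort()
    return {
        "mean": sum(totals) / draws,
        "p05": totals[int(0.05 * draws)],
        "p95": totals[int(0.95 * draws)],
    }
```

Calling `simulate_future_spend([2, 3, 1, 4, 2, 3], [20, 35, 50, 15, 40], horizon_months=3)` returns the simulated mean together with a 5th to 95th percentile range, which conveys the uncertainty of the forecast rather than a single point estimate.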

Expedia Hotel Sales | Kaggle Competition
A Kaggle indoor competition aimed at predicting hotel sales for the first ten days for a subset of new Expedia hotels (for which Expedia has no historical data).

I ranked #1 among 19 teams with a solution that combined several machine learning models.

Rail-ticket Price Prediction
Ticket prices change based on demand and time, and there can be a significant difference in price. In these two notebooks, I investigated the possibility of developing a pricing monitoring system for Spanish high-speed trains using data from Kaggle datasets.

• Rail_ticket_price_prediction_IDE.ipynb
• Rail_ticket_price_prediction_modelling.ipynb

Statoil Kaggle Competition
Drifting icebergs present threats to navigation and activities in areas such as the waters off the East Coast of Canada. In this competition, I was challenged to build an algorithm that automatically identifies whether a remotely sensed target is a ship or an iceberg.

• Statoil_Kaggle_competition_main.ipynb
• Statoil_Kaggle_competition_google_colab_notebook.ipynb
• Statoil_Kaggle_competition_DL_comparison.ipynb


SQL, Python, Octave, R, C++, Python 3, Snowflake


Pandas, Scikit-learn, NumPy, SciPy, Keras, PyMC, PySpark, SpaCy


Data Science, Quantitative Research, Agile, Object-oriented Programming (OOP), ETL, Management


Jupyter Notebook, Linux, Amazon Web Services (AWS), Azure


MySQL, PostgreSQL, Azure SQL


Applied Mathematics, Data Analysis, Digital Signal Processing, Machine Learning, Data Cleaning, Nonlinear Optimization, University Teaching, Software Development, Clustering, Applied Physics, Mathematics, Scientific Data Analysis, Data Analytics, Data Visualization, Artificial Intelligence (AI), Monte Carlo Simulations, Deep Learning, Cython, Bayesian Inference & Modeling, Time Series, Predictive Modeling, Dashboard Development, Computer Vision, Multithreading, Unsupervised Learning, Statistics, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), PDF Scraping, Dash, Software, Amplitude, mParticle, A/B Testing, Classification Algorithms, Regression Modeling


MATLAB, Looker, Git, LaTeX, LaunchDarkly, Jira



2008 - 2013

Ph.D. in Nuclear Physics

Moscow Institute of Physics and Technology - Moscow, Russia

2006 - 2008

Master's Degree in Applied Mathematics and Physics

Moscow Institute of Physics and Technology - Moscow, Russia

2002 - 2006

Bachelor's Degree in Applied Mathematics and Physics

Moscow Institute of Physics and Technology - Moscow, Russia


Probabilistic Graphical Models Specialization

Stanford University | via Coursera


Advanced Data Science with IBM Specialization

IBM | via Coursera