Sergei Markochev, Developer in London, United Kingdom

Sergei Markochev

Verified Expert in Engineering

Data Scientist and Developer

Location
London, United Kingdom
Toptal Member Since
May 24, 2021

Sergei is a lead data scientist with over 15 years of experience in applied modeling and in developing highly complex enterprise products. He has also successfully managed data science teams and contributed as a senior data scientist and solutions architect. Sergei has published six academic papers and one international patent in his field and recently won a data science competition.

Portfolio

CI&T
Amplitude, LaunchDarkly, mParticle, Jupyter Notebook, Snowflake, Data Analytics...
Tellusant
Python, Mathematics, Algorithms, Azure, PostgreSQL, ARIMA
Kainos
Python, Python 3, Azure, Azure SQL, Jupyter Notebook, Machine Learning...

Availability

Part-time

Preferred Environment

Jupyter Notebook, Windows, Linux, Git, Python, Amazon Web Services (AWS)

The most amazing...

...algorithm that I've developed was ranked #1 at an aircraft localization data science competition hosted by AIcrowd.

Work Experience

Data Science and Analytics Manager

2022 - PRESENT
CI&T
  • Led data analytics projects with a CI&T international client.
  • Enabled A/B testing on the client side by fixing code logic and investigating the testing tools in depth.
  • Investigated data quality and anomalies using SQL and Snowflake.
  • Applied Amazon Personalize as a custom recommendation system for the client.
Technologies: Amplitude, LaunchDarkly, mParticle, Jupyter Notebook, Snowflake, Data Analytics, A/B Testing, SQL, Jira, Management, AWS SAM, Recommendation Systems, AWS Step Functions, Revenue Management, Pricing, Amazon Web Services (AWS), Big Data, Data Scraping

Software Developer

2021 - 2023
Tellusant
  • Improved data quality and filled data gaps using machine learning and custom modeling.
  • Reviewed code and helped to build an MVP. Implemented time-series prediction.
  • Investigated opportunities to predict audiences for specific products through analysis of global data.
Technologies: Python, Mathematics, Algorithms, Azure, PostgreSQL, ARIMA

Senior Data Scientist

2022 - 2022
Kainos
  • Developed a system for extraction, manipulation, and search of helpful information from employees' resumes. Used natural language processing (NLP) techniques.
  • Led data investigation and prototype model development for the client (a construction company).
  • Presented advanced topics on application deployment on AWS at an internal deep-dive session.
Technologies: Python, Python 3, Azure, Azure SQL, Jupyter Notebook, Machine Learning, Statistics, Dash, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, Software, SpaCy, Amazon Web Services (AWS), Data Scraping

Machine Learning (ML) Engineer

2021 - 2022
Bowen & Associates Ltd.
  • Developed a state-of-the-art ML model to predict commercial property prices.
  • Deployed the ML model on AWS to test its predictions.
  • Advised the client on the advantages and limitations of the model, data quality, and deployment for testing.
Technologies: Machine Learning, Classification Algorithms, Regression Modeling

Lead Data Scientist

2018 - 2022
GroupM
  • Productionized three apps related to the investigation and optimization of global TV ad schedules.
  • Developed a cross-media data fusion model with an external deduplication data set.
  • Predicted digital behavior for target audiences defined by TV show viewership and vice versa using ML techniques.
  • Created Looker dashboards to present POCs and data insights.
  • Developed deep learning models of reach curves for individual TV channels and other combinations.
  • Carried out a bespoke analysis for multibillion-dollar stakeholders.
  • Communicated results to stakeholders and product managers. Managed and hired data scientists.
Technologies: Machine Learning, Data Analysis, SQL, Data Cleaning, Cython, Software Development, Agile, R, Looker, Deep Learning, Nonlinear Optimization, Clustering, Git, Pandas, Keras, Artificial Intelligence (AI), Data Science, Quantitative Research, Amazon Web Services (AWS), Data Visualization, ETL, MySQL, Python, Data Analytics, Scikit-learn, Dashboard Development, Time Series, NumPy, SciPy, Jupyter Notebook, Statistics, Unsupervised Learning, Big Data

Data Scientist (Python)

2021 - 2021
Applied AI LLC
  • Developed an ML model to classify the content of industry-specific PDF documents.
  • Investigated different approaches (ML, NLP, statistical) to modeling document content.
  • Advised the client on best practices and model choices during the project.
Technologies: Python, PDF Scraping, Data Science, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Data Analytics, Machine Learning, Deep Learning

Battery Analytics Scientist

2015 - 2018
BBOXX LTD
  • Invented and deployed a patented state-of-the-art algorithm for remote capacity estimation of lead-acid batteries from their telemetry.
  • Produced insights on battery performance and customer usage patterns to reduce maintenance caused by battery failures.
  • Developed advanced alerting and anomaly detection systems to monitor the performance of over 100,000 solar panels (broken sensors, tampering, heavy usage, and so on).
  • Developed a Bayesian survival model to predict future battery failure rates.
Technologies: Digital Signal Processing, Data Analysis, Machine Learning, Nonlinear Optimization, Data Cleaning, Cython, Software Development, Agile, Linux, Monte Carlo Simulations, SQL, Bayesian Inference & Modeling, Clustering, Git, Pandas, Object-oriented Programming (OOP), Mathematics, Data Science, Amazon Web Services (AWS), Data Visualization, MySQL, Python, Data Analytics, Scikit-learn, Dashboard Development, Time Series, NumPy, SciPy, PostgreSQL, Jupyter Notebook, Unsupervised Learning, Big Data

Assistant

2009 - 2014
Moscow Institute of Physics and Technology
  • Supported and organized the educational process, conducted courses, and supervised bachelor's degree tracks.
  • Organized and ran the department’s section at the annual university conference.
  • Led laboratory courses and seminars on atomic physics and optics.
Technologies: University Teaching, LaTeX, Applied Physics

Senior Research Associate

2007 - 2014
Central Institute of Chemistry and Mechanics
  • Led experimental research on rare nuclear decays (published in five academic papers and presented at four international conferences).
  • Developed a fully automated digital spectroscopic system for the investigation of rare nuclear decays (Ph.D. thesis).
  • Carried out data analyses and Monte Carlo simulations.
Technologies: Data Analysis, Monte Carlo Simulations, University Teaching, C++, Digital Signal Processing, Software Development, Data Cleaning, Applied Mathematics, Applied Physics, MATLAB, Object-oriented Programming (OOP), Mathematics, Statistics

Aircraft Localization Competition

https://github.com/smarkochev/Aircraft_localization_competition_round_2
In this competition, participants determine the aircraft positions based on time of arrival and signal strength measurements reported by many low-cost crowdsourced sensors. Only some receivers provide GPS-synchronized timestamps while others experience strong clock drifts or provide fully broken timestamps.

The competition was organized by the Swiss Cyber-Defence Campus of Armasuisse Science and Technology. The data was collected by the OpenSky Network, a large-scale ADS-B sensor network for research.

• https://www.aicrowd.com/challenges/cyd-campus-aircraft-localization-competition/leaderboards

Prediction of Customer Spending

https://github.com/smarkochev/ds_notebooks/
An analysis of customer purchase history and a prediction of customers' total future spending using Bayesian modeling and Monte Carlo simulation.

Notebook:
• Prediction of customer spending.ipynb

Expedia Hotel Sales | Kaggle Competition

https://www.kaggle.com/c/hotelsales/
A Kaggle indoor competition aimed at predicting hotel sales for the first ten days for a subset of new Expedia hotels (for which Expedia has no historic data).

I ranked #1 among 19 teams with a solution that combined several machine learning models.

Rail-ticket Price Prediction

https://github.com/smarkochev/ds_notebooks
Ticket prices change based on demand and time, and there can be a significant difference in price. In these two notebooks, I investigated the possibility of developing a pricing monitoring system for Spanish high-speed trains using data from Kaggle datasets.

Notebooks:
• Rail_ticket_price_prediction_IDE.ipynb
• Rail_ticket_price_prediction_modelling.ipynb

Statoil Kaggle Competition

https://github.com/smarkochev/ds_notebooks
Drifting icebergs present threats to navigation and activities in areas such as off the East Coast of Canada. In this competition, I was challenged to build an algorithm that automatically identifies whether a remotely sensed target is a ship or an iceberg.

Notebooks:
• Statoil_Kaggle_competition_main.ipynb
• Statoil_Kaggle_competition_google_colab_notebook.ipynb
• Statoil_Kaggle_competition_DL_comparison.ipynb

Languages

SQL, Python, Octave, R, C++, Python 3, Snowflake

Libraries/APIs

Pandas, Scikit-learn, NumPy, SciPy, Keras, PyMC, PySpark, SpaCy

Paradigms

Data Science, Quantitative Research, Agile, Object-oriented Programming (OOP), ETL, Management

Platforms

Jupyter Notebook, Amazon Web Services (AWS), Linux, Azure

Storage

MySQL, PostgreSQL, Azure SQL

Other

Applied Mathematics, Data Analysis, Digital Signal Processing, Machine Learning, Data Cleaning, Nonlinear Optimization, University Teaching, Software Development, Clustering, Applied Physics, Mathematics, Scientific Data Analysis, Data Analytics, Data Visualization, Artificial Intelligence (AI), Big Data, Monte Carlo Simulations, Deep Learning, Cython, Bayesian Inference & Modeling, Time Series, Predictive Modeling, Dashboard Development, Computer Vision, Multithreading, Unsupervised Learning, Statistics, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Recommendation Systems, Data Scraping, PDF Scraping, Dash, Software, Amplitude, mParticle, A/B Testing, Classification Algorithms, Regression Modeling, Algorithms, ARIMA, AWS SAM, Revenue Management, Pricing

Tools

MATLAB, Looker, Git, LaTeX, AWS Step Functions, LaunchDarkly, Jira

Frameworks

Spark

Education

2008 - 2013

Ph.D. in Nuclear Physics

Moscow Institute of Physics and Technology - Moscow, Russia

2006 - 2008

Master's Degree in Applied Mathematics and Physics

Moscow Institute of Physics and Technology - Moscow, Russia

2002 - 2006

Bachelor's Degree in Applied Mathematics and Physics

Moscow Institute of Physics and Technology - Moscow, Russia

Certifications

OCTOBER 2019 - PRESENT

Probabilistic Graphical Models Specialization

Stanford University | via Coursera

JULY 2019 - PRESENT

Advanced Data Science with IBM Specialization

IBM | via Coursera
