Yaroslav Kopotilov, Data Scientist and Developer in Moscow, Russia
Member since February 16, 2020
Yaroslav is a full-stack data scientist with experience in business analysis, predictive modeling, data visualization, data orchestration, and deployment. He leverages a wide range of machine learning methods, statistics, and business insights to find just the right solution for a problem. Above all, he aims to deliver projects that are truly useful to his clients.
Location

Moscow, Russia

Availability

Part-time

Preferred Environment

Git, Jupyter, PyCharm, MacOS, Linux

The most amazing...

...thing I've developed is an algorithmic trading strategy powered by multiple data pipelines and one ML model running 24/7.

Employment

  • Data Scientist

    2019 - 2020
    Vitol
    • Created market analysis tools and systematic strategies for coal, power, and crude desks. Covered all phases of a data science project, including project setup, data pipelines, modeling, and deployment.
    • Worked with tabular datasets ranging from small (50 samples) to large (several terabytes).
    • Contributed individually and in collaboration with the data science and IT team.
    • Helped train Vitol’s employees in Python and machine learning.
    Technologies: ActiveBatch, Kibana, AWS Athena, AWS S3, Git, Oracle SQL, Python
  • Model Validation, Commodities — Associate

    2017 - 2018
    JPMorgan
    • Implemented a custom version of the extended Kalman filter from scratch to calibrate exotic option pricing models; it outperformed the existing calibration methods.
    • Reviewed ten option pricing models and their implementations in commodities and credit.
    • Measured and mitigated numerous model risks in collaboration with the desk and developers.
    • Mentored junior employees during their review work.
    Technologies: Python
  • Algorithmic Trading — Intern

    2016 - 2016
    Credit Suisse
    • Designed and implemented two mid-frequency trading strategies for the commodity desk.
    • Analyzed portfolio hedging strategies using risk factors for the equity desk.
    • Implemented a data pipeline that cleaned and transformed tabular data for the equity desk.
    Technologies: MATLAB, R, SQL, Python
  • Research — Intern

    2015 - 2015
    Novosibirsk State University
    • Wrote a research paper describing a special metric for images with multiple shapes using Fourier descriptors.
    • Implemented a classification algorithm that achieved 98% accuracy on a dataset with 19 classes of images.
    • Presented the results at the scientific conference MNSK 2015, Novosibirsk.
    Technologies: OpenCV, Python

Experience

  • Interactive Website
    https://datascienceforhire.net/

    This is a simple personal website powered by Flask and Dash. It runs in a Docker container and has monitoring systems that track web activity and errors. Although I don't specialize in web development, the ability to create a simple web interface to visualize data or a machine learning model's predictions can be very handy.

  • Yet another XML parser
    https://github.com/mysterious-ben/xmlrecords

    This is a simple yet efficient Python package for parsing XML. It is written specifically for fast extraction of tabular data (unlike xmltodict, which handles XML of any structure but is slower). XML is not the most data-science-friendly format, so the ability to transform it into Pandas or SQL can be very handy.

  • Top 1 in Time Series Forecast Competition on Kaggle
    https://www.kaggle.com/myster/eda-prophet-winning-solution-3-0

    In 2018, before I started to work as a data scientist, I was studying textbooks on machine learning and testing the newly learned methods in various mini-projects. That's when I found this competition about predicting store sales on Kaggle. Time series is one of my favorite subjects, so I jumped in.
    It was great fun to explore and visualize the dataset and to find interesting quirks in it. In particular, it soon became clear that the data had been synthetically generated, which gave an important clue about how to solve the problem. It was very exciting when, in the end, my analysis paid off and I took first place!
    I also worked on this project with a former colleague, so it was a good collaborative experience with just a touch of project management. Of course, it was far from the complexity of managing a real data science project. Still, it gave me at least some sense of what might be waiting ahead.

  • Embeddings in Machine Learning: Making Complex Data Simple (Publication)
    Working with non-numerical data can be challenging, even for seasoned data scientists. To make good use of such data, it needs to be transformed. But how? In this article, Toptal Data Scientist Yaroslav Kopotilov will introduce you to embeddings and demonstrate how they can be used to visualize complex data and make it usable.
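
To illustrate the kind of tabular extraction that the XML parser project above targets, here is a minimal sketch that flattens repeated XML elements into a list of row dictionaries. It uses only Python's standard `xml.etree.ElementTree` module and does not reproduce the xmlrecords API; the function name `xml_to_records`, the `row_tag` parameter, and the sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def xml_to_records(xml_text, row_tag):
    """Flatten every <row_tag> element into one dict (hypothetical helper).

    Each record holds the element's attributes plus the text of its
    direct children, keyed by the child tag name.
    """
    root = ET.fromstring(xml_text)
    records = []
    for row in root.iter(row_tag):
        record = dict(row.attrib)
        for child in row:
            record[child.tag] = (child.text or "").strip()
        records.append(record)
    return records


# Made-up sample data, used only to show the shape of the output.
SAMPLE_XML = """
<sales>
  <sale id="1"><store>A</store><amount>10.5</amount></sale>
  <sale id="2"><store>B</store><amount>7.0</amount></sale>
</sales>
""".strip()

records = xml_to_records(SAMPLE_XML, "sale")
for record in records:
    print(record)
```

A list of dictionaries like this can be passed directly to `pandas.DataFrame(records)` or inserted row by row into a SQL table. All values stay strings here, so numeric columns would still need converting.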

Skills

  • Languages

    Python, SQL, R, C++, Java
  • Libraries/APIs

    Scikit-learn, Pandas, Matplotlib, OpenCV, REST APIs, SQLAlchemy, SciPy, Python Asyncio, Dask
  • Paradigms

    Data Science, Object-oriented Programming (OOP), Agile Software Development
  • Other

    Predictive Modeling, Forecasting, Artificial Intelligence (AI), Data Analysis, Predictive Analytics, Statistics, Machine Learning, Supervised Learning, Time Series, Time Series Analysis, Mathematics, Data Visualization, Stakeholder Engagement, Data Engineering, Option Pricing, Unsupervised Learning, Finance, Web Dashboards, Machine Learning Operations (MLOps), Code Deployment, Algorithms, Futures & Options, Energy, Systematic Trading, Deep Learning
  • Frameworks

    LightGBM, Spark, Flask
  • Tools

    StatsModels, PyCharm, Git, Jupyter, AWS Athena, ActiveBatch, MATLAB, Kibana, Plotly, Boto 3
  • Platforms

    Jupyter Notebook, Docker, Linux, MacOS, Amazon Web Services (AWS)
  • Storage

    Oracle SQL, AWS S3, SQLite
  • Industry Expertise

    Project Management

Education

  • Master's degree in Financial Mathematics
    2015 - 2016
    Université Pierre et Marie Curie - Paris, France
  • Master's degree in Applied Mathematics
    2012 - 2016
    École Polytechnique - Paris, France
  • Master's degree in Mathematics and Computer Science
    2012 - 2015
    Novosibirsk State University - Novosibirsk, Russia
  • Bachelor's degree in Probability and Statistics
    2008 - 2012
    Novosibirsk State University - Novosibirsk, Russia
