Yaroslav Kopotilov, Developer in Tbilisi, Georgia

Yaroslav Kopotilov

Data Scientist and Developer

Location
Tbilisi, Georgia
Toptal Member Since
April 9, 2020

Yaroslav is a full-stack data scientist with experience in business analysis, predictive modeling, data visualization, data orchestration, and deployment. He draws on a wide range of machine learning methods, statistics, and business insights to find the right solution for each problem. Above all, he aims to deliver projects that are truly useful to his clients.

Portfolio

Self-employed
Python, SQL, PostgreSQL, Cloud Services, Docker, Ansible, Git, GitHub...
TickUp AB
Algorithms, Python, Statistics, Trading, Financial Markets, Data Mining...
Vitol
ActiveBatch, Kibana, Amazon Athena, Amazon S3 (AWS S3), Git, Oracle SQL, Python...

Experience

  • Python - 7 years
  • Time Series Analysis - 5 years
  • Statistics - 5 years
  • Machine Learning - 5 years
  • Data Engineering - 4 years
  • SQL - 4 years
  • Data Visualization - 3 years
  • Stakeholder Engagement - 3 years

Availability

Part-time

Preferred Environment

Git, Jupyter, PyCharm, MacOS, Linux, Visual Studio Code (VS Code)

The most amazing...

...thing I've developed is an algorithmic trading strategy powered by multiple data pipelines and one ML model running 24/7.

Work Experience

2021 - PRESENT

Algorithmic Trading — Principal Researcher and Developer

Self-employed
  • Onboarded dozens of data sources from files, REST APIs, and messaging protocols to a PostgreSQL database. Configured data transformations in the database to create and update features in real time.
  • Configured monitoring and alerting systems for data ingestion using Python and Grafana.
  • Analyzed the price and industry data to generate a signal for a high-frequency algorithmic trading strategy.
  • Optimized the strategy to maximize P&L while keeping the default risk minimal. Analyzed the L2 price data to estimate the market impact.
  • Managed a team of three developers and handled the overall project management.
Technologies: Python, SQL, PostgreSQL, Cloud Services, Docker, Ansible, Git, GitHub, Machine Learning, Data Science, Time Series Analysis, Remote Team Leadership, Technical Hiring, STOMP, Jupyter Notebook, Code Review, IT Project Management, Team Leadership, Quantitative Risk Analysis, Grafana, Data Engineering, Data Analysis, Financial Data, Regression, Statistical Analysis, Algorithms, APIs, Forecasting, Data Analytics
2020 - 2021

Developer and Analyst for a Quantitative Research Project

TickUp AB
  • Analyzed and unified multiple datasets for US equity markets.
  • Developed an ML model and several data pipelines of an algorithmic trading strategy.
  • Wrote and reviewed both research notebooks and production code.
  • Organized a seven-day company meetup, which helped boost team productivity and collaboration.
Technologies: Algorithms, Python, Statistics, Trading, Financial Markets, Data Mining, Algorithmic Trading, Time Series Analysis, Equity Market Data, Docker, Jupyter Notebook, Data Visualization, Financial Data, Code Review, SQL, Git, GitHub, Data Analysis, Regression, Statistical Analysis, Data Science, Forecasting, Data Analytics
2019 - 2020

Energy Trading — Data Scientist

Vitol
  • Created market analysis tools and systematic strategies for coal, power, and crude desks. Covered all phases of a data science project, including project setup, data pipelines, modeling, and deployment.
  • Analyzed the firm-wide trading market impact under different execution styles.
  • Worked with both small (50 data points) and large (several terabytes) datasets.
  • Contributed individually and in collaboration with the data science and IT teams.
  • Helped train Vitol's employees in Python and machine learning.
Technologies: ActiveBatch, Kibana, Amazon Athena, Amazon S3 (AWS S3), Git, Oracle SQL, Python, Time Series Analysis, Machine Learning, Data Science, Software Development, Data Engineering, Jupyter Notebook, Pandas, Algorithmic Trading, Data Visualization, Bitbucket, Dashboards, Amazon Web Services (AWS), Dash, Web Dashboards, Big Data, Data Analysis, Financial Data, Regression, Statistical Analysis, Forecasting, Data Analytics
2017 - 2018

Model Validation, Commodities — Associate

JPMorgan
  • Implemented a custom version of the extended Kalman filter from scratch to calibrate exotic option pricing models; it outperformed the existing calibration methods.
  • Reviewed ten option pricing models and their implementations in commodities and credit.
  • Measured and mitigated numerous model risks in collaboration with the desk and developers.
  • Mentored junior employees during their review work.
Technologies: Python, Derivative Pricing, Stochastic Modeling, Time Series Analysis, Machine Learning, Quantitative Analysis, Quantitative Modeling, Quantitative Finance, Quantitative Risk Analysis, Data Analysis, Financial Data, Forecasting, Data Analytics
2016 - 2016

Algorithmic Trading — Intern

Credit Suisse
  • Designed and implemented two mid-frequency trading strategies for the commodity desk.
  • Analyzed portfolio hedging strategies using risk factors for the equity desk.
  • Implemented a data pipeline that cleaned and transformed tabular data for the equity desk.
Technologies: MATLAB, R, SQL, Python, Machine Learning, Time Series Analysis, Data Analysis, Financial Data, Regression, Statistical Analysis, Data Science, Forecasting, Data Analytics
2015 - 2015

Research—Intern

Novosibirsk State University
  • Wrote a research paper describing a metric that uses Fourier descriptors to compare shapes with internal gaps.
  • Implemented a classification algorithm that achieved 98% accuracy on a dataset with 19 classes of images.
  • Presented the results at the scientific conference MNSK 2015, Novosibirsk.
Technologies: OpenCV, Python, Computer Vision, Mathematics, Machine Learning, Jupyter Notebook, Data Analysis

Experience

Interactive Website

https://datascienceforhire.net/
This is a simple personal website powered by Flask and Dash. It runs in a Docker container and has monitoring systems that track web activity and errors. While I don't specialize in web development, the ability to create a simple web interface to visualize data or machine learning model predictions can be very handy.

Yet another XML Parser

https://github.com/mysterious-ben/xmlrecords
This is a simple yet efficient Python package to parse XML. The package is written specifically for the fast extraction of tabular data (unlike xmltodict, which handles XML of any structure but is slower). XML is not the most data science-friendly format, so the ability to transform it to Pandas or SQL can be very handy.
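To illustrate the general idea of flattening repeated XML records into table rows, here is a minimal sketch that uses only Python's standard library. It does not use the xmlrecords API; the function name `xml_to_rows`, the sample document, and its tag names are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: repeated <trade> records under one root element.
SAMPLE_XML = """
<trades>
  <trade id="1"><symbol>AAA</symbol><price>10.5</price></trade>
  <trade id="2"><symbol>BBB</symbol><price>20.0</price></trade>
</trades>
"""


def xml_to_rows(xml_text, record_tag):
    """Flatten every <record_tag> element into a dict of its attributes and child texts."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for record in root.iter(record_tag):
        row = dict(record.attrib)       # attributes become columns
        for child in record:            # direct children become columns
            row[child.tag] = child.text
        rows.append(row)
    return rows


rows = xml_to_rows(SAMPLE_XML, "trade")
print(rows)
```

A list of dictionaries like this can be passed to `pandas.DataFrame(rows)` or inserted into a SQL table. All values are still strings at this point, so numeric columns would need an explicit type conversion.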

Top 1 in Time Series Forecast Competition on Kaggle

https://www.kaggle.com/myster/eda-prophet-winning-solution-3-0
In 2018, before I started to work as a data scientist, I was studying textbooks on machine learning and testing the newly learned methods in various mini-projects. That's when I found this competition about predicting store sales on Kaggle. Time series is one of my favorite subjects, so I jumped in.
It was a lot of fun to explore and visualize the dataset and to find interesting quirks in it. In particular, it soon became clear that the data had been synthetically generated, which gave an important clue on how to solve the problem. And it was very exciting that, in the end, my analysis paid off and I took first place!
Also, I worked on this project with a former colleague, so it was a good collaborative experience with just a touch of project management. Of course, it was far from the complexity of managing a real data science project. Still, it gave me at least some sense of what might be waiting ahead.

Data Pipelining Tools

https://github.com/mysterious-ben/apipe
An open-source Python package, built on Dask, for creating data pipelines. It features:
• Lazy computation and cache loading
• Pickle and parquet serialization
• Support for hashing of NumPy arrays and pandas DataFrames
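The combination of lazy computation and cache loading can be sketched roughly as follows. This is not the apipe API and it does not use Dask; it is a standard-library illustration that assumes pickle serialization only, and the names `CachedStep` and `slow_square` are hypothetical.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


class CachedStep:
    """A lazy pipeline step: nothing runs until compute() is called."""

    def __init__(self, func, *args, cache_dir):
        self.func = func
        self.args = args
        self.cache_dir = Path(cache_dir)

    def _cache_path(self):
        # Key the cache on the function name and a hash of the pickled arguments.
        payload = pickle.dumps((self.func.__name__, self.args))
        digest = hashlib.sha256(payload).hexdigest()
        return self.cache_dir / f"{digest}.pkl"

    def compute(self):
        path = self._cache_path()
        if path.exists():                   # cache hit: load instead of recomputing
            return pickle.loads(path.read_bytes())
        result = self.func(*self.args)      # cache miss: run the function and store
        path.write_bytes(pickle.dumps(result))
        return result


calls = []


def slow_square(x):
    calls.append(x)     # record real executions so cache hits are visible
    return x * x


with tempfile.TemporaryDirectory() as tmp:
    step = CachedStep(slow_square, 12, cache_dir=tmp)
    first = step.compute()      # runs slow_square
    second = step.compute()     # served from the pickle cache
    print(first, second, len(calls))
```

Hashing the inputs is what makes the cache safe to reuse: a step is recomputed only when its arguments change. Arrays and DataFrames need a dedicated, stable hashing routine, which is the kind of support the feature list above refers to.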
Publication

Embeddings in Machine Learning: Making Complex Data Simple

https://www.toptal.com/machine-learning/embeddings-in-machine-learning

Skills

Languages

Python, SQL, R, C++, Java, HTML, CSS, XML

Libraries/APIs

Scikit-learn, Pandas, Matplotlib, OpenCV, REST APIs, SQLAlchemy, SciPy, Python Asyncio, Dask, PyTorch, TensorFlow

Tools

Jupyter, Git, StatsModels, PyCharm, Amazon Athena, ActiveBatch, MATLAB, Kibana, Plotly, Boto 3, Ansible, GitHub, Bitbucket, Grafana

Paradigms

Data Science, Object-oriented Programming (OOP), Agile Software Development, Functional Analysis, STOMP

Other

Predictive Modeling, Forecasting, Data Analysis, Predictive Analytics, Statistics, Machine Learning, Supervised Learning, Regression, Data Analytics, Time Series, Artificial Intelligence (AI), Time Series Analysis, Mathematics, Data Visualization, Stakeholder Engagement, Data Engineering, Option Pricing, Unsupervised Learning, Finance, Financial Data, Quantitative Analysis, Quantitative Risk Analysis, Statistical Analysis, Web Dashboards, Machine Learning Operations (MLOps), Code Deployment, Algorithms, Futures & Options, Energy, Systematic Trading, Deep Learning, Probability Theory, Mathematical Analysis, Applied Mathematics, Derivative Pricing, Chemistry, Stochastic Modeling, Stochastic Differential Equations, Econometrics, Economics, Computer Vision, Software Development, Genetic Algorithms, Dash, Trading, Financial Markets, Data Mining, Algorithmic Trading, Equity Market Data, Cloud Services, Remote Team Leadership, Technical Hiring, Code Review, IT Project Management, Team Leadership, Dashboards, Quantitative Modeling, Quantitative Finance, Big Data, APIs

Frameworks

LightGBM, Spark, Flask

Platforms

Jupyter Notebook, Docker, Linux, MacOS, Amazon Web Services (AWS), Visual Studio Code (VS Code)

Storage

Oracle SQL, Amazon S3 (AWS S3), SQLite, PostgreSQL

Industry Expertise

Project Management

Education

2015 - 2016

Master's Degree in Financial Mathematics

Université Pierre et Marie Curie - Paris, France

2012 - 2016

Master's Degree in Applied Mathematics

École Polytechnique - Paris, France

2012 - 2015

Master's Degree in Mathematics and Computer Science

Novosibirsk State University - Novosibirsk, Russia

2008 - 2012

Bachelor's Degree in Probability and Statistics

Novosibirsk State University - Novosibirsk, Russia