Cyril Shcherbin, Developer in Prague, Czech Republic

Cyril Shcherbin

Verified Expert in Engineering

Data Engineering Developer

Prague, Czech Republic
Toptal Member Since
February 12, 2015

Cyril is a seasoned software engineer with a passion for cloud and distributed computing, data engineering, and natural language processing. He has extensive experience developing solutions in the information security and financial technology domains. Since 2018, Cyril has been a member of a research team that applies machine learning techniques to network and endpoint security challenges.


Cisco Systems
Dask, Python, Java, Machine Learning, Big Data, Amazon Web Services (AWS)...
SNAFU Records (via Toptal)
Apache Airflow, Apache Spark, Amazon Web Services (AWS), Python...
Private Client (via Toptal)
Amazon Web Services (AWS), Django REST Framework, Celery, Django, React...




Preferred Environment

Amazon Web Services (AWS), IntelliJ IDEA, macOS

The most amazing...

...thing I've designed is a cloud-based data lake and an auto-scaling system for distributed training and tuning of machine learning models.

Work Experience

Research Engineer

2018 - PRESENT
Cisco Systems
  • Designed and developed a data lake and an auto-scaling system for distributed training and tuning of machine learning models.
  • Collaborated on a decision tree model for the static analysis layer of an endpoint security application.
  • Improved batch job scheduling and optimized cloud computing costs.
Technologies: Dask, Python, Java, Machine Learning, Big Data, Amazon Web Services (AWS), Kubernetes, Apache Spark, Apache Airflow, Scala, Apache Flink, SQL, Jira

ML/Data Platform Architect

2020 - 2022
SNAFU Records (via Toptal)
  • Architected the company’s data ingestion, storage, and processing platform.
  • Designed and developed the company’s artist-producer matchmaking services.
  • Collaborated on due diligence and investment models.
Technologies: Apache Airflow, Apache Spark, Amazon Web Services (AWS), Python, Machine Learning, Scala, Databricks, Data Engineering, Data Lake Design, Terraform, Datadog

Contract Senior Software Engineer

2017 - 2019
Private Client (via Toptal)
  • Designed and developed the core UI components, back-end services, API gateway, and cloud infrastructure.
  • Implemented the system integrations with third-party cloud image storage (Dropbox and Google Drive).
  • Developed and implemented the system's synchronization, deduplication, and reverse image search services.
  • Designed the subscription model and integrated the system with a payment gateway.
Technologies: Amazon Web Services (AWS), Django REST Framework, Celery, Django, React, JavaScript, Heroku, Python, SQL

Senior Software Engineer

2017 - 2018
Cisco Systems
  • Implemented a GraphQL interface on the back end for the system's new UI.
  • Designed and implemented the project's secret management, monitoring, and centralized logging solutions.
  • Facilitated the project's effort to migrate to AWS.
  • Provisioned and maintained a Kubernetes cluster for testing, staging, and production environments.
Technologies: Amazon Web Services (AWS), Python, Scala, Kubernetes, Terraform, React, JavaScript, PostgreSQL, Java, SQL, Jira

Software Engineer

2015 - 2017
Deutsche Börse Group
  • Created a framework for QA engineers to perform integration and regression testing of the company's financial clearing and security services.
  • Analyzed, estimated, and fulfilled functional and nonfunctional requirements for complex financial market features.
  • Maintained the company's financial clearing and security services. Duties included, among others, analysis, bug fixing, refactoring, resolving performance issues, and internal third-line (L3) on-call support for financial market operations engineers.
Technologies: Jenkins, Python, PostgreSQL, AMQP, Hibernate, SQL, Jira

Contract Software Engineer

2015 - 2015
Restoration Media (via Toptal)
  • Designed, developed, and maintained a cloud-based, highly scalable, and efficient ETL system to handle reporting event data from the company's targeted email marketing campaigns.
  • Designed and developed various webhooks, parsers, and clients for third-party APIs.
  • Facilitated the company's effort to create a single data warehousing solution.
  • Designed and developed data aggregation tools and dashboards for the company's data analysts and email marketing operations managers.
  • Performed numerous migrations from the company's obsolete databases with ad hoc schema updates and modifications.
Technologies: Amazon Web Services (AWS), ETL, JavaScript, Redis, MySQL, Celery, Django REST Framework, Django, Python, SQL

Lead Contract Software Engineer

2013 - 2015
Cisco Systems (via SoftServe)
  • Maintained a database and file system storage for a corpus of HTTP transactions.
  • Developed a continuous deployment and monitoring strategy for the system and supervised its implementation process.
  • Devised and implemented an object model for manipulating and validating various versions of configuration files.
  • Designed and deployed a multiprocessing system for validating HTTP-capturing signatures and testing their performance on multiple engines in all possible configurations.
Technologies: Django REST Framework, Linux, Jenkins, MySQL, Django, Python, SQL, Jira, Amazon Web Services (AWS)

Cisco Cognitive Threat Analytics
I am currently a member of an R&D team responsible for researching cutting-edge ML technologies applied to the network security problem domain. The team also collects and documents global threat intelligence and maintains a cloud-based SIEM and UEBA solution that processes hundreds of terabytes of data daily.

Digital Artists’ Copyright Infringement Protection Platform (via Toptal)

I was hired by a Toptal client to assist in the design and development of a SaaS application aimed at protecting digital artists against copyright infringement. My main responsibilities were designing and implementing the cloud infrastructure, back-end services, and pipelines that synchronize photos and videos from cloud storage providers and scan the internet for their occurrences, allowing the artists to identify infringers and take legal action. I also integrated the system with a payment gateway and designed the subscription models, REST API, and core UI components.

ETL Pipeline for Email Marketing Campaigns (via Toptal)

A Toptal client approached me with a simple data migration problem. However, it turned out that the client was considering a data warehousing solution and an analytics platform. I ended up building a cloud-based, scalable ETL system for processing reporting event streams from email marketing. My responsibilities later pivoted to developing several aggregation jobs and dashboards for marketing campaign managers. For quite some time, it was the most amazing thing I had built single-handedly.

Eurex Clearing’s C7

While working for the world’s third-largest derivatives exchange, I joined a boring (at first sight) enterprise effort to modernize its old and rusty monolithic mainframe system into a service-oriented architecture. I experienced firsthand all the joys of taking apart an archaic codebase and building microservices while retaining the German quality of a banking application with no tolerance for the slightest mistake. I learned a great deal about finance, quality assurance, and on-call support.

Cisco Application Visibility and Control (AVC)
I led a team of software developers and QA engineers maintaining the services that support Cisco's AVC engine. The services included a network traffic data warehousing solution and tools for compiling and validating the engine's signatures, testing their efficacy, and publishing the updated signature bundle to thousands of devices with the AVC engine installed. The project spanned three cities and two time zones.

Education

2023 - 2024

Master's Degree in Applied Data Science

University of Michigan - Ann Arbor, MI

2008 - 2013

Master's Degree in Applied Linguistics

Lviv Polytechnic National University - Lviv, Ukraine


Machine Learning Specialization



Statistics with Python Specialization

University of Michigan


Machine Learning Engineering for Production (MLOps) Specialization



Pandas, Scikit-learn, Dask, PySpark, React, AMQP, TensorFlow


Terraform, Apache Airflow, Celery, Jira, Helm, Splunk, Jenkins


Django, Flask, Spring Boot, Apache Spark, Akka, Play Framework, Hibernate, Django REST Framework


Object-oriented Programming (OOP), DevOps, ETL, Functional Programming, Data Science


Python, Java, Scala, SQL, HTML, JavaScript, Bash, Go, CSS


Amazon Web Services (AWS), Kubernetes, Docker, JBoss, Linux, Heroku, Apache Flink, Databricks


Elasticsearch, Redis, PostgreSQL, MySQL, MongoDB, InfluxDB, Data Pipelines, Data Lake Design, Datadog


Cloud Computing, Data Engineering, Big Data, Natural Language Processing (NLP), Fintech, Information Security, Generative Pre-trained Transformers (GPT), Machine Learning, Finance, FTP, Pipelines, Information Visualization, Statistics, Exploratory Data Analysis, Deep Learning, Machine Learning Algorithms, Linear Regression, Data Visualization, Statistical Data Analysis, Statistical Modeling
