Cyril Shcherbin, Data Engineering Developer in Prague, Czech Republic
Member since February 12, 2015
Cyril is a seasoned software engineer with a passion for cloud and distributed computing, data engineering, and natural language processing. He has extensive experience developing solutions in the information security and financial technology domains. Since 2018, Cyril has been a member of a research team experimenting with machine learning techniques for network and endpoint security challenges.

Preferred Environment

Amazon Web Services (AWS), IntelliJ IDEA, macOS

The most amazing...

...thing I've designed is a cloud-based data lake and an auto-scaling system for distributed training and tuning of machine learning models.


Employment

  • Research Engineer

    2018 - PRESENT
    Cisco Systems
    • Designed and developed a data lake and an auto-scaling system for distributed training and tuning of machine learning models.
    • Collaborated on a decision tree model for the static analysis layer of an endpoint security application.
    • Improved batch job scheduling and optimized cloud computing costs.
    Technologies: Dask, Python, Java, Machine Learning, Big Data, Amazon Web Services (AWS), Kubernetes, Apache Spark, Apache Airflow, Scala, Apache Flink, SQL, Jira
  • ML/Data Platform Architect

    2020 - 2022
    SNAFU Records (via Toptal)
    • Architected the company’s data ingestion, storage, and processing platform.
    • Designed and developed the company’s artist-producer matchmaking services.
    • Collaborated on due diligence and investment models.
    Technologies: Apache Airflow, Apache Spark, Amazon Web Services (AWS), Python, Machine Learning, Scala, Databricks, Data Engineering, Data Lake Design, Terraform, Datadog
  • Contract Senior Software Engineer

    2017 - 2019
    Private Client (via Toptal)
    • Designed and developed the core UI components, back-end services, API gateway, and cloud infrastructure.
    • Implemented the system integrations with third-party cloud image storage (Dropbox and Google Drive).
    • Developed and implemented the system's synchronization, deduplication, and reverse image search services.
    • Designed the subscription model and integrated the system with a payment gateway.
    Technologies: Amazon Web Services (AWS), Django REST Framework, Celery, Django, React, JavaScript, Heroku, Python, SQL
  • Senior Software Engineer

    2017 - 2018
    Cisco Systems
    • Implemented a GraphQL interface on the back end for the system's new UI.
    • Designed and implemented the project's secret management, monitoring, and centralized logging solutions.
    • Facilitated the project's effort to migrate to AWS.
    • Provisioned and maintained a Kubernetes cluster for testing, staging, and production environments.
    Technologies: Amazon Web Services (AWS), Python, Scala, Kubernetes, Puppet, Terraform, React, JavaScript, PostgreSQL, Spring, Java, SQL, Jira
  • Software Engineer

    2015 - 2017
    Deutsche Börse Group
    • Created a framework for QA engineers to perform integration and regression testing of the company's clearing (finance) and security services.
    • Analyzed, estimated, and fulfilled functional and nonfunctional requirements for complex financial market features.
    • Maintained the company's clearing (finance) and security services. Duties included, but were not limited to, analysis, bug fixing, refactoring, addressing performance issues, and internal third-line on-call support for financial market operations engineers.
    Technologies: Jenkins, Python, PostgreSQL, AMQP, JBoss EAP, Hibernate, Spring, Enterprise Java Beans (EJB), Java EE, SQL, Jira
  • Contract Software Engineer

    2015 - 2015
    Restoration Media (via Toptal)
    • Designed, developed, and maintained a cloud-based, highly scalable, and efficient ETL system to handle reporting event data from the company's targeted email marketing campaigns.
    • Designed and developed various webhooks, parsers, and clients for third-party APIs.
    • Facilitated the company's effort to create a single data warehousing solution.
    • Designed and developed data aggregation tools and dashboards for the company's data analysts and email marketing operations managers.
    • Performed numerous migrations from the company's obsolete databases with ad hoc schema updates and modifications.
    Technologies: Amazon Web Services (AWS), ETL, JavaScript, Bootstrap, Redis, Google BigQuery, MySQL, Celery, Django REST Framework, Django, Python, SQL
  • Lead Contract Software Engineer

    2013 - 2015
    Cisco Systems (via SoftServe)
    • Maintained a database and file-system storage for a corpus of HTTP transactions.
    • Developed a continuous deployment and monitoring strategy for the system and supervised its implementation process.
    • Devised and implemented an object model for manipulating and validating various versions of configuration files.
    • Designed and deployed a multiprocessing system for validating HTTP-capturing signatures and testing their performance on multiple engines in all possible configurations.
    Technologies: Django REST Framework, Linux, Jenkins, MySQL, Django, Python, SQL, Jira, Amazon Web Services (AWS)


Projects

  • Cisco Cognitive Threat Analytics

    I am currently a member of an R&D team responsible for researching cutting-edge ML technologies applied to the network security problem domain. The team also collects and documents global threat intelligence and maintains a cloud-based SIEM and UEBA solution that processes hundreds of terabytes of data daily.

  • Digital Artists’ Copyright Infringement Protection Platform (via Toptal)

    I was hired by a Toptal client to assist in the design and development of a SaaS application aimed at protecting digital artists against copyright infringement. My main responsibilities were designing and implementing the cloud infrastructure, back-end services, and pipelines that synchronize photos and videos from cloud storage providers and scan the internet for their occurrences, allowing the artists to identify infringers and take legal action. I also integrated the system with a payment gateway and designed the subscription models, REST API, and core UI components.

  • ETL Pipeline for Email Marketing Campaigns (via Toptal)

    A Toptal client approached me with a simple data migration problem. However, it turned out that the client was considering a data warehousing solution and an analytics platform. I ended up building a cloud-based, scalable ETL system for processing reporting event streams from email marketing. My responsibilities later pivoted to developing several aggregation jobs and dashboards for marketing campaign managers. It was the most amazing thing I had built single-handedly in quite some time.

  • Eurex Clearing’s C7

    While working for the world’s third-largest derivatives exchange, I joined a boring (at first sight) enterprise effort to modernize its old and rusty monolithic mainframe system into a service-oriented architecture. I have experienced firsthand all the pleasantries of taking apart an archaic codebase and building microservices while retaining the German quality of a banking application with no tolerance for the slightest mistake. I have learned a great deal about finance, quality assurance, and on-call support.

  • Cisco Application Visibility and Control (AVC)

    I led a team of software developers and QA engineers maintaining the services that supported Cisco's AVC engine. The services included a network traffic data warehousing solution and tools for compiling and validating the engine's signatures, testing their efficacy, and finally publishing the updated signature bundle to thousands of devices that had the AVC engine installed. The project spanned three cities and two time zones.


Skills

  • Languages

    Python, Java, Scala, SQL, HTML, JavaScript, Bash, Go, CSS
  • Frameworks

    Django, Flask, Spring Boot, Apache Spark, Akka, Play Framework, Hibernate, Django REST Framework
  • Tools

    Terraform, Apache Airflow, Celery, Jira, Helm, Splunk, Jenkins
  • Paradigms

    Object-oriented Programming (OOP), DevOps, ETL, Functional Programming
  • Platforms

    Amazon Web Services (AWS), Kubernetes, Docker, JBoss, Linux, Heroku, Apache Flink, Databricks
  • Storage

    Elasticsearch, Redis, PostgreSQL, MySQL, MongoDB, InfluxDB, Data Pipelines, Data Lake Design, Datadog
  • Other

    Cloud Computing, Data Engineering, Big Data, Natural Language Processing (NLP), Fintech, Information Security, Machine Learning, Finance, FTP, Managing Machine Learning Production Systems, Deployment Pipelines, Model Pipelines, Machine Learning Engineering for Production
  • Libraries/APIs

    Pandas, Scikit-learn, Dask, PySpark, React, AMQP


Education

  • Master's Degree in Applied Linguistics
    2008 - 2013
    Lviv Polytechnic National University - Lviv, Ukraine


Certifications

  • Machine Learning Engineering for Production (MLOps) Specialization
