Roman Gershkovich, Developer in Moscow, Russia

Roman Gershkovich

Verified Expert in Engineering

Software Developer

Location
Moscow, Russia
Toptal Member Since
July 4, 2019

Roman launched his Linux engineering career at Yandex, one of the leading IT companies in Russia. Since then, he’s worked for several tech businesses—most notably spending two years at Amazon Web Services. Roman's built scalable & reliable infrastructures, worked with bare metal & cloud solutions, maintained Kubernetes, built CI/CD processes in Jenkins & GitLab CI, and implemented infrastructure-as-code automation and testing (Puppet & RSpec).

Portfolio

OZON
PostgreSQL, Foreman, Kubernetes
Lazada (Alibaba Group)
GitLab, Puppet, Aerospike, Elasticsearch, Prometheus, Kubernetes, Docker
Amazon Web Services
Amazon Simple Queue Service (SQS), Amazon Route 53, AWS CloudFormation...

Availability

Part-time

Preferred Environment

CentOS, Ubuntu, Kubernetes, Terraform, Puppet

The most amazing...

...thing I’ve done was to design, deploy, and maintain Kubernetes clusters in multiple environments for nearly 50 teams which produce 750+ apps/year at OZON.

Work Experience

Linux Platform Team Leader, Infrastructure Services

2018 - 2019
OZON
  • Built and led the team that created and maintained the Linux infrastructure for a transition project: a migration from a monolithic Windows-based infrastructure with MS SQL databases to Go microservices with smaller, dedicated PostgreSQL databases.
  • Selected the hardware to purchase.
  • Set up Foreman and Puppet with a full CI/CD multi-master for HW/VMs (libvirtd) provisioning and configuration management.
  • Created the necessary bits and pieces like GitLab installation, apt-mirror, my own package repository, image registry, and more.
  • Implemented several Kubernetes clusters for different environments; coming a long way from virtualized nodes and Kubeadm to 250 containers per physical machine and a custom Puppet module for management.
Technologies: PostgreSQL, Foreman, Kubernetes

Senior Systems Engineer, Infrastructure Services

2016 - 2018
Lazada (Alibaba Group)
  • Oversaw and was responsible for the low- to middle-level part of Lazada’s infrastructure: hardware provisioning, preparing systems for deployments, automating databases according to DBA requirements, and providing monitoring stats.
  • Managed the Puppet installation in multiple data centers, reviewed and approved MRs for Puppet code, managed CI/CD pipeline for environment updates, tests, and more.
  • Migrated part of the Elasticsearch index to Aerospike in persistent storage mode with SSDs.
  • Transitioned from Supervisord to Docker for services deployments.
  • Enabled inventory and provisioning by using FusionInventory, GLPI, Cobbler, and xCAT.
Technologies: GitLab, Puppet, Aerospike, Elasticsearch, Prometheus, Kubernetes, Docker

Systems Engineer, Amazon WorkMail

2015 - 2016
Amazon Web Services
  • Worked on scenarios for the automated expansion of WorkMail to new AWS regions using CloudFormation and internal tooling.
  • Took part in the development of in-house Python software for instance management (full recycling during deployments, max uptime management, and other special requirements).
  • Managed the Jenkins CI operations (~150 Linux/Windows slaves).
  • Made performance and cost optimizations by testing and selecting optimal instance types and migrating to spot instances where possible.
  • Supported existing stacks and wrote new Chef Solo manifests for all stacks in OpsWorks.
Technologies: Amazon Simple Queue Service (SQS), Amazon Route 53, AWS CloudFormation, AWS CloudTrail, Amazon CloudWatch, Amazon S3 (AWS S3), AWS OpsWorks, Amazon Virtual Private Cloud (VPC), Amazon EC2

System Administrator and Team Leader, Yandex.Market

2011 - 2014
Yandex
  • Transitioned from IP v4 to IP v6 for internal data transfers.
  • Migrated from lighttpd to Nginx for more than 50 virtual hosts.
  • Handled MySQL administration, backups/restorations, and slave load balancing.
  • Unified the OS version in use, upgrading the whole fleet (~1,500 physical and virtual boxes) to Ubuntu 12.04 (Precise) where possible and using containers where needed.
  • Optimized hardware usage patterns and reduced the number of user configurations.
  • Migrated two acquired services to Yandex's internal infrastructure.
Technologies: MongoDB, MySQL, Graphite, libvirt, LXC, KVM, Ubuntu

Migration from IP v4 to IP v6 for Internal Data Transfers, Early 3.x Kernel

Due to the new networking design in modern DCs without IP v4 available for the "fastbone" network, we had to migrate to IP v6 for internal data transfers at Yandex.Market. It was a challenging task because of multiple factors:

01. We were inexperienced with running dual-stacked systems (IP v4 was needed for the "backbone" network to serve real user traffic).
02. At the time, the IP v6 implementation in the Linux kernel was pretty buggy.
03. We used BitTorrent for internal data transfers, so we had to find a tracker and a client that properly supported IP v6; it was not straightforward back then (we ended up with Opentracker and Aria2c).
04. Even the basic infrastructure needed to be updated to work with IP v6, e.g., automation to generate reverse DNS for AAAA records.
05. A rollback plan was a must since this process of updating data was business-critical.
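
As an illustration of the reverse DNS point above, here is a minimal sketch of generating PTR entries from AAAA data with Python's standard `ipaddress` module. It is not the tooling used at Yandex.Market; the hostnames, the addresses (taken from the 2001:db8::/32 documentation prefix), and the `ptr_record` helper are all assumptions made for the example.

```python
import ipaddress


def ptr_record(hostname: str, address: str) -> str:
    """Return a zone-file style PTR line for one IPv6 address (hypothetical helper)."""
    ip = ipaddress.IPv6Address(address)
    # reverse_pointer gives the nibble-reversed name under ip6.arpa.
    return f"{ip.reverse_pointer}. IN PTR {hostname}."


# Hypothetical AAAA data; real input would come from the DNS or inventory system.
aaaa_records = {
    "node1.example.net": "2001:db8::1",
    "node2.example.net": "2001:db8::2",
}

ptr_lines = [ptr_record(host, addr) for host, addr in sorted(aaaa_records.items())]

for line in ptr_lines:
    print(line)
```

Each IPv6 address expands to 32 hexadecimal nibbles, so reverse names are long and error-prone to write by hand, which is why this step tends to be automated.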

I was not the only one working on this project, but I was responsible for the infrastructure side, i.e., choosing the right open-source solutions, setting up and testing IP v6 everywhere, and cooperating with networking operations for debugging purposes. Eventually, we managed to solve the issues above and fully migrate to IP v6, which was genuinely satisfying!

Production-grade Kubernetes on Bare Metal at OZON

Managing Kubernetes is hard. While bringing up GKE or EKS clusters and deploying sample "Hello world" apps is made super-easy, running Kubernetes on your own hardware and actually managing configurations without relying on kubeadm to fully control what's going on in your cluster is a totally different story.

Choosing the right CNI plugin, thinking about how to organize load balancing, certificate management, version upgrades, compatibility between different components, optimal hardware usage with full (but not critical) saturation, and so on—all this needs planning and careful execution of changes.

At OZON, I was the point person responsible for these operations and bootstrapped several clusters with Calico, iBGP peering for an L3-routable container network, Puppet-managed configurations, and dynamic node provisioning. The architecture evolved significantly from KVM nodes with limited resources to full hardware nodes running up to 250 containers per machine simultaneously. I also took part in common pipeline development for deployments, organized fair requests/limits for namespaces, and integrated Vault for secrets and more.
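
To make the requests/limits point concrete, here is a rough sketch of how per-namespace quotas could be generated as Kubernetes `ResourceQuota` manifests. It does not describe the scheme actually used at OZON: the equal split, the `limit_factor` of 2, the namespace names, the capacity figures, and both helper functions are assumptions for illustration only.

```python
import json


def namespace_quota(namespace, cpu_cores, memory_gib, limit_factor=2):
    """Build a ResourceQuota manifest (as a dict) for one namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "team-quota", "namespace": namespace},
        "spec": {
            "hard": {
                # Sum of container requests allowed in the namespace.
                "requests.cpu": str(cpu_cores),
                "requests.memory": f"{memory_gib}Gi",
                # Limits may exceed requests by limit_factor (an assumption).
                "limits.cpu": str(cpu_cores * limit_factor),
                "limits.memory": f"{memory_gib * limit_factor}Gi",
            }
        },
    }


def fair_quotas(namespaces, total_cpu, total_memory_gib):
    """Split cluster capacity equally across namespaces (assumed policy)."""
    share_cpu = total_cpu // len(namespaces)
    share_mem = total_memory_gib // len(namespaces)
    return [namespace_quota(ns, share_cpu, share_mem) for ns in namespaces]


# Hypothetical namespaces and capacity figures.
quotas = fair_quotas(["team-a", "team-b"], total_cpu=64, total_memory_gib=256)
print(json.dumps(quotas[0], indent=2))
```

An equal split is only the simplest reading of "fair"; a real policy might weight shares by team size or historical usage.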

Education

2007 - 2012

Master's Degree in Information Technology, Applied Informatics in Economics

Russian State Tax Academy - Moscow, Russia

Tools

Puppet, NGINX, RSpec, GitLab CI/CD, GitLab, Git, KVM/Qemu, AWS OpsWorks, AWS CloudFormation, Amazon Simple Queue Service (SQS), Amazon Virtual Private Cloud (VPC), Terraform, Amazon CloudWatch, AWS CloudTrail, Google Kubernetes Engine (GKE), Jenkins, TeamCity, lighttpd

Platforms

Ubuntu, CentOS, Kubernetes, Docker, KVM, Amazon EC2

Languages

Python, Bash Script, Ruby

Libraries/APIs

libvirt

Storage

MySQL, IP Virtual Server (IPVS), PostgreSQL, Aerospike, Memcached, MongoDB, Amazon S3 (AWS S3), ClickHouse, Elasticsearch

Other

Virtual Router Redundancy Protocol (VRRP), Foreman, Packaging, Prometheus, HAProxy, LXC, Amazon Route 53, Cobbler, Graphite
