Dmitry Kireev, Developer in Walnut, CA, United States

Dmitry Kireev

Cloud Architect Developer

Location
Walnut, CA, United States
Toptal Member Since
November 21, 2019

Dmitry is a cloud architect and site reliability engineer with over a decade of professional experience grounded in the DevOps methodology. He has architected and built multiple platform-agnostic infrastructures from scratch for modern cloud systems. Dmitry has a proven track record of hands-on operations in high-scale environments. He is also proficient with IaC, automation, scripting, monitoring, and observability.

Portfolio

HazelOps
Amazon Elastic Container Service (Amazon ECS), AWS DevOps, GNU Make...
Flo Technologies
AWS DevOps, GNU Make, Amazon Web Services (AWS), Transport Layer Security (TLS)...
Delphix
AWS DevOps, Amazon Web Services (AWS), Python, AWS CloudFormation, Foreman...

Availability

Part-time

Preferred Environment

AWS CloudFormation, GitLab, Terraform, Ansible, Linux, GitHub, Docker, Amazon Web Services (AWS), DevOps, SSL Certificates, Digital Certificates

The most amazing...

...thing I've architected, deployed, and managed is a scalable, highly available cloud for an IoT security product alongside the software engineering team.

Work Experience

2015 - PRESENT

Head of Site Reliability Engineering | Consultant

HazelOps
  • Built scalable infrastructures for startups: multi-environment, with infrastructure as code, self-healing, scalable, and predictable environments on AWS.
  • Dockerized legacy JVM, PHP, and Python apps.
  • Audited infrastructure performance and produced dozens of full-cycle reports covering key performance factors, with proposals and action items.
  • Helped software engineers implement DevOps, including close communication, strategy, and processes improvement.
  • Instrumented site reliability practices by owning SLAs, SLOs, and SLIs, eliminating toil, and increasing observability through automation, monitoring, and error budgeting.
  • Implemented CI/CD, providing a streamlined deployment pipeline for dozens of projects on GitLab, Jenkins, and CircleCI. Used Docker with a registry and multi-stage builds.
  • Created OPS procedures in customers' environments, including service-based alerting, on-call rotation, and escalations.
  • Deployed and maintained Apache Kafka, including full-cycle management via Terraform, Ansible, and Docker.
Technologies: Amazon Elastic Container Service (Amazon ECS), AWS DevOps, GNU Make, Amazon Web Services (AWS), Grafana, Traefik, HAProxy, Python, WordPress, PHP, Java, Serverless, ECS, Docker Swarm, Docker, Ansible, AWS CloudFormation, Terraform, NGINX, DevOps, SSL Certificates, Digital Certificates
2016 - 2019

Lead Site Reliability Engineer

Flo Technologies
  • Designed and executed a complex IoT infrastructure from scratch on AWS: multi-tier, multi-subnet scalable cloud AWS infrastructure, multi-application stateless stack with Elastic Beanstalk and ECS and Docker, platform-agnostic local workspaces with Docker.
  • Created and administered Ansible infrastructure: idempotent plays and roles to support infrastructure needs and wrote community-available roles for multiple platforms under Apache Foundation.
  • Designed and implemented CI/CD: complete application lifecycle with green deployments of high-traffic services, platform-agnostic framework to support SaaS or hosted CI servers, and hassle-free pipelines for software engineers.
  • Constructed and administered monitoring solutions: log and data aggregation from multiple sources (ELK), on-prem monitoring via TICK, Grafana. SaaS monitoring with Datadog and New Relic when needed.
  • Devised and executed operational procedures: service-oriented OLA, PagerDuty with monitoring solutions, and a PagerDuty "Service Owner First" policy.
  • Created and maintained an upgrade procedure for critical distributed systems that allowed zero-downtime, zero-data-loss upgrades throughout the three-year period.
Technologies: AWS DevOps, GNU Make, Amazon Web Services (AWS), Transport Layer Security (TLS), Linux, CircleCI, Docker, TICK Stack, ELK (Elastic Stack), GitLab, Apache Kafka, Ansible, AWS CloudFormation, Terraform, DevOps, SSL Certificates, Digital Certificates
2016 - 2017

Senior Member of Technical Staff

Delphix
  • Architected and implemented multi-tier hybrid cloud AWS infrastructure for a new project for a high-scale testing framework.
  • Constructed log and data aggregation from multiple sources (ELK).
  • Created a virtual and bare-metal host provisioning system (Foreman).
  • Designed and implemented Nmap-based inventory software.
  • Contributed to company-wide IT processes and improvements.
  • Contributed major portions of the on-call rotation, monitoring, SOA, and OLA designs and implementations.
Technologies: AWS DevOps, Amazon Web Services (AWS), Python, AWS CloudFormation, Foreman, Ansible, ELK (Elastic Stack), Jenkins, Terraform, DevOps, SSL Certificates, Digital Certificates
2013 - 2016

Senior DevOps Engineer

Intuit
  • Managed a hybrid cloud with around 300 nodes: AWS, VMware, and bare metal.
  • Implemented automation, config management, and provisioning, with 90% of the environment managed in Puppet and Git.
  • Managed the lifecycle of legacy systems: .NET, C#, and automation of manually deployed systems.
  • Provided CI in configuration management and IaC: GitFlow, reusable code, and open-source contribution.
  • Managed and mentored junior IT staff, including separation of concerns and easy onboarding.
  • Led most of the post-acquisition infrastructure integration projects.
Technologies: AWS DevOps, Amazon Web Services (AWS), Foreman, Git, TeamCity, ELK (Elastic Stack), Puppet, Terraform, DevOps, SSL Certificates, Digital Certificates
2011 - 2013

DevOps Engineer

Docstoc (Acquired by Intuit)
  • Supported colocation with 180+ Windows and Linux dedicated servers as well as new server deployment.
  • Managed network security and performance (Juniper SSG, SRX firewalls, A10 Networks load balancer, RADIUS, IPsec, NAT, Amazon EC2 VPC).
  • Implemented proactive monitoring using Nagios, ELK, and New Relic.
  • Optimized Linux and Windows server performance for high scale.
  • Deployed and maintained on-premise MySQL databases.
  • Introduced and implemented the ELK stack (Elasticsearch, Logstash, Kibana).
Technologies: Amazon Web Services (AWS), AWS DevOps, Nagios, Bash, Python, MongoDB, MySQL, LB, Juniper, DevOps, SSL Certificates, Digital Certificates

Experience

ICMK - Infrastructure as Code Make Framework

https://github.com/hazelops/icmk
This framework is an attempt to create a convenient way to manage infrastructure as code with a low barrier of entry for the runner.

The idea is to use GNU Make as a vehicle for wrapping the complexity and presenting a nice runner experience.

This way, a coherent set of commands can be used locally or on the CI, as simple as "make deploy."

Article: Runner Experience Design

https://automationd.com/runner-experience-design/
I’m an adherent of the Credo of Phoenix approach to infrastructure design: whatever you build should be able to be rebuilt with little to no effort, over and over again, by anyone or anything with sufficient permissions.

While this poetic name for Idempotent Infrastructure covers many important technical characteristics, this time I’d like to talk about the other side of it: “anyone or anything with sufficient permissions”, that is, runners and their experience.

Article: How to Avoid Human Bottlenecks in Production

https://automationd.com/how-to-avoid-human-bottlenecks-in-production/
There is no doubt we’ve all heard the term “bottleneck”: a bottleneck is one process in a chain of processes, such that its limited capacity reduces the capacity of the whole chain (Wiki).

Generally speaking, running a larger business requires multiple people to perform ideation, design, project management, development, QA, marketing, and infrastructure operations. When a single human limits the capacity of a team, that person becomes a Human Bottleneck.

In this post I’d like to highlight two distinct types of Human Bottlenecks, both of which can have a negative impact on the productivity of the team from the perspective of Operations and Site Reliability.
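
The bottleneck definition quoted in this excerpt can be illustrated with a short sketch. It treats a team as a chain of stages and assumes each stage has a measurable capacity (say, work items per week). The stage names and numbers are hypothetical and do not come from the article.

```python
def chain_capacity(stages: dict[str, float]) -> tuple[str, float]:
    """Return the limiting stage and the capacity of the whole chain.

    A chain can only move as fast as its slowest stage, so the overall
    capacity equals the minimum capacity among the stages.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    bottleneck = min(stages, key=stages.get)
    return bottleneck, stages[bottleneck]


# Hypothetical weekly capacities, for illustration only.
team = {"development": 12, "qa": 9, "operations": 3}
stage, capacity = chain_capacity(team)
print(f"bottleneck: {stage}, chain capacity: {capacity}")
```

In this toy model, adding capacity to any stage other than the limiting one leaves the chain's throughput unchanged.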

OpenVPN AS Docker + DUO Security

https://github.com/AutomationD/docker-openvpnas
This image incorporates OpenVPN Access Server with Duo Security two-factor authentication. All configuration is done via environment variables. For example, OPENVPN_VPN__DAEMON__0__LISTEN__IP_ADDRESS is mapped to the key vpn.daemon.0.listen.ip_address; that key is looked up in the existing configuration files (as.conf and config.json) and set to the value of the environment variable.

Duo Security is optional but highly recommended, since the basic account is free. All you need to do is get API credentials and enable the post-auth script.
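
A minimal sketch of how such a name-to-key mapping could work is shown below. It is not the image's actual implementation. It assumes the OPENVPN_ prefix is dropped, each double underscore becomes a dot, single underscores are left untouched, and the result is lowercased. The function names and the `key=value` file layout are hypothetical.

```python
PREFIX = "OPENVPN_"  # assumed prefix, taken from the example variable name


def env_to_key(name: str) -> str:
    """Translate an environment variable name into a dotted config key."""
    if not name.startswith(PREFIX):
        raise ValueError(f"{name!r} does not start with {PREFIX!r}")
    # Double underscores separate key segments; single ones are kept.
    return name[len(PREFIX):].replace("__", ".").lower()


def apply_overrides(conf_text: str, env: dict) -> str:
    """Rewrite `key=value` lines whose key matches a translated variable.

    Variables without the prefix are ignored, and keys that are not
    already present in the text are not added.
    """
    overrides = {env_to_key(k): v for k, v in env.items() if k.startswith(PREFIX)}
    result = []
    for line in conf_text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip() in overrides:
            line = f"{key.strip()}={overrides[key.strip()]}"
        result.append(line)
    return "\n".join(result)


# Illustrative values only.
sample = "vpn.daemon.0.listen.ip_address=all\nother.key=1"
env = {"OPENVPN_VPN__DAEMON__0__LISTEN__IP_ADDRESS": "10.0.0.5", "HOME": "/root"}
print(apply_overrides(sample, env))
```

Handling config.json would need a JSON-aware variant of `apply_overrides`; only the flat `key=value` case is sketched here.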

Windows Imaging Toolkit

https://github.com/AutomationD/wimaging
WImaging is a set of scripts to prepare WIM images and templates for Foreman to provision Windows hosts. Most of the time, official Microsoft deployment tools are used—mostly dism.exe.

All relevant configuration files like unattend.xml are rendered by Foreman and downloaded at build time.

Skills

Tools

GNU Make, Ansible, AWS CloudFormation, ELK (Elastic Stack), GitLab, GitLab CI/CD, Terraform, Docker Compose, Grafana, Telegraf, CircleCI, Travis CI, Traefik, Amazon CloudWatch, Amazon Elastic Container Service (Amazon ECS), GitHub, Docker Swarm, NGINX, Puppet, Jenkins, TeamCity, Git, Nagios, Makefile, AWS CodeDeploy

Paradigms

Agile, Continuous Delivery (CD), Continuous Integration (CI), DevOps, Automation, Agile Software Development

Platforms

Docker, Amazon Web Services (AWS), AWS Elastic Beanstalk, Amazon EC2, Apache Kafka, JVM, Heroku, Linux, Azure, WordPress

Other

GitHub Actions, AWS DevOps, SSL Certificates, Digital Certificates, Networking, TICK Stack, Transport Layer Security (TLS), Foreman, Juniper, LB, ECS, Serverless, Site Reliability Engineering (SRE), HAProxy

Languages

Python, Bash, SQL, Java, PHP, Markdown, Go, JavaScript

Frameworks

Flask

Storage

MySQL, MongoDB, InfluxDB, Elasticsearch, Redis, MySQL/MariaDB

Libraries/APIs

Node.js

Education

2006 - 2009

Bachelor's Degree in Business Communication (English)

Tula State University - Tula, Russia

2004 - 2009

Bachelor's Degree in Economics and Business Administration

Tula State University - Tula, Russia