Ievgen (Eugene) Morokin
Site Reliability Engineer and Developer
Eugene is an accomplished GTD DevOps and site reliability engineer (SRE) with six years of experience that includes old-school Linux and an extensive array of technologies and tools. His professionalism stands on three pillars: attention to small details, perfectionism, and the ability to predict the unpredictable. Eugene is a quick study who excels at identifying the best technologies and solutions for each situation.
Experience
Terraform - 6 years
Amazon Web Services (AWS) - 6 years
Python - 6 years
Ansible - 6 years
Docker - 4 years
Amazon Web Services (AWS), Geohash, OpenVAS, Git, PagerDuty, Opsgenie, Rsyslog, Fluentd, ELK (Elastic Stack), Tableau, Apache Airflow, Zeppelin, AWS EMR, Hadoop, Spark, ClickHouse, Tarantool, Memcached, Redis, PostgreSQL, MariaDB, MySQL, Okta, RabbitMQ, NATS, Envoy Proxy, HAProxy, Apache, OpenResty, NGINX, Google Cloud Platform (GCP), Apache ZooKeeper, Consul, Vault, Grafana, Prometheus, Jenkins, Groovy, Bash, Lua, Python, Docker, Ansible, Terraform
The most amazing thing I've implemented from the ground up is an SLA monitoring and reporting system for a geographically distributed VPN infrastructure.
Lead Site Reliability Engineer
- Implemented a CD pipeline for automated deployment of a VPN stack to a production fleet (over 600 hosts), drastically reducing time spent on toil work.
- Improved visibility, service quality, and customer experience, and reduced incident resolution time by implementing SLA monitoring and reporting.
- Implemented Geohash technology for proximity searches.
- Troubleshot and resolved complex issues, acting as higher-level technical support for team members.
- Led a geographically distributed team of multilingual engineers in multiple countries, coached and mentored team members, and motivated people to achieve business and personal goals in a timely manner.
- Conducted on-site onboarding of a contractor team located in Costa Rica and Bolivia.
- Planned projects and sprints, conducted retrospectives and performance reviews for team members, ensured team success and efficiency, and reported to stakeholders.
Site Reliability Engineer
- Dockerized and migrated key parts of an on-site Hadoop/Spark cluster to AWS. Fine-tuned AWS EMR to increase stability, improve performance (faster ETL job processing), and reduce costs.
- Migrated an on-site legacy Tableau server to AWS and retained data. Implemented automated provisioning/deployment with Terraform and Ansible and monitoring with Prometheus. Drastically improved stability, performance, and report quality as a result.
- Collaborated with the SecOps team to implement golden images and drove end-to-end deployment across the production fleet in both cloud and bare metal, thereby significantly improving security and stability.
- Drove end-to-end implementation/deployment of a standardized naming schema across the production fleet.
- Trained and assisted team members on various topics, including best practices and documentation writing.
- Troubleshot networking and performance issues across production and worked closely with vendors and developers on resolutions.
- Developed and supported a custom AWS cloud orchestration solution. Was responsible in particular for an EC2 Spot Instances Auto Scaling module that significantly reduced infrastructure costs.
- Performed migrations from shell script-based automation to Ansible and Terraform, and continuously developed new roles and modules for application deployment and infrastructure provisioning.
- Designed Grafana dashboards based on InfluxDB, Prometheus, and ClickHouse data sources for advanced monitoring and troubleshooting and effective cost control, and for use by BI and product teams.
- Automated Hybrid Cloud (AWS, on-premise KVM) operations, significantly reducing time spent on toil work.
- Performed a vulnerability assessment with OpenVAS, including issue analysis and security hardening on production hosts, thereby drastically reducing the number of security incidents.
- Optimized backup procedures for a MySQL server fleet, thereby reducing backup time.
AWS EC2 Spot Instances Auto Scaling Solution
Terraform, Ansible, Jenkins, Grafana, Vault, Apache ZooKeeper, NGINX, Apache, Envoy Proxy, RabbitMQ, Apache Airflow, Tableau, ELK (Elastic Stack), Fluentd, Rsyslog, Git, KVM/Qemu, Packer, GitLab, VPN, Amazon Elastic MapReduce (EMR), Jira, GitHub, Docker Compose
Amazon Web Services (AWS), Docker, Google Cloud Platform (GCP), OpenResty, Zeppelin, PagerDuty, Linux, CentOS
Python, Lua, Bash, Groovy
Spark, Hadoop, AWS EMR, OpenVAS
MySQL, MariaDB, PostgreSQL, Redis, Memcached, Tarantool, ClickHouse, MongoDB, Aerospike, InfluxDB, On-premise, Redshift
Network Security, IT Security
Prometheus, Consul, HAProxy, NATS, Okta, Opsgenie, Geohash, Content Delivery Networks (CDN), Akamai, DNS Servers, SSL Certificates, SSL Configurations, Proxies, Twemproxy, Autoscaling, High-load, SecOps, Hybrid Cloud Infrastructure, Teams, Service-level Agreements (SLA), CI/CD Pipelines, Infrastructure, Application Security
Network Security Expert
Technion - Israel Institute of Technology