
Adrian Sandulescu

Verified Expert in Engineering

Software Developer

Location
Bucharest, Romania
Toptal Member Since
July 29, 2019

Adrian has over seven years of experience working with near-petabyte-scale big data applications based on microservices. He also has extensive experience in automating, monitoring, and deploying complex microservice architectures. Adrian is a versatile professional looking forward to his next role.

Availability

Part-time

Preferred Environment

Ruby, Bash, Git, Emacs, Ubuntu

The most amazing...

...analytics pipeline I have automated processes billions of daily events using cutting-edge tech like Apache Druid, Flink, and Kafka.

Work Experience

DevOps Engineer

2012 - PRESENT
Adswizz
  • Automated and deployed both Lambda and Kappa big data analytics pipelines.
  • Automated and deployed countless microservices.
  • Reduced operational costs by identifying inefficiencies and implementing new technologies.
  • Designed and wrote in-house Puppet modules.
  • Structured the CloudFormation template deployment scheme using Troposphere and Sceptre.
  • Established Puppet deployment flow and external module structure.
  • Introduced and implemented Kubernetes.
  • Introduced and implemented the immutable infrastructure paradigm for deployments.
  • Introduced and implemented Spinnaker for industry-leading deployment automation.
  • Introduced and implemented Prometheus.
  • Developed CI/CD pipelines using Jenkins and Concourse.
  • Dockerized applications.
  • Migrated applications to Kubernetes.
Technologies: Apache ZooKeeper, Amazon EKS, Concourse CI, lighttpd, AWS CloudTrail, Kubernetes Operations (kOps), HAProxy, AWS Elastic Beanstalk, Amazon Virtual Private Cloud (VPC), Immutable Infrastructure, Sceptre, AWS CLI, AWS Auto Scaling, AWS ELB, Amazon CloudFront CDN, Amazon DynamoDB, Apache2, Apache Tomcat, Amazon Elastic Container Registry (ECR), Redshift, Apache Flink, AWS Lambda, Amazon Kinesis, AWS CodeCommit, Amazon ElastiCache, AWS CodeDeploy, Amazon Simple Email Service (SES), MySQL, Amazon CloudWatch, Amazon Elastic MapReduce (EMR), Amazon Glacier, Amazon Route 53, AWS Cloud Computing Services, Amazon S3 (AWS S3), Amazon EC2, AWS IAM, AWS CloudFormation, Amazon EBS, Elasticsearch, MongoDB, HBase, Druid.io, Apache Kafka, Hadoop, Flink, Jenkins, Spinnaker, Kubernetes, Docker, Prometheus, ELK (Elastic Stack), Grafana, Graphite, Nagios, Terraform, Troposphere, Puppet, Packer, Bash, Go, Python, Ruby, Amazon Web Services (AWS)

Sysadmin

2010 - 2012
Horia Hulubei National Institute of Physics and Nuclear Engineering
  • Maintained department web and email servers.
  • Deployed and maintained bare metal grid computing infrastructure.
  • Deployed and provided support for various scientific software suites.
  • Deployed and provided support for personal user workstations.
  • Provided hardware support for servers, workstations, and printers.
Technologies: Fedora, CentOS, Ubuntu, Sendmail, Apache, Nagios, Puppet, Bash

Spot Fleet with EBS Reattach for Druid.io

While working on a Kappa-architecture analytics pipeline, I ran into issues with the database nodes being very expensive. The only way to significantly lower the costs was to use AWS spare cloud capacity (Spot Instances), made available at a much lower cost but with an extremely high risk of losing the virtual machines, sometimes multiple times per day.

The lost VMs could be replaced, but any new VM had to reload data from cloud storage (S3) over the course of several hours before becoming fully operational. During this time, the database would lose replication and would be vulnerable to downtime, a risk we couldn't take.

To solve this, I used a script that allowed newly launched VMs to reuse the virtual hard disks left behind by any lost VMs, making them available as soon as they were launched.
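
A minimal sketch of the reattach step, in Bash with the AWS CLI; the volume tag, region, and device name below are hypothetical placeholders, and a real script would also have to match availability zones:

```bash
#!/usr/bin/env bash
# Sketch: reattach a leftover EBS data volume to a freshly launched Spot VM
# so the Druid node comes up with its segment cache already on disk.
# The "role=druid-data" tag, region, and device name are placeholders.
set -euo pipefail

REGION="eu-west-1"
# Instance ID of the VM this script runs on (IMDSv1 call for brevity).
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

# Look for a detached volume left behind by a terminated node.
VOLUME_ID=$(aws ec2 describe-volumes --region "$REGION" \
  --filters "Name=tag:role,Values=druid-data" "Name=status,Values=available" \
  --query 'Volumes[0].VolumeId' --output text)

if [ "$VOLUME_ID" != "None" ]; then
  aws ec2 attach-volume --region "$REGION" \
    --volume-id "$VOLUME_ID" --instance-id "$INSTANCE_ID" --device /dev/xvdf
  aws ec2 wait volume-in-use --region "$REGION" --volume-ids "$VOLUME_ID"
else
  echo "No detached volume found; a fresh volume would be created here." >&2
fi
```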

Introduced Spinnaker

While working with a high-volume Tomcat application deployed on hundreds of nodes, I ran into issues with the deployment procedure being both slow and time-consuming (it was not completely automated because it was too complex).

The deployment procedure involved both a manual canary step (where only one of the servers would be updated and monitored for errors pending approval) and a rolling update step (where the entire fleet would be updated a few VMs at a time).

Due to the number of VMs that needed to be updated, the entire deployment could take up to 30 minutes; worse, a rollback could also take up to 30 minutes.

To improve this, I implemented Spinnaker, a tool that automates all deployment steps (including the canary and manual approval by stakeholders) as a deployment pipeline and also supports red/black deployments.

In a red/black deployment, a completely new set of VMs is provisioned and traffic is routed to them. To roll back, traffic simply needs to be routed back to the VMs running the previous application version, meaning rollbacks could now be performed in seconds instead of tens of minutes, minimizing the impact of failed deployments.
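
Spinnaker orchestrates this itself; the sketch below only illustrates the rollback mechanic, not Spinnaker internals. Assuming the old and new server groups sit behind separate, hypothetical ALB target groups, rolling back is just repointing the listener:

```bash
#!/usr/bin/env bash
# Illustration of why red/black rollbacks are fast: the previous server group
# stays provisioned, so rolling back is a traffic switch, not a redeploy.
# The listener and target group ARNs are hypothetical placeholders.
set -euo pipefail

LISTENER_ARN="arn:aws:elasticloadbalancing:eu-west-1:111111111111:listener/app/example/abc/def"
PREVIOUS_TG_ARN="arn:aws:elasticloadbalancing:eu-west-1:111111111111:targetgroup/app-v41/xyz"

# Point all traffic back at the server group running the previous version.
aws elbv2 modify-listener \
  --listener-arn "$LISTENER_ARN" \
  --default-actions Type=forward,TargetGroupArn="$PREVIOUS_TG_ARN"
```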

Implemented Kube2Iam in Kubernetes

While working with applications deployed in Kubernetes that required access to AWS cloud services, I ran into the challenge of using dynamically generated credentials (to ensure proper rotation) while also making sure that applications could not use each other's credentials.

While AWS does facilitate using dynamic credentials by assigning a role to each VM, this would mean that all pods running on a VM would have the same access policy.

To solve this issue, I adopted an application called Kube2Iam that proxies access to the AWS credentials endpoint and allows a separate access policy to be configured for each pod running on a VM.
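
With kube2iam running on the nodes, the per-pod policy boils down to a role annotation on the pod; a minimal sketch, with a hypothetical namespace, role name, and image:

```bash
#!/usr/bin/env bash
# Sketch: grant one pod its own AWS access policy via a kube2iam annotation.
# Namespace, role name, and container image are hypothetical placeholders.
set -euo pipefail

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: analytics-worker
  namespace: analytics
  annotations:
    # kube2iam intercepts the pod's calls to the EC2 metadata endpoint and
    # returns temporary credentials for this role only.
    iam.amazonaws.com/role: analytics-reader
spec:
  containers:
    - name: worker
      image: example/analytics-worker:latest
EOF
```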

Kafka Virtual Hard Disk Performance Optimization

While working with Kafka in AWS, I ran into an issue with the throughput-optimized virtual hard disks used on the Kafka VMs.

The maximum throughput of the disks was several times lower than what was advertised, meaning both a performance penalty and a cost penalty if we were to switch to using SSDs.

Digging through the documentation, I found that the advertised throughput is only guaranteed if the write size is at least 1 MB.

Kafka is optimized specifically for efficient, large sequential writes, so this was most likely a kernel configuration issue.

Searching for more information online, I found that the maximum write size for the OS we were using was limited to 256 KB and that a kernel boot parameter change was needed before it could be increased after boot. A new VM image was created with the required changes, solving the issue.
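
A hedged sketch of the check involved; the device name is a placeholder, and the exact boot parameter (a Xen block driver setting in this case) depends on the kernel version, so it is left out:

```bash
#!/usr/bin/env bash
# Sketch: inspect and raise the block-layer I/O size limit on a Kafka data
# disk. /dev/xvdf is a placeholder device name.
set -euo pipefail

DEV=xvdf

# The advertised EBS throughput needs ~1 MB writes; on the affected kernels
# the hardware ceiling reported here was stuck at 256 KB until the boot
# parameter change (baked into a new VM image) raised it.
cat /sys/block/$DEV/queue/max_hw_sectors_kb
cat /sys/block/$DEV/queue/max_sectors_kb

# Once the new image is rolled out, the soft limit can be raised so Kafka's
# large sequential writes reach the disk as 1 MB I/Os.
echo 1024 | sudo tee /sys/block/$DEV/queue/max_sectors_kb
```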

Introduced the Immutable Infrastructure Paradigm

Working extensively with Puppet to automate server provisioning, I ran into the issue of new servers taking a very long time to provision, since every new server had to install and configure all required packages.

To tackle this limitation, I pioneered and encouraged the shift to immutable infrastructure, using Packer to create VM images.

Since new servers would now be provisioned from pre-configured images, launching them would be several times faster.

As an added benefit, configuration drift would no longer be possible.
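
A minimal sketch of such an image build with Packer; the source AMI, region, instance type, and provisioning script are placeholders:

```bash
#!/usr/bin/env bash
# Sketch: bake a pre-configured VM image with Packer so new servers launch
# ready to serve instead of being provisioned at boot. All IDs are placeholders.
set -euo pipefail

cat > image.json <<'EOF'
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "eu-west-1",
    "source_ami": "ami-0123456789abcdef0",
    "instance_type": "t3.small",
    "ssh_username": "ubuntu",
    "ami_name": "app-image-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "script": "provision.sh"
  }]
}
EOF

# Validate and build; servers launched from the resulting AMI have nothing
# left to install or configure, so there is nothing to drift.
packer validate image.json
packer build image.json
```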

Languages

Bash, Ruby, Python, Go

Tools

AWS ELB, AWS CLI, Amazon EBS, AWS CloudFormation, AWS IAM, Puppet, Amazon Elastic MapReduce (EMR), Amazon Elastic Container Registry (ECR), Amazon Simple Email Service (SES), AWS CodeDeploy, Amazon ElastiCache, AWS CodeCommit, Amazon CloudWatch, Apache ZooKeeper, Packer, Nagios, Grafana, ELK (Elastic Stack), Amazon Virtual Private Cloud (VPC), Emacs, Git, Flink, Apache, Sendmail, AWS CloudTrail, Apache Tomcat, lighttpd, Terraform, Jenkins, Concourse CI, Amazon EKS, Amazon CloudFront CDN, Apache Druid

Platforms

Amazon EC2, AWS Cloud Computing Services, AWS Lambda, AWS Elastic Beanstalk, Apache Flink, Apache Kafka, Docker, Amazon Web Services (AWS), Ubuntu, CentOS, Fedora, Apache2, Kubernetes, Spinnaker

Storage

Amazon S3 (AWS S3), Redshift, Druid.io, HBase, MongoDB, MySQL, Amazon DynamoDB, Elasticsearch

Other

AWS Auto Scaling, Amazon Kinesis, Amazon Glacier, Amazon Route 53, Sceptre, Troposphere, Graphite, Immutable Infrastructure, HAProxy, Prometheus, Kubernetes Operations (kOps)

Frameworks

Hadoop

Education

2008 - 2010

Master's Degree in Investment Management

Academy of Economic Studies - Bucharest, Romania

2005 - 2008

Bachelor's Degree in Business Administration

Academy of Economic Studies - Bucharest, Romania
