Aleksander Luiz Lada Arruda, Distributed Computing Developer in São Paulo - State of São Paulo, Brazil

Member since September 13, 2018
Aleksander is a DevOps and site reliability engineer with extensive experience in cloud-native technologies. He holds a bachelor’s degree in computer science, has deployed and managed production-grade clusters such as Kubernetes, Kafka, and Elasticsearch, and has worked on microservice architecture and everything that comes with it, including container orchestration, service discovery, message queues, monitoring, logging, and tracing.


  • Pypestream, Inc.
    Kubernetes, Elasticsearch, Ceph, Jenkins, Ansible, Prometheus, Grafana
  • Nezasa AG
    Amazon Web Services (AWS), Terraform, Jenkins, HAProxy, Kong, Heroku
  • Audsat
    Amazon Web Services (AWS), Kubernetes, Elasticsearch, Fluentd, GoCD, Datadog...


  • Linux, 10 years
  • Distributed Computing, 7 years
  • C++, 6 years
  • Continuous Delivery (CD), 4 years
  • Go, 2 years
  • Kubernetes, 2 years
  • Chef, 2 years
  • Terraform, 2 years





Preferred Environment

Fast-paced with highly skilled professionals.

The most amazing...

...thing I’ve written was a multi-cluster Kafka setup providing very high availability to receive incoming app data from a company with over a billion downloads.


  • Senior DevOps Engineer

    2019 - PRESENT
    Pypestream, Inc.
    • Created numerous Jenkins pipelines in Groovy for deploying both infrastructure and applications to Kubernetes.
    • Provided 24/7 on-call support and handled operational issues of any kind as they arose.
    • Deployed and upgraded well-known clusters and databases.
    • Developed several solutions for backing up clusters and applications.
    • Containerized multiple applications.
    Technologies: Kubernetes, Elasticsearch, Ceph, Jenkins, Ansible, Prometheus, Grafana
  • DevOps Consultant

    2018 - 2019
    Nezasa AG
    • Modified the deployment of multiple parts of their infrastructure (such as HAProxy, MongoDB, and Jenkins) to use Terraform.
    • Fixed an issue in which cloning the production MongoDB database to staging took almost a day, so that the clone now completes in much less time.
    • Set up a better API gateway for their Heroku deployments.
    • Managed their application's lifecycle, deploying new releases and hotfixes in staging and promoting them to production after all tests ran smoothly.
    • Monitored live logs and reported major bugs caught in the production environment.
    Technologies: Amazon Web Services (AWS), Terraform, Jenkins, HAProxy, Kong, Heroku
  • Lead DevOps

    2018 - 2019
    Audsat
    • Set up three Kubernetes clusters for development, staging, and production environments. The production cluster was set up as Multi-AZ, with private topology, autoscaling, per-user RBAC credential restrictions, daily backups, and constant monitoring through Datadog and PagerDuty. I was responsible for guaranteeing the SLA of all the clusters.
    • Established GoCD with custom Elastic Agents for deploying the company’s applications into all three Kubernetes clusters. The agents run within spot instances automatically provisioned by Kubernetes. All applications are containerized and deployed as Helm packages.
    • Implemented automatic provisioning and renewal of Let’s Encrypt TLS certificates.
    • Deployed Fluentd DaemonSets to collect the logs of all applications and send them to AWS-provisioned Elasticsearch clusters (one per Kubernetes cluster); also deployed Elasticsearch Curator for cleaning up old logs.
    • Set up the automatic monitoring of all Java applications deployed in the cluster by scraping Kubernetes pods with JMX ports exposed.
    • Spearheaded Navalis, a web application that lets developers deploy, monitor, and scale their applications across multiple Kubernetes clusters with ease. It is currently under development, built with Go and Vue.js.
    • Scaled Kubernetes up to 250 nodes to process batches within a few hours.
    Technologies: Amazon Web Services (AWS), Kubernetes, Elasticsearch, Fluentd, GoCD, Datadog, PagerDuty, Java
  • DevOps Engineer

    2017 - 2018
    TFG Co
    • Worked in 24/7 on-call rotations.
    • Deployed multiple MongoDB clusters for collecting data during a high-traffic event.
    • Designed, in partnership with our data engineering team, a new Kafka cluster for the company that was inspired by Netflix’s way of orchestrating and monitoring Kafka. The cluster was entirely written with Terraform and Chef and had a few components deployed to Kubernetes with Helm charts (Confluent’s REST API, MirrorMaker, and Burrow). All of the components would scale and send health metrics to Datadog automatically.
    • Developed a system for monitoring backups, consisting of a Python/Flask server and a client written in Go. The system centralized all EBS and RDS snapshot statuses in a single place, along with other backups stored in S3, such as GitLab and Redis. Whenever a backup failed, it triggered an alarm in Datadog and PagerDuty, alerting whoever was on call.
    • Created a redundant VPN between availability zones (US/AP) in AWS using VyOS.
    • Developed a tool for cross-validating the Kubernetes network: it tried to establish a route between every pair of machines in the cluster, either producing a complete graph or pointing out issues in the network.
    • Solved an issue with our Elasticsearch cluster, which used to crash at the beginning of each day. The cause was an excessive number of shards combined with several misconfigured Logstash instances that flooded the cluster with requests while those shards were being created. Fixed it by reducing the number of shards, increasing the batch size sent by Filebeat to Logstash, and reducing the number of open connections from Logstash to Elasticsearch.
    • Helped instrument our most important servers with Jaeger APM.
    • Deployed a Kubernetes cluster with autoscaling as a proof-of-concept in order to test how well a Kafka cluster would scale within Kubernetes.
    • Solved an issue in which our Kafka cluster would crash because of unexpected behavior of a tool someone had installed to monitor ZooKeeper (Netflix Exhibitor).
    • Deployed a Kubernetes cluster the hard way (i.e., without any tools like Kops or Kubeadm) in order to learn deeper concepts of its architecture.
    Technologies: Amazon Web Services (AWS), PagerDuty, MongoDB, VyOS, Kubernetes, Helm, Jenkins, Elasticsearch, Datadog, Kafka, ZooKeeper, MirrorMaker, Burrow
  • DevOps Engineer

    2017 - 2017
    MAV Technology
    • Centralized all incoming requests in an HAProxy cluster. The infrastructure previously had no proper entry point (i.e., DNS pointed to lots of different entry points), so this change avoided single points of failure.
    • Fixed multiple bugs in Node.js servers, among them a critical one which forced us to restart production containers from time to time because of a progressive decay of performance.
    • Solved multiple bugs in Objective-C servers by creating a system for debugging multiple servers in real-time, attaching multiple GDBs to multiple processes distributed amongst nodes and capturing eventual stack traces—allowing us to quickly fix bugs that would only occur in the production environment.
    • Developed a Node.js server which would hold thousands of connections open as a fronting proxy for a legacy server which wasn’t able to receive too many simultaneous connections.
    • Stopped an ongoing brute-force password attack, which I detected through a significant increase in the number of failed authentications in Datadog, by blocking the attacker’s IP addresses in HAProxy.
    • Resolved a serious problem which would cause Ceph to crash. We traced the problem to a bug that was tied to the specific version of the software we were using.
    Technologies: BareMetal, Node.js, HAProxy, Consul, Datadog, MySQL, MongoDB, Ceph
  • Software Engineering Intern

    2015 - 2016
    Synopsys, Inc.
    • Developed a tool in Python for automatically generating C++ code which would bind hardware transactors written in C++ to TCL.
    • Built a tool for extracting statistics from a hardware-emulating platform and generating D3.js charts.
    • Fixed a major C++ bug caused by a race condition between GTK and a hardware transactor.
    • Worked for a month at Synopsys' headquarters in Mountain View where I learned a lot about electronic design automation.
    Technologies: Verilog, C++, Python, TCL, D3.js, EDA
  • Junior Back-end Engineer

    2012 - 2014
    MAV Technology
    • Developed a substantial part of the back end of a corporate email service, written in C++ with language bindings to Lua. I used MongoDB for storing the email metadata, GridFS for storing the message bodies, and MySQL for storing relational user data. Worked with REST interfaces in a monolithic architecture.
    • Built part of their front end, written in Java with Google Web Toolkit.
    • Constructed IMAP and POP3 proxies which would route new users coming from other email service providers to their old servers, while capturing their password and transparently migrating their accounts to our servers.
    • Developed HTTP and SMTP servers from scratch with C++.
    • Supported the development of the company’s ERP system, built with CakePHP and Bootstrap.
    Technologies: C++, Lua, MongoDB, MySQL, Java, GWT, CakePHP, Bootstrap


  • Navalis (Development)

    Navalis is a platform which enables developers to deploy and visualize applications in Kubernetes with ease. It also checks the cluster for inconsistencies and constantly monitors its health. It consists of an API written in Go and a front-end written in Vue.js.


  • Tools

    Terraform, Jenkins, Chef, Nginx, Grafana, Kong, Fluentd, Apache ZooKeeper, MirrorMaker
  • Paradigms

    Continuous Integration (CI), Continuous Delivery (CD), DevOps, Scrum, Design Patterns
  • Platforms

    Kubernetes, Apache Kafka, Linux, PagerDuty, Amazon Web Services (AWS), Google Cloud Platform (GCP)
  • Storage

    Datadog, Elasticsearch, MongoDB, MySQL, PostgreSQL, Redis, Ceph
  • Other

    GoCD, Distributed Computing, AWS DevOps, Distributed Tracing, HAProxy, Prometheus, APM, Consul, VyOS
  • Languages

    Java 8, Go, JavaScript, C++, Python, Tool Command Language (TCL)
  • Frameworks

    Qt 5, Flask, Express.js, Spring
  • Libraries/APIs

    Node.js, POCO C++, Vue.js, D3.js


  • Bachelor of Science degree in Computer Science
    2011 - 2017
    Federal University of Minas Gerais - Belo Horizonte, Minas Gerais, Brazil
