Site Reliability Engineer
2021 - 2022
Anonymous Client
- Created Terraform modules for managing AWS platform components in code.
- Configured access control and identity management via Okta and Teleport.
- Helped developers understand the underlying platform and make improvements by writing documentation and participating in knowledge-sharing sessions.
Technologies: Amazon Web Services (AWS), Kubernetes, Kubernetes Operations (Kops), Terraform, Docker, Ansible, Buildkite, GitHub, Prometheus, Grafana, Argo CD

DevOps Engineer
2019 - 2021
Relayr
- Set up metrics monitoring across all environments running on AWS ECS and AKS. Modularized the code and scaled the monitoring setup from a single host for the whole organization to one host per environment.
- Enabled development and QA teams to rapidly deploy their changes to production by developing tools and methodologies that help scale and maintain services.
- Implemented strategies to build base Docker images for organization-wide adoption across all microservices.
- Resolved production incidents while on call.
Technologies: Amazon Web Services (AWS), Azure, Kubernetes, Helm, Terraform, Docker, Jenkins, Consul, Vault, Python, Go, Ansible, Prometheus, Grafana, Instana, Argo CD, GitHub

DevOps Engineer
2018 - 2019
Wayfair
- Developed and contributed to the team's notification and paging tools. Deployed code to production using the custom integrator tool.
- Fixed and managed issues of all sizes, from major outages to minor alerts. Communicated and coordinated severe incidents and ensured their quick resolution.
- Created dashboards and tuned alerts to improve monitoring at scale. Refined the troubleshooting documentation.
Technologies: Kubernetes, VMware vSphere, Helm, ServiceNow, Datadog, Grafana, Prometheus, NGINX, Python, React, GitLab, Ubuntu, Jenkins

DevOps Engineer
2017 - 2018
Bahnhof AB
- Planned and architected environments around customer needs by attending meetings and gathering requirements.
- Developed infrastructure automation and configuration using Ansible and implemented CI for it.
- Maintained and managed monitoring and logging platforms running on Checkmk and Graylog.
Technologies: VMware, Kubernetes, Docker, Ansible, Terraform, Checkmk, Apache, NGINX, MySQL, PHP, Bash, Python, React, Ubuntu, OpenStack