CI Infrastructure Engineer
Toptal is a global network of top freelance talent in business, design, and technology that enables companies to scale their teams on demand. With $100+ million in annual revenue and over 40% year-over-year growth, Toptal is the world’s largest fully remote company.
We take the best elements of virtual teams and combine them with a support structure that encourages innovation, social interaction, and fun. We see no borders, move at a fast pace, and are never afraid to break the mold.
We are looking for an experienced engineer to build and scale CI systems in a cloud environment within our CI Infrastructure team. Our CI Infrastructure Engineers work with a high-energy, fast-paced team responsible for supporting initiatives and operations across Toptal. This is a remote position that can be done from anywhere. Due to the remote nature of this role, we are unable to provide visa sponsorship. Resumes and communication must be submitted in English.
Toptal services are deployed across hundreds of servers. To make sure we deliver high-quality code frequently and predictably, we built a complex CI system to support our development.
You will be responsible for designing, building, deploying, and maintaining CI systems and environments, sharing ownership with the development teams.
We are embracing DevOps practices, where the CI Infrastructure team develops CI systems, automation, tooling, and workflows and has a consulting/mentoring role in enabling developer teams to own the whole lifecycle of the software they are making.
Collaborate regularly with engineering teams to improve the company’s engineering tools, systems, procedures, and data security, not just administer clusters and cloud services.
Join daily scrum standups (GMT-3 to GMT+5). Expect to pair program, take part in peer code reviews, and use collaboration tools like Slack and Zoom.
In the first week, expect to:
- Begin onboarding into Toptal.
- Learn about our team’s processes and get an overview of the current CI setup.
In the first month, expect to:
- Gain insight into our systems by learning why they are built the way they are and how to improve them.
- Get familiar with monitoring and alerting solutions.
- Begin to learn a variety of roles in a wide range of Infrastructure projects.
In the first three months, expect to:
- Start working on support tasks to get familiar with the core tools, the setup, and everyday challenges.
- Provide excellent customer service by understanding and addressing the teams’ needs and expectations through effective communication and collaboration while learning about our infrastructure.
- Write CI pipelines on your own.
- Learn how teams are using CI and what the biggest challenges are.
In the first six months, expect to:
- Support and improve CI infrastructure design, architecture, and implementation.
- Participate in the on-call rotation schedule (during business hours) to support CI-related help requests.
- Report any performance issues faced by the CI systems, drill down to find out what caused them, and coordinate with other teams to resolve them.
In the first year, expect to:
- Communicate with key partners on project engagements.
- Partner closely with our teams in the engineering area to develop CI infrastructure automation and management solutions with a strong focus on scalability, observability, automation, reliability, security, and quality.
- Plan and coordinate improvements related to CI infrastructure.
- Participate in technology initiatives that enable developers to deliver their services to our customers with minimal friction and a high degree of quality.
Requirements:
- PostgreSQL experience.
- Hands-on experience with Jenkins and GitHub Actions.
- Hands-on experience with system and application metric collection and alerting services like Grafana, Prometheus, or others. A keen focus on what makes a system observable.
- Experience with Docker, Docker Compose, and building optimized Dockerfiles.
- Experience with Google Cloud Platform.
- Experience with Kubernetes environments: production operations, troubleshooting, debugging, cluster provisioning, and management.
- Be proficient in deploying automation with tools like Ansible and Terraform, and in using version control.
- Be eager to help teammates, share knowledge with them, and learn from them.
- A strong understanding of modern systems and service-related security methodologies.