DevOps Engineer and Developer
Daniel is a former Google site reliability engineer (SRE) and infrastructure software engineer specializing in building and automating scalable, secure SaaS platforms. He has over a decade of experience creating, leading, and growing infrastructure teams. Daniel has charted the technical direction in new and legacy environments with a focus on delivering on business objectives.
Experience
Python - 13 years, Prometheus - 3 years, Nomad - 3 years, Go - 3 years, Docker - 3 years, Vault - 3 years, Terraform - 3 years, Consul - 3 years
Amazon Web Services (AWS), Elasticsearch, CSS3, HTML5, jQuery, Django, PostgreSQL, MySQL, HAProxy, NGINX, Jenkins, GitLab, SaltStack, Prometheus, Nomad, Consul, Vault, Docker, Terraform, Go, Python
The most amazing...
...platform I've built resolved microservice version compatibility issues and empowered developers to push hidden versions to production for testing with customers.
Principal Platform Engineer
- Developed a second-generation SaaS payments platform in service of a growing customer base; included scaling infrastructure, processes, and people.
- Automated deployments to empower customer support and improve software release times by leveraging service mesh and orchestration technologies.
- Enabled multi-region failover and disaster recovery (DR) by creating a dynamic traffic management system.
Lead Platform Engineer
- Led the transformation of Ripple’s products from enterprise on-premises software to cloud-based SaaS applications to increase customer ROI and reliability.
- Improved the SRE team's efficiency with infrastructure visibility and reduced toil with centralized logging, monitoring, intrusion detection, and automated certificate rotation.
- Set the technical direction for a SaaS platform and applications that included technology selection, application development guidelines, and on-call playbooks and training for development teams.
- Led the engineering team to deliver wearable IoT apps (Android and iOS) for Hewlett-Packard working with Kunai Consulting.
- Created the build infrastructure for completely automated application builds for Android and iOS.
- Served as the technical advisor for NewGen Venture Partners, a Silicon Valley venture capital firm.
- Volunteered for EFF and worked on a project to secure email traffic between servers.
- Built a personal event website to publish updates, send bulk email, upload images and give attendees password-less logins.
Site Reliability Engineer
- Designed a new configuration architecture for App Engine clusters worldwide to ease scaling and maintenance.
- Supported Google Cloud Datastore releases and incidents within the 99.95% uptime SLA.
- Refactored legacy service automation (pre-Borg) to assist with its replacement and eventual decommission.
- Built production clusters for testing new hardware, enabling earlier “go/no-go” decisions that reduced manufacturing costs by more than $10 million per year.
- Automated assembly line testing to improve manufacturing yields and let hardware engineers easily develop manufacturing tests, which reduced yearly costs and prevented major manufacturing deadline slips.
- Developed a map-based tool to explore laboratory usage across a department of more than 500 engineers and produce reports for leadership.
- Created a full-spectrum monitoring-and-alerting service to enable incident response for facility, cluster, and network events.
Linux System Administrator
- Scaled a Google campus laboratory network to empower hardware teams across the company while reducing overhead costs.
- Developed soft-EPO (emergency power-off) for power/cooling incidents, MapReduce jobs for compliance, and custom security scanners to enforce policies on insecure networks.
- Drove the scaling effort on the first Android testing laboratory for the release of Android 3.0 “Honeycomb.”
- Built out hardware testbeds by retrofitting the production automated installer for laboratories.
Vault PKI Formula
https://github.com/ripple/vault-pki-formula
From the original project, I added fetching of dynamic authorization tokens, dead-letter support, save points to recover from rescheduling, and a JSON-only log output format.
Overall, this allowed the log follower to run as a normal Nomad job on every Nomad host, saving every pod from running its own log shipper, while improving log durability, log formatting, and system security.
Rkt, Django, Scrapy
Terraform, Vault, SaltStack, GitLab CI/CD, ELK (Elastic Stack), NGINX, OpenVPN, GitLab, BigQuery, Fastlane, Postfix, Envoy Proxy, Jenkins, Ansible
Consul, Nomad, Prometheus, HAProxy, Pexpect, Borg
Google Closure, jQuery, Google Maps API, OpenLayers
Amazon Web Services (AWS), Docker, Google App Engine
MySQL, Elasticsearch, BigTable, Google Cloud Datastore, MariaDB, PostgreSQL
Bachelor's Degree in Philosophy
University of California, Santa Cruz - Santa Cruz, CA, USA