Joseph Rothrock, Developer in San Francisco, CA, United States

Joseph Rothrock

Verified Expert in Engineering

Data Engineer and Developer

San Francisco, CA, United States
Toptal Member Since
June 25, 2020

Joseph designs and builds database systems as well as the operational infrastructure to run them. He has led data engineering teams and has expertise in OLTP, data warehousing, and globally distributed NoSQL systems. Along the way, Joseph has made meaningful open-source contributions and developed a GitHub portfolio of database reference work.






Preferred Environment

Data Warehouse Design, Data Warehousing, Data Engineering, ETL, SQL, PostgreSQL, Bash, Python

The most amazing...

...project I've worked on won an Emmy in 2005 for our engineering effort bringing live TV to mobile phones.

Work Experience

Data Engineer

2019 - 2020
  • Migrated the codebase to a container-first, CI/CD system for hands-off automated testing and deployment using Docker, Python, Bash, and Codefresh.
  • Built various CLI tools and libraries in Bash, SQL, and Python to manipulate plans, GitHub repositories, and Tableau resources.
  • Ran Airflow ETL hosted in Astronomer Cloud and coded the ETL in Python and SQL.
  • Provided mentoring and training to team members on SQL performance tuning, collaborative coding, and Docker container development.
  • Advised the director and VP-level staff and helped advance the team's approach to collaborative development.
Technologies: Databases, Data Warehousing, Data Warehouse Design, SQL Performance, Data Engineering, Docker, PostgreSQL, Apache Airflow, Bash, SQL, Python

Test Automation Developer

2018 - 2019
  • Built concurrency into automated tests and CI systems, mostly with Groovy and Python.
  • Gave technical talks on TCP/IP networking, SQL, and database systems.
  • Identified and fixed performance bottlenecks in Linux VMs.
  • Coded features and bug fixes for a custom Python test integration system.
Technologies: Bash, SQL Performance, Continuous Integration (CI), Python, Groovy, Databases, SQL, TCP/IP, Networking

Software Engineer

2016 - 2017
  • Coded command-line utilities, RESTful APIs, and client apps using Bash, Go, and SQL.
  • Contributed enhancements to the Moby project.
  • Wrote a software gateway in Go between Kubernetes and legacy network hardware.
  • Advised directors and VPs on approaches to distributed database systems.
Technologies: Databases, SQL Performance, Kubernetes, REST, SQL, Bash, Go

Data Engineering Manager

2012 - 2015
  • Provided guidance, feedback, and leadership to a data engineering team; set the tone for technical direction with engineering and analytics teams.
  • Built Lookout's Hadoop cluster running Hive, Impala, HBase, and MapReduce jobs.
  • Used my DBA, coding, and Unix sysadmin skills to improve performance, preserve data integrity, and maintain system availability.
  • Coded ETL and automation tools using Go, C, Bash, and Python.
  • Managed MySQL databases used for analytics, warehousing, and reporting.
Technologies: Databases, Data Warehouse Design, Data Warehousing, SQL Performance, SQL, Data Engineering, Bash, MySQL, Jenkins, ETL, Hadoop, Go, Python

Software Engineer in Testing

2010 - 2012
  • Tested various C/C++ and Perl daemons that comprise both Cloudmark's anti-spam accuracy infrastructure and its enterprise MTA product.
  • Searched, examined, and evaluated code for defects.
  • Wrote code using C, Perl, Bash, and SQL to defend against regression, demonstrate correctness, and highlight defects.
  • Assisted with Oracle administration and MySQL debugging.
Technologies: Databases, SQL Performance, SQL, Bash, Perl

Roxanne is a very simple database server that allows a client to store and retrieve values by key. Keys are stored in a hash map of 64,000 buckets. Hash collisions are resolved by separate chaining onto linked lists at the end of the index file.
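The lookup scheme described above can be illustrated with a minimal in-memory sketch. It is not Roxanne's actual code: the class name, the CRC32 hash, and the use of Python lists in place of on-disk linked lists are all assumptions made for illustration. Only the bucket count and the separate-chaining idea come from the description.

```python
import zlib

# Bucket count taken from the description; everything else is hypothetical.
NUM_BUCKETS = 64_000


class ChainedStore:
    """Key-value store that resolves hash collisions by separate chaining."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        # Each bucket holds a chain of (key, value) pairs.
        # A Python list stands in for the linked list here.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # CRC32 is a stand-in hash (an assumption), used because it is
        # deterministic across runs.
        return zlib.crc32(key.encode("utf-8")) % len(self.buckets)

    def put(self, key, value):
        chain = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        # New key: append to the chain, whether or not the bucket is empty.
        chain.append((key, value))

    def get(self, key, default=None):
        # Walk the chain for this bucket until the key matches.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default
```

Constructing the store with a very small bucket count, for example `ChainedStore(num_buckets=2)`, forces collisions and makes the chaining behavior easy to observe.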

Have you ever wanted to send your MySQL query results directly to a file in HDFS? Well, now you can! Pipefish sends the result of your SQL statement to a tab-delimited file in HDFS. It doesn't create any intermediate or temporary files. It just reads the rows from the MySQL server and writes them as tab-delimited fields to a file in HDFS that you specify. At the end of the row-set, Pipefish flushes and closes the HDFS file and closes the MySQL connection.
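The streaming pattern described above can be sketched as follows. This is not Pipefish's code: the function names are hypothetical, `sqlite3` stands in for the MySQL server, and an in-memory `io.StringIO` stands in for the HDFS file, so that the example runs with the standard library alone. The point is only that rows go from the cursor to the output one at a time, with no intermediate file.

```python
import csv
import io
import sqlite3


def stream_query_to_tsv(connection, sql, out):
    """Write each row of the result set to `out` as tab-delimited text.

    Rows are written as they are read from the cursor, so no temporary
    file is created along the way.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    cursor = connection.cursor()
    rows_written = 0
    try:
        cursor.execute(sql)
        for row in cursor:  # iterate the cursor row by row
            writer.writerow(row)
            rows_written += 1
    finally:
        cursor.close()
    out.flush()  # flush once the row set is exhausted
    return rows_written


def demo():
    # Stand-ins (assumptions): sqlite3 for MySQL, StringIO for the HDFS file.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    connection.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
    out = io.StringIO()
    count = stream_query_to_tsv(
        connection, "SELECT id, name FROM t ORDER BY id", out
    )
    connection.close()
    return count, out.getvalue()
```

Because the function only needs a DB-API style connection and a writable file-like object, a real database driver or a real HDFS client handle could in principle be passed in instead of the stand-ins.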


Skills

SQL, Python, Bash, Python 3, Go, Perl, Groovy


ETL, Continuous Delivery (CD), Continuous Integration (CI), REST


Unix, Kubernetes, Docker, Amazon Web Services (AWS)


Databases, SQL Performance, PostgreSQL, MySQL


Data Warehousing, Data Warehouse Design, APIs, Data Architecture, Data Engineering, Tableau Server, Networking, TCP/IP




Jenkins, Apache Airflow

Education

1994 - 1996

Bachelor of Business Degree (Magna Cum Laude) in Computer Information Systems

Georgia State University - Atlanta, GA, United States