Joseph Rothrock, Data Engineer and Developer in San Francisco, CA, United States

Member since April 28, 2020
Joseph designs and builds database systems as well as the operational infrastructure to run them. He's led data engineering teams and has expertise in OLTP, data warehousing, and globally distributed NoSQL systems. Along the way, Joseph has made meaningful open-source contributions and developed a GitHub portfolio of database reference work.
Joseph is now available for hire

Portfolio

  • Hover
    Databases, Data Warehousing, SQL Performance, Data Engineering, Docker...
  • Endgame
    Bash, SQL Performance, Continuous Integration (CI), Python, Groovy, Databases...
  • PayPal
    Databases, SQL Performance, Kubernetes, REST, SQL, Bash, Go

Location

San Francisco, CA, United States

Availability

Part-time

Preferred Environment

Data Warehousing, Data Engineering, ETL, SQL, PostgreSQL, Bash, Python

The most amazing...

...project I've worked on won an Emmy in 2005 for our engineering effort bringing live TV to mobile phones.

Employment

  • Data Engineer

    2019 - 2020
    Hover
    • Migrated the codebase to a container-first, CI/CD system for hands-off automated testing and deployment using Docker, Python, Bash, and Codefresh.
    • Built various CLI tools and libraries in Bash, SQL, and Python to manipulate segment.io plans, GitHub repositories, and Tableau resources.
    • Ran Airflow ETL hosted in Astronomer Cloud and coded the ETL in Python and SQL.
    • Provided mentoring and training to team members on SQL performance tuning, collaborative coding, and Docker container development.
    • Advised the director and VP-level staff and helped drive a cultural shift in the team's approach to collaborative development.
    Technologies: Databases, Data Warehousing, SQL Performance, Data Engineering, Docker, PostgreSQL, Apache Airflow, Bash, SQL, Python
  • Test Automation Developer

    2018 - 2019
    Endgame
    • Built concurrency into automated tests and CI systems, mostly with Groovy and Python.
    • Gave technical talks on TCP/IP networking, SQL, and database systems.
    • Identified and fixed performance bottlenecks in Linux VMs.
    • Coded features and bug fixes for a custom Python test integration system.
    Technologies: Bash, SQL Performance, Continuous Integration (CI), Python, Groovy, Databases, SQL, TCP/IP, Networking
  • Software Engineer

    2016 - 2017
    PayPal
    • Coded command-line utilities, RESTful APIs, and client apps using Bash, Go, and SQL.
    • Contributed enhancements to the Moby project. See Github.com/moby/moby/pull/27565.
    • Wrote a software gateway in Go between Kubernetes and legacy network hardware.
    • Advised directors and VPs on approaches to distributed database systems.
    Technologies: Databases, SQL Performance, Kubernetes, REST, SQL, Bash, Go
  • Data Engineering Manager

    2012 - 2015
    Lookout
    • Provided guidance, feedback, and leadership to a data engineering team. Set the tone for technical direction with engineering and analytics teams.
    • Built Lookout's Hadoop cluster running Hive, Impala, HBase, and MapReduce jobs.
    • Used my DBA, coding, and Unix sysadmin skills to improve performance, preserve data integrity, and maintain system availability.
    • Coded ETL and automation tools using Go, C, Bash, and Python.
    • Managed MySQL databases used for analytics, warehousing, and reporting.
    Technologies: Databases, Data Warehousing, SQL Performance, SQL, Data Engineering, C, Bash, MySQL, Jenkins, ETL, Hadoop, Go, Python
  • Software Engineer in Testing

    2010 - 2012
    Cloudmark
    • Tested various C/C++ and Perl daemons that comprise both Cloudmark's anti-spam accuracy infrastructure and its enterprise MTA product.
    • Searched, examined, and evaluated code for defects.
    • Wrote code using C, Perl, Bash, and SQL to defend against regression, demonstrate correctness, and highlight defects.
    • Assisted with Oracle administration and MySQL debugging.
    Technologies: Databases, SQL Performance, SQL, Bash, Perl, C, C++

Experience

  • Pipefish (Development)
    https://github.com/rothrock/pipefish

    Have you ever wanted to send your MySQL query results directly to a file in HDFS? Well, now you can! Pipefish sends the result of your SQL statement to a tab-delimited file in HDFS. It doesn't create any intermediate or temporary files. It just reads the rows from the MySQL server and writes them as tab-delimited fields to a file in HDFS that you specify. At the end of the row-set, Pipefish flushes and closes the HDFS file and closes the MySQL connection.
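
The streaming idea described for Pipefish can be sketched roughly as follows. This is not Pipefish's actual code: the function name `stream_query_to_tsv` and the `batch_size` parameter are hypothetical, an in-memory SQLite database stands in for the MySQL server, and an `io.StringIO` buffer stands in for the HDFS file. The point is only that rows go from the cursor to the output in batches, with no intermediate or temporary file.

```python
import csv
import io
import sqlite3


def stream_query_to_tsv(cursor, sink, batch_size=1000):
    """Write every row of an already-executed DB-API cursor to `sink`
    as tab-delimited text, fetching rows in batches so the full result
    set is never staged anywhere. Returns the number of rows written.
    """
    # Note: csv.writer quotes fields that themselves contain tabs or
    # newlines; the real tool may handle such values differently.
    writer = csv.writer(sink, delimiter="\t", lineterminator="\n")
    rows_written = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break  # end of the row-set
        writer.writerows(batch)
        rows_written += len(batch)
    sink.flush()  # flush once the row-set is exhausted
    return rows_written


# Stand-ins (assumptions for illustration only):
# sqlite3 plays the MySQL server, StringIO plays the HDFS file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
cursor = conn.execute("SELECT id, name FROM users ORDER BY id")

sink = io.StringIO()
count = stream_query_to_tsv(cursor, sink)
output = sink.getvalue()
conn.close()  # close the database connection at the end
```

Any writable file-like object could take the place of `sink`, which is why the sketch leaves the choice of HDFS client open.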

  • Roxanne (Development)
    https://github.com/rothrock/Roxanne

    Roxanne is a very simple database server that allows a client to store and retrieve values by key. Keys are stored in a hash map of 64,000 buckets. Hash collisions are resolved by separate chaining onto linked lists at the end of the index file.
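
As a rough illustration of the bucket-and-chain lookup described for Roxanne, here is a minimal in-memory sketch. It is not Roxanne's implementation: the class name `ChainedStore` is hypothetical, `zlib.crc32` is an assumed hash function, and Python lists stand in for the linked lists that the real server keeps at the end of its index file.

```python
import zlib

NUM_BUCKETS = 64_000  # bucket count taken from the description above


class ChainedStore:
    """Key-value store using a fixed bucket array with separate chaining."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        self.num_buckets = num_buckets
        # One slot per bucket; a chain is created only on first use.
        self.buckets = [None] * num_buckets

    def _bucket_index(self, key):
        # Assumed hash function; any stable hash of the key would do.
        return zlib.crc32(key.encode("utf-8")) % self.num_buckets

    def put(self, key, value):
        index = self._bucket_index(key)
        if self.buckets[index] is None:
            self.buckets[index] = []
        chain = self.buckets[index]
        for entry in chain:
            if entry[0] == key:
                entry[1] = value  # overwrite an existing key
                return
        chain.append([key, value])  # collision or new key: extend the chain

    def get(self, key, default=None):
        chain = self.buckets[self._bucket_index(key)] or []
        for stored_key, value in chain:
            if stored_key == key:
                return value
        return default


# A single bucket forces every key to collide, exercising the chain.
store = ChainedStore(num_buckets=1)
store.put("a", "1")
store.put("b", "2")
store.put("a", "3")
```

With a fixed bucket count, lookups stay fast while chains are short and degrade toward a linear scan as more keys land in the same bucket.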

Skills

  • Languages

    SQL, Python, Bash, Python 3, Go, Perl, Groovy
  • Paradigms

    ETL, Continuous Delivery (CD), Continuous Integration (CI), REST
  • Platforms

    Unix, Kubernetes, Docker, Amazon Web Services (AWS)
  • Storage

    Databases, SQL Performance, PostgreSQL, MySQL
  • Other

    Data Warehousing, Data Warehouse Design, APIs, Data Architecture, Data Engineering, Tableau Server, AWS, Networking, TCP/IP
  • Frameworks

    Hadoop
  • Tools

    Jenkins, Apache Airflow

Education

  • Bachelor of Business degree (magna cum laude) in Computer Information Systems
    1994 - 1996
    Georgia State University - Atlanta, GA, United States
