Enrico Z. Borba, Developer in New York, NY, United States

Enrico Z. Borba

Verified Expert in Engineering

Back-end Developer

Location
New York, NY, United States
Toptal Member Since
March 23, 2022

Enrico is a computer scientist and engineer with 8+ years of experience at companies such as FOSSA, Google, Facebook, and Mitsubishi. He graduated from Caltech, specializing in machine learning, but has worked in full-stack roles with technologies such as React, PostgreSQL, Kubernetes, Go, and Python. Enrico has a strong mathematics background and is looking for challenging and impactful projects to work on.

Portfolio

FOSSA
Go, React, PostgreSQL, Kubernetes, Helm, Amazon Web Services (AWS), Docker...
Google
C++, Apache Flume
The Van Valen Lab at Caltech
Helm, Kubernetes, TensorFlow, Google Cloud Platform (GCP), Machine Learning...

Availability

Part-time

Preferred Environment

Linux, NixOS

The most amazing...

...thing I've built was a data tool so valuable that it was patented, a paper was published about it, and an engineer was hired to maintain it.

Work Experience

Software Engineer

2021 - 2023
FOSSA
  • Led the development of a back-end and front-end redesign. Designed a new schema and migrated billions of rows to it without downtime, making queries around 120 times faster. Supported the old and new schema simultaneously, enabling concurrent feature development.
  • Led the development of a new repository-scanning product that scans huge repositories of around 20 million files after archive expansion and determines license compliance for open-source software (OSS) and non-OSS components.
  • Improved the initial implementation of filtered searches on the UI from around 500,000 files in about two minutes to over 15 million files in less than one second.
  • Ran weekly meetings with two of the largest customers to present new features and POCs, incorporated their feedback, and prioritized and scheduled new work and bug fixes with Jira and Pivotal.
  • Managed and maintained two on-premises customer databases in PostgreSQL and deployments in Kubernetes. Handled migrations and resource increases.
Technologies: Go, React, PostgreSQL, Kubernetes, Helm, Amazon Web Services (AWS), Docker, Redis, Node.js, JavaScript, PDF, Databases, Data Extraction, ETL, Command-line Interface (CLI), Data Science, Code Scanning

Software Engineer

2020 - 2021
Google
  • Contributed to the interactive questions team, a subteam of Search. Improved the freshness of data using frequency analysis and blacklists.
  • Created a pipeline monitor using Apache Flume and Python to gather aggregate statistics, compare pipeline outputs, and report failures.
  • Added a sizeable caching system to exploit the data similarities between pipeline runs and drastically reduced the runtime from around 60 hours to under 20 hours.
  • Sped up some individual portions of the production pipeline up to 10x, which greatly improved developer iteration since my team could run multiple jobs in sequence and not have to wait overnight for results.
Technologies: C++, Apache Flume

Research Engineer

2018 - 2019
The Van Valen Lab at Caltech
  • Used machine learning (CNNs and NNs) to perform segmentation and cell tracking on movies of biological cells. Raised the cell tracking model's accuracy from 70% to 95% by improving data augmentation techniques.
  • Created a now-patented GUI data curation tool to allow for human-in-the-loop correction of incorrect predictions by the tracking model. It allowed for a much quicker iteration of labeling large data sets.
  • Co-authored two papers describing and demonstrating the infrastructure around the superior cell tracking algorithm. Both of these papers have been cited over ten times combined.
Technologies: Helm, Kubernetes, TensorFlow, Google Cloud Platform (GCP), Machine Learning, Redis, Python 3, MongoDB, Node.js, JavaScript, Amazon Web Services (AWS), Data Extraction, Data Science

Software Engineering Intern

2018 - 2018
Mitsubishi Electric Company
  • Created the infrastructure for sensor data collection and processing inside next-generation vehicles while working in the systems division.
  • Constructed a model to detect drowsiness in drivers using biometric sensors, cameras, and steering wheel data. Used simple machine learning models to track eye movements and predict drowsiness.
  • Built a dashboard using Python and Bokeh, showing all collected data and real-time predictions.
  • Interfaced with the Texas Instruments mmWave AWR1642 sensor to determine a driver's heart rate without invasive attachments.
Technologies: Python 3, Hardware, Dashboard Design, Data Extraction

Software Engineering Intern

2017 - 2017
Facebook
  • Contributed to Unicorn, a search-as-a-service team that allows internal teams to index and search their data. A paper has been published, and it can be read at: research.facebook.com/publications/unicorn-a-system-for-searching-the-social-graph/.
  • Developed a self-deployable slim version of DevX, our search product, so other teams could test out their full integrations without blocking the rest of the team. Previously, deploying a search vertical would take around two weeks for both teams.
  • Unblocked several other interns who wanted to use our search product but were in the backlog because of the slow spin-up time for deployments, allowing them to determine if Unicorn was a good fit for them without interrupting our team.
Technologies: Apache Thrift, C++, HHVM, Python 3, Bash, JavaScript, Data Extraction, Command-line Interface (CLI)

FOSSA Issues Version 2 API and Database Schema

https://docs.fossa.com/docs/issues-api-configuration
A complete redesign of the FOSSA Issues API and database schema, leading to a speedup of around 120 times on SQL queries. FOSSA is a SaaS that offers legal and security auditing tools for code bases. The workflow is roughly as follows:
• Scan your code.
• Review issues.
• Resolve them.
• Re-scan your code.
• Repeat.

When I joined FOSSA, large code bases would result in tens of thousands of issues being reported. The original database representation of issues required extremely slow and complex queries for even the simplest pages in the app. For example, for a large organization, downloading all issues across its code bases would take hours, even though the resulting CSV file would be only around 200MB.

I designed a new API and database schema for issues leading to:
• Simpler SQL queries, leading to better developer experience
• Simpler API, leading to better developer and customer experience
• Faster queries

I designed the schema and API so that both the old and the new schema and API could be used simultaneously to allow for gradual roll-out. Additionally, the migration I designed was completed with zero downtime, and in a matter of hours, despite migrating over 10 billion rows.

FOSSA Issues UI Revamp

https://docs.fossa.com/docs/issues-ui-whats-new
A complete redesign of the FOSSA Issues UI and UX with complex user interactions, such as filtering, rich searching, and triaging of issues.

When I joined FOSSA, large code bases would result in tens of thousands of issues being reported. This resulted in a poor user experience, as loading issues was slow, searching was brittle, and filtering did not exist.

I led a team of three engineers and a designer to rework the UI to allow for rich searches, filtering by a variety of fields, and clear triaging of issues. This was the most successful project at FOSSA in that:
• There was no scope reduction.
• It was delivered on time within six months.
• It was executed so well that it could be rolled out smoothly.

The UI rework that I oversaw and the back-end changes that I designed and implemented led to a speedup of 120 times on issue page loads for the largest organizations.

reMarkable Templates

http://rm.ezb.io
A website for sharing templates for the reMarkable tablet.

The design is inspired by the operating system's interface on the reMarkable. The front end is built with React and TypeScript, and the back end is a simple FastAPI application in Python that interfaces with Google Cloud Storage and is deployed on Heroku.

Intuitive

https://docs.rs/intuitive/
A Rust crate for quickly building component-based text-based user interfaces (TUIs). It is heavily inspired by React and SwiftUI, and it contains features that resemble functional components, hooks, and a (mostly) declarative DSL.

Python Chemical Reaction Network Simulator

https://github.com/enricozb/python-crn
A chemical reaction network simulator in Python that is used to facilitate some homework exercises in Caltech's Bi 191ab, a biomolecular computation course.

This package takes stoichiometric equations as input along with reaction rates and initial concentrations and outputs final concentrations along with a plot of concentration of all the species over time. This package uses Python 3 operator overloading to create a pseudo-DSL to resemble stoichiometric equations for more readability.
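
The operator-overloading idea can be illustrated with a minimal sketch. This is not the python-crn API: the names `Species`, `Expression`, `Reaction`, `k`, and `simulate`, the choice of `>>` as the reaction arrow, and the forward-Euler mass-action integrator are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


class Expression:
    """A sum of species with integer coefficients, e.g. 2*A + B (hypothetical class)."""

    def __init__(self, terms):
        self.terms = Counter(terms)

    def __add__(self, other):
        # `A + B` merges the coefficient maps of both sides.
        return Expression(self.terms + other.terms)

    def __rmul__(self, coefficient):
        # `2 * A` scales every coefficient in the expression.
        return Expression({name: coefficient * n for name, n in self.terms.items()})

    def __rshift__(self, products):
        # `A + B >> C` builds a reaction; `+` binds tighter than `>>`.
        return Reaction(self, products)


class Species(Expression):
    """A single named species with coefficient 1."""

    def __init__(self, name):
        super().__init__({name: 1})


@dataclass
class Reaction:
    reactants: Expression
    products: Expression
    rate: float = 1.0

    def k(self, rate):
        # Set the rate constant and return the reaction for chaining.
        self.rate = rate
        return self


def simulate(reactions, concentrations, duration, dt=0.001):
    """Integrate mass-action kinetics with forward Euler; return final concentrations."""
    state = defaultdict(float, concentrations)
    for _ in range(round(duration / dt)):
        delta = defaultdict(float)
        for reaction in reactions:
            # Mass action: flux = rate * product of reactant concentrations^coefficient.
            flux = reaction.rate
            for name, n in reaction.reactants.terms.items():
                flux *= state[name] ** n
            for name, n in reaction.reactants.terms.items():
                delta[name] -= n * flux
            for name, n in reaction.products.terms.items():
                delta[name] += n * flux
        for name, change in delta.items():
            state[name] += dt * change
    return dict(state)


if __name__ == "__main__":
    A, B, C = Species("A"), Species("B"), Species("C")
    reaction = (A + B >> C).k(1.0)
    print(simulate([reaction], {"A": 1.0, "B": 1.0}, duration=10.0))
```

Because `+` and `*` bind more tightly than `>>`, an expression such as `2 * A + B >> C` reads left to right like the stoichiometric equation it represents, which is the readability benefit described above. Plotting concentrations over time is left out of the sketch.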

DeepCell Label

https://github.com/vanvalenlab/deepcell-label
A UI that simplifies human-in-the-loop repair of segmented cell division video data. The tool that I wrote was eventually patented because of its novelty and ease of use.

Cell division tracking is an intensive process of manually reviewing frame-by-frame microscope video data of cells and annotating the lineage of the cells across frames. At the Van Valen Lab, we trained a computer vision model to segment and track cells without the need for a human. However, during the initial training of the model, it would often incorrectly track cells. The tracked cell data was initially represented as a mixture of TIFFs and CSVs, with the TIFFs containing the segmentation information and the CSVs containing the lineage information. This representation made it very time-consuming and unintuitive to fix incorrect tracking data.

I wrote a tool to let scientists visually correct tracking data by overlaying colors onto the cell video, clearly identifying the segments and lineages. This simplified and sped up the process of repairing incorrect data, allowing the team to train the models better. The tool was initially written in Python and then ported to the browser using JavaScript.
2015 - 2019

Bachelor's Degree in Computer Science

California Institute of Technology - Pasadena, CA, USA

Libraries/APIs

React, NumPy, TensorFlow, Node.js

Tools

Terminal, Git, Helm, ChatGPT

Languages

Python 3, JavaScript, Python, Go, Bash, TypeScript, Rust, CSS, SQL, C++, Swift

Paradigms

ETL, Data Science

Platforms

Linux, Kubernetes, Amazon Web Services (AWS), Docker, Google Cloud Platform (GCP), Heroku

Storage

PostgreSQL, Databases, Redis, MongoDB, Google Cloud Storage

Frameworks

Apache Thrift, Bootstrap, Express.js

Other

Data Analysis, Algorithms, NixOS, APIs, Scripting, Data Extraction, Command-line Interface (CLI), Code Scanning, Machine Learning, Operating Systems, Complexity Theory, Computational Biology, Apache Flume, HHVM, Hardware, Dashboard Design, PDF, React-pdf, Computer Vision, Graphics, OpenAI GPT-3 API
