
Enrico Z. Borba
Verified Expert in Engineering
Back-end Developer
Enrico is a computer scientist and engineer with 8+ years of experience at companies such as FOSSA, Google, Facebook, and Mitsubishi. He graduated from Caltech, specializing in machine learning, and has since worked in full-stack roles with technologies such as React, PostgreSQL, Kubernetes, Go, and Python. Enrico has a strong mathematics background and is looking for challenging and impactful projects to work on.
Preferred Environment
Linux, NixOS
The most amazing...
...thing I've built was a data tool so valuable that it was patented, a paper was published about it, and an engineer was hired to maintain it.
Work Experience
Software Engineer
FOSSA
- Led the development of a back-end and front-end redesign. Designed a new schema and migrated billions of rows to it without downtime, making queries 120 times faster. Supported the old and new schema simultaneously, enabling concurrent feature development.
- Led the development of a new repository-scanning product that scans huge repositories of around 20 million files after archive expansion and determines license compliance for open-source software (OSS) and non-OSS components.
- Improved the initial implementation of filtered search in the UI, which handled around 500,000 files in about two minutes, to handle over 15 million files in less than one second.
- Ran weekly meetings with two of the largest customers, presenting new features and POCs, incorporating their feedback, and prioritizing and scheduling new work and bug fixes with Jira and Pivotal.
- Managed and maintained two on-premise customer databases in PostgreSQL and deployments in Kubernetes. Handled migrations and resource increases.
Software Engineer
Google
- Contributed to the interactive questions team as a subteam of search. Improved the freshness of data using frequency analysis and blacklists.
- Created a pipeline monitor using Apache Flume and Python to gather aggregate statistics, compare pipeline outputs, and report failures.
- Added a large caching system that exploits the data similarities between pipeline runs, reducing the runtime from around 60 hours to under 20 hours.
- Sped up individual portions of the production pipeline by up to 10x, which greatly improved developer iteration because my team could run multiple jobs in sequence without waiting overnight for results.
Research Engineer
The Van Valen Lab at Caltech
- Used machine learning (CNNs and NNs) to perform segmentation and cell tracking on movies of biological cells. Raised the cell tracking model's accuracy from 70% to 95% by improving data augmentation techniques.
- Created a now-patented GUI data curation tool for human-in-the-loop correction of incorrect predictions by the tracking model. It allowed much quicker iteration when labeling large data sets.
- Co-authored two papers describing and demonstrating the infrastructure around the cell tracking algorithm. Together, these papers have been cited more than ten times.
Software Engineering Intern
Mitsubishi Electric Company
- Created the infrastructure for sensor data collection and processing inside next-generation vehicles while working in the systems division.
- Constructed a model to detect drowsiness in drivers using biometric sensors, cameras, and steering wheel data. Used simple machine learning models to track eye movements and predict drowsiness.
- Built a dashboard using Python and Bokeh, showing all collected data and real-time predictions.
- Interfaced with the Texas Instruments mmWave AWR1642 sensor to determine a driver's heart rate without invasive attachments.
Software Engineering Intern
Facebook
- Contributed to Unicorn, a search-as-a-service team that allows internal teams to index and search their data. A paper on Unicorn has been published and can be read at: research.facebook.com/publications/unicorn-a-system-for-searching-the-social-graph/.
- Developed a self-deployable slim version of DevX, our search product, so other teams could test out their full integrations without blocking the rest of the team. Previously, deploying a search vertical would take around two weeks for both teams.
- Unblocked several other interns who wanted to use our search product but were in the backlog because of the slow spin-up time for deployments, allowing them to determine if Unicorn was a good fit for them without interrupting our team.
Experience
FOSSA Issues Version 2 API and Database Schema
https://docs.fossa.com/docs/issues-api-configuration
• Scan your code.
• Review issues.
• Resolve them.
• Re-scan your code.
• Repeat.
When I joined FOSSA, large code bases would result in tens of thousands of issues being reported. The original database representation of issues required extremely slow and complex queries for even the simplest pages in the app. For example, for a large organization, downloading all issues across its code bases would take hours, even though the resulting CSV file would be only around 200MB.
I designed a new API and database schema for issues leading to:
• Simpler SQL queries, leading to better developer experience
• Simpler API, leading to better developer and customer experience
• Faster queries
I designed the schema and API so that both the old and the new schema and API could be used simultaneously to allow for gradual roll-out. Additionally, the migration I designed was completed with zero downtime, and in a matter of hours, despite migrating over 10 billion rows.
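The actual migration is not described in detail here, so the following is only a minimal sketch of one common pattern for this kind of gradual roll-out: copying rows from an old table to a new one in small, idempotent batches while both tables remain usable. The table names `issues_old` and `issues_new`, the column layout, and the helpers `make_demo_db` and `backfill` are all hypothetical, and SQLite stands in for PostgreSQL so the example is self-contained.

```python
import sqlite3


def make_demo_db(n_rows):
    """Build an in-memory database with a populated old table (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE issues_old (id INTEGER PRIMARY KEY, project TEXT, title TEXT)")
    conn.execute("CREATE TABLE issues_new (id INTEGER PRIMARY KEY, project TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO issues_old VALUES (?, ?, ?)",
        [(i, f"proj-{i % 3}", f"issue {i}") for i in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn


def backfill(conn, batch_size=1000):
    """Copy rows from issues_old to issues_new in keyset-paginated batches.

    Returns the number of rows scanned. Each batch is committed on its own,
    so no long-running transaction blocks other readers or writers.
    """
    last_id = 0
    scanned = 0
    while True:
        rows = conn.execute(
            "SELECT id, project, title FROM issues_old WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return scanned
        # INSERT OR IGNORE (SQLite syntax) skips rows that already exist in the
        # new table, for example rows the application wrote to both tables.
        conn.executemany(
            "INSERT OR IGNORE INTO issues_new (id, project, title) VALUES (?, ?, ?)",
            rows,
        )
        conn.commit()
        last_id = rows[-1][0]
        scanned += len(rows)
```

Because each batch is idempotent, the backfill can be interrupted and rerun safely. In PostgreSQL, the equivalent of `INSERT OR IGNORE` would be `INSERT ... ON CONFLICT DO NOTHING`.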
FOSSA Issues UI Revamp
https://docs.fossa.com/docs/issues-ui-whats-new
When I joined FOSSA, large code bases would result in tens of thousands of issues being reported. This resulted in a poor user experience, as loading issues was slow, searching was brittle, and filtering did not exist.
I led a team of three engineers and a designer to rework the UI to allow for rich searches, filtering by a variety of fields, and clear triaging of issues. This was the most successful project at FOSSA in that:
• There was no scope reduction.
• It was delivered on time within six months.
• It was executed so well that it could be rolled out smoothly.
The UI rework that I oversaw and the back-end changes that I designed and implemented led to a speedup of 120 times on issue page loads for the largest organizations.
reMarkable Templates
http://rm.ezb.io
The design is inspired by the operating system's interface on the reMarkable. The front end is built with React and TypeScript, and the back end is a simple use of FastAPI on Python, which interfaces with Google Storage, and is deployed on Heroku.
Intuitive
https://docs.rs/intuitive/
Python Chemical Reaction Network Simulator
https://github.com/enricozb/python-crn
This package takes stoichiometric equations, reaction rates, and initial concentrations as input, and outputs final concentrations along with a plot of the concentrations of all species over time. It uses Python 3 operator overloading to create a pseudo-DSL that resembles stoichiometric equations, for better readability.
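To illustrate the operator-overloading idea, here is a minimal sketch of how such a pseudo-DSL could be built. It is not the package's actual API: the classes `Term`, `Species`, and `Reaction`, the `.k()` method, and the `simulate` function are hypothetical names, and the integrator is a plain forward-Euler step under mass-action kinetics, chosen only for brevity. Plotting is omitted.

```python
from collections import Counter


class Term:
    """A sum of species with stoichiometric coefficients, e.g. 2*A + B."""

    def __init__(self, counts):
        self.counts = Counter(counts)

    def __add__(self, other):  # A + B
        return Term(self.counts + other.counts)

    def __rmul__(self, n):  # 2 * A
        return Term({name: n * c for name, c in self.counts.items()})

    def __rshift__(self, products):  # reactants >> products
        return Reaction(self, products)


class Species(Term):
    def __init__(self, name):
        super().__init__({name: 1})
        self.name = name


class Reaction:
    def __init__(self, reactants, products, rate=1.0):
        self.reactants, self.products, self.rate = reactants, products, rate

    def k(self, rate):
        """Set the rate constant and return the reaction for chaining."""
        self.rate = rate
        return self


def simulate(reactions, concentrations, t_end, dt=1e-3):
    """Integrate mass-action kinetics with forward Euler; returns final concentrations."""
    conc = dict(concentrations)
    for _ in range(int(round(t_end / dt))):
        delta = {}
        for rxn in reactions:
            # Mass-action flux: rate constant times product of reactant concentrations.
            flux = rxn.rate
            for name, n in rxn.reactants.counts.items():
                flux *= conc.get(name, 0.0) ** n
            for name, n in rxn.reactants.counts.items():
                delta[name] = delta.get(name, 0.0) - n * flux
            for name, n in rxn.products.counts.items():
                delta[name] = delta.get(name, 0.0) + n * flux
        for name, d in delta.items():
            conc[name] = conc.get(name, 0.0) + d * dt
    return conc
```

With these definitions, a reaction reads close to its chemical notation, for example `(2 * A + B >> C).k(0.5)` after `A, B, C = Species("A"), Species("B"), Species("C")`. This works because Python evaluates `*` before `+`, and `+` before `>>`.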
DeepCell Label
https://github.com/vanvalenlab/deepcell-label
Cell division tracking is an intensive process of manually reviewing frame-by-frame microscope video data of cells and annotating the lineage of the cells across frames. At the Van Valen Lab, we trained a computer vision model to segment and track cells without the need for a human. However, during the initial training of the model, it would often track cells incorrectly. The tracked cell data was initially represented as a mixture of TIFFs and CSVs, with the TIFFs containing the segmentation information and the CSVs containing the lineage information. This representation made it very time-consuming and unintuitive to fix incorrect tracking data.
I wrote a tool to let scientists visually correct tracking data by overlaying colors onto the cell video, clearly identifying the segments and lineages. This simplified and sped up the process of repairing incorrect data, allowing the team to train the models better. The tool was initially written in Python and then ported to the browser using JavaScript.
Skills
Languages
Python 3, JavaScript, Python, Go, Bash, TypeScript, Rust, CSS, SQL, C++, Swift
Libraries/APIs
React, NumPy, TensorFlow, Node.js
Tools
Terminal, Git, Helm
Paradigms
ETL, Data Science
Platforms
Linux, Kubernetes, Amazon Web Services (AWS), Docker, Google Cloud Platform (GCP), Heroku
Storage
PostgreSQL, Databases, Redis, MongoDB, Google Cloud Storage
Other
Data Analysis, Algorithms, NixOS, APIs, Scripting, Data Extraction, Command-line Interface (CLI), Code Scanning, Machine Learning, Operating Systems, Complexity Theory, Computational Biology, Apache Flume, HHVM, Hardware, Dashboard Design, PDF, React-pdf, Computer Vision, Graphics, OpenAI GPT-3 API, ChatGPT
Frameworks
Apache Thrift, Bootstrap, Express.js
Education
Bachelor's Degree in Computer Science
California Institute of Technology - Pasadena, CA, USA