Camille Girabawe, Developer in San Jose, CA, United States

Camille Girabawe

Verified Expert in Engineering

Statistics Developer

Location
San Jose, CA, United States
Toptal Member Since
November 12, 2019

Camille is a data leader with a PhD in physics and a passion for machine learning and artificial intelligence. He has extensive experience building multi-tenant and multi-cloud solutions for B2B and B2C systems across various domains, such as marketing, finance, procurement, logistics, and operations research. He uses cutting-edge technologies in machine learning, deep learning, and GenAI to solve real-life challenges and deliver value to his clients.

Portfolio

Adobe
New Relic, Apache Airflow, Google Cloud Platform (GCP), SQL, Python, Statistics...
SAP Labs
TensorFlow, Google Cloud Platform (GCP), JavaScript, SQL, Python, Statistics...
Brandeis University
Arduino, MATLAB, Python, NumPy, Data Science, Pandas, Unix, R, Linear Algebra...

Experience

Availability

Part-time

Preferred Environment

Git, Linux, MacOS, Visual Studio Code (VS Code), Docker, SQL, OpenAI GPT-3 API, Azure, Google Cloud Platform (GCP), AWS CLI

The most amazing...

...thing I've built was a vehicle dispatch solution with speech-to-text and trip prediction that alerts drivers and dispatchers.

Work Experience

Senior Machine Learning Manager

2019 - PRESENT
Adobe
  • Led the design, implementation, and productization of real-time generative capabilities to empower a marketing chatbot that scales across multiple tenants in different industries.
  • Developed AI-driven filters to help marketers extend their audience using historical and real-time data on campaign successes and failures. The results were an audience lift of up to 25% and a boost of about 7% in the success rate.
  • Led the development of a machine learning solution that finds the best time to send marketing emails in order to increase the open rate. Current A/B testing results show the open rate doubling.
  • Spearheaded AI projects for the conversation marketing platform, from text representation models to language models and natural language understanding, NLP, and computer vision.
Technologies: New Relic, Apache Airflow, Google Cloud Platform (GCP), SQL, Python, Statistics, Deep Learning, Machine Learning, Data Scraping, NumPy, Data Science, AWS CLI, Azure, OpenAI GPT-3 API, OpenAI GPT-4 API, LangChain, Hugging Face, Generative Artificial Intelligence (GenAI), PyTorch, Web Crawlers, Mathematical Modeling, Pandas, MySQL, Flask, Unix, Keras, BigQuery, MongoDB, Linear Algebra, SciPy, Data Engineering, Scikit-learn, Automated Testing, Software Development, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Docker, Artificial Intelligence (AI), REST APIs, Chatbots, Data Analytics, Chatbot

Data Scientist

2017 - 2019
SAP Labs
  • Developed real-time monitoring of procurement expenses to propose materials for renegotiated contracts. Strategic purchasers can reduce the processing time from an average of two months to three days.
  • Developed a machine learning model that assigns a risk score to each purchase requisition, based on SAP WorkFlow data, so that it can be approved automatically. Improved data consistency and reduced the approval time to seconds.
  • Developed a machine learning solution for invoice-to-account matching to reduce processing time, improve consistency, and reduce related accounting errors and fraud.
Technologies: TensorFlow, Google Cloud Platform (GCP), JavaScript, SQL, Python, Statistics, Deep Learning, Machine Learning, NumPy, Data Science, Web Crawlers, Mathematical Modeling, Pandas, MySQL, Flask, Unix, Keras, Linear Algebra, SciPy, Data Engineering, Scikit-learn, Automated Testing, Data Visualization, Software Development, Selenium WebDriver, Docker, Artificial Intelligence (AI), REST APIs, Data Analytics

PhD Research Assistant

2013 - 2017
Brandeis University
  • Investigated synchronization in non-linear oscillators using the Belousov-Zhabotinsky reaction as an experimental medium. My work involved designing experiments, data collection, analysis, and mathematical modeling.
  • Built a robotic system, driven by computer vision and mechanical actuation, to control all experiments autonomously. Reactions were generated in droplets and deposited on microlithographic chips.
  • Developed a programmable illumination microscope controller to excite or inhibit droplets using different light colors. This was achieved using computer vision technology to track droplets and their status in real time.
Technologies: Arduino, MATLAB, Python, NumPy, Data Science, Pandas, Unix, R, Linear Algebra, Physics, Artificial Intelligence (AI), Data Analytics

Programmable Illumination Microscope (PIM) Controller

https://www.youtube.com/watch?v=BsRsiweTfp0
A Python-based app to control a multipoint focused microscope running a light-sensitive experiment. Given a sample of light-sensitive, optically oscillatory solution compartmentalized on a 2D grid, the goal was to focus light on selected cells in order to excite or inhibit them so that the entire grid would oscillate in unison (just like fireflies) or follow any other given pattern.

A combination of deterministic and machine learning models, implemented in Python, learned the temporal oscillations of the chemical solution and determined which cells to inhibit or excite by exposing them to light.

This was part of my dissertation: https://search.proquest.com/openview/aa8113b66c0fcb2d9a4f97fe7cfc5091/1?pq-origsite=gscholar&cbl=18750&diss=y

Predicting Green Taxi Tips

https://github.com/kthouz/NYC_Green_Taxi
The goal of the project was to build a model that predicts the tip a Green Taxi driver would receive at the end of a ride in NYC.

Data were obtained from the TLC Trip Record Data. After a deep analysis of features for statistical significance, two random forest models were optimized and combined to predict the tip with an MSE of about 14. Several features proved highly significant, such as whether a rider pays with cash or electronically, trip duration, and speed, which gives an idea of traffic congestion.

https://camillegirabawe.shinyapps.io/nycgreentaxi/

Scoring Model for a Toptal Client

Built a machine learning model to score class participants for a Toptal client. The model used multivariate linear regression. Since the client expects to gain a larger audience, the models were regularized to guard against overfitting.
Tech Stack: Python, MongoDB, Node.js.
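
To illustrate the idea of regularizing a linear regression, here is a minimal sketch. It is not the client's model: the L2 (ridge) penalty, the gradient-descent fit, the function name `fit_ridge`, and the toy data are all assumptions chosen for illustration, since the project description does not say which regularizer or solver was used.

```python
from typing import List, Sequence, Tuple


def fit_ridge(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    lam: float = 0.0,
    lr: float = 0.05,
    epochs: int = 5000,
) -> Tuple[List[float], float]:
    """Fit y ~ X.w + b by gradient descent on MSE + lam * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Residuals of the current fit.
        errs = [
            sum(wj * xj for wj, xj in zip(w, row)) + b - yi
            for row, yi in zip(X, y)
        ]
        # Gradient of the MSE term plus the L2 penalty (bias is not penalized).
        grad_w = [
            (2.0 / n) * sum(e * row[j] for e, row in zip(errs, X))
            + 2.0 * lam * w[j]
            for j in range(d)
        ]
        grad_b = (2.0 / n) * sum(errs)
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical toy data generated from y = 3*x0 + 1*x1 (no noise).
X = [[0, 1], [1, 0], [1, 1], [2, 1], [1, 2], [2, 2]]
y = [1, 3, 4, 7, 5, 8]

w_plain, b_plain = fit_ridge(X, y, lam=0.0)  # ordinary least squares
w_ridge, b_ridge = fit_ridge(X, y, lam=1.0)  # weights shrunk toward zero
```

With `lam=0.0` the fit recovers the generating weights; with a positive `lam` the weights shrink, which trades a little bias for lower variance when the model is later applied to a larger, unseen audience.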

EDA Tool

https://www.youtube.com/watch?v=Q62jB0ZFv6M&t=1s
I built an exploratory data analysis (EDA) tool that can be used to visually explore a dataset, run statistics on it, and add comments in real time, which can be saved or printed later. I used Python with Flask on the back end and D3.js on the front end.

PieEye

I built machine learning models to detect and mask personally identifiable information (PII) found across databases. The models detected hundreds of PII entities in both structured and unstructured data, covering English, French, and Spanish and the US, Europe, and Asia regions.
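
The detect-and-mask step can be illustrated with a much simpler stand-in. The sketch below uses rule-based regular expressions rather than machine learning models, covers only email addresses and phone-like numbers, and uses hypothetical names (`EMAIL_RE`, `PHONE_RE`, `mask_pii`) and placeholder tokens that are assumptions, not part of the actual project.

```python
import re

# Deliberately simple patterns for illustration; real PII detection needs
# far broader coverage (names, addresses, IDs, multiple languages).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace detected emails and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


masked = mask_pii("Contact jane.doe@example.com or +1 (408) 555-0100.")
```

Replacing each match with a typed placeholder keeps the surrounding text readable while removing the sensitive value, which is the same contract a learned detector would expose.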

Languages

Python, SQL, R, JavaScript

Libraries/APIs

Pandas, SciPy, NumPy, Scikit-learn, REST APIs, TensorFlow, Keras, Selenium WebDriver, PyTorch, Jenkins Pipeline

Other

Machine Learning, Mathematical Modeling, Physics, Linear Algebra, Statistics, Artificial Intelligence (AI), Data Analytics, Chatbot, Data Engineering, Natural Language Processing (NLP), Data Scraping, Deep Learning, Software Development, Data Visualization, Web Crawlers, GPT, Generative Pre-trained Transformers (GPT), OpenAI GPT-3 API, OpenAI GPT-4 API, LangChain, Hugging Face, Generative Artificial Intelligence (GenAI), Natural Language Understanding (NLU), CI/CD Pipelines, Chatbots

Paradigms

Data Science, Automated Testing

Storage

MySQL, MongoDB

Frameworks

Flask

Tools

Git, Apache Airflow, MATLAB, BigQuery, AWS CLI

Platforms

MacOS, New Relic, Arduino, Google Cloud Platform (GCP), Linux, Unix, Visual Studio Code (VS Code), Docker, Azure

Education

2011 - 2017

Ph.D. in Physics

Brandeis University - Waltham, MA

Certifications

JULY 2016 - PRESENT

Computational Investing - Credential ID PPQHXX8CRWV7

Coursera
