Camille Girabawe, Statistics Developer in Santa Clara, CA, United States
Member since July 27, 2019
Camille is a data-driven thinker and strategist with a Ph.D. in Physics from Brandeis University and professional data science experience in both B2B and B2C settings. He combines data science, software development, and business skills to design, implement, and deploy on-premise or cloud-based AI solutions to business problems. Camille enjoys applying machine learning to different areas to solve real-life challenges.


  • Adobe
    Machine Learning, Deep Learning, Statistics, Python, SQL, GCP, Airflow...
  • SAP Labs
    Machine Learning, Deep Learning, Statistics, Python, SQL, JavaScript, GCP...



Preferred Environment

macOS, Linux, Atom, Git, HanaStudio

The most amazing...

...thing I've designed and implemented was an AI-driven app in the S4HANA Procurement suite that proposes materials for contract negotiation based on historical spending.


  • Senior Data Scientist

    2019 - PRESENT
    Adobe
    • Developed AI-driven filters that help marketers extend their audience using historical and real-time data on campaign successes and failures, yielding an audience lift of up to 25% and a boost of about 7% in the success rate.
    • Created a machine learning pipeline to score lead behavior at different stages of the marketing journey (project currently under development).
    • Automated the orchestration of multiple Docker containers and Dataproc workloads at different stages of the machine learning development cycle using Airflow DAGs.
    Technologies: Machine Learning, Deep Learning, Statistics, Python, SQL, GCP, Airflow, New Relic, InDesign
  • Data Scientist

    2017 - 2019
    SAP Labs
    • Developed real-time monitoring of procurement spending to propose materials for (re)negotiated contracts. Strategic purchasers can reduce the processing time from an average of 2 months to 1 day.
    • Assigned a risk score to each purchase requisition so that it could be approved automatically based on SAP WorkFlow data. Reduced the approval time to seconds and improved the consistency of approvals.
    • Built AI-driven matching of invoices without purchase orders to the accounts to be charged, which reduced processing time, improved consistency, and set up a new platform for fraud detection.
    Technologies: Machine Learning, Deep Learning, Statistics, Python, SQL, JavaScript, GCP, InDesign, TensorFlow


  • Programmable Illumination Microscope (PIM) Controller (Development)

    Python-based app to control a multipoint focused microscope for running a light-sensitive experiment. Given a sample of light-sensitive, optically oscillatory solution compartmentalized on a 2D grid, the goal was to focus light on selected cells to excite or inhibit them so that the entire grid would oscillate in unison (just like fireflies) or form any other given structure.

    A combination of deterministic and machine learning models was implemented in Python to learn the temporal oscillations of the chemical solution and determine which cells to inhibit or excite by exposing them to light.

    This was part of my dissertation.

  • Predicting Green Taxi Tips (Development)

    The goal of the project was to build a model that predicts the tip a Green Taxi driver would receive at the end of a ride in NYC.

    Data were obtained from the TLC Trip Record Data. After a deep analysis of features for statistical significance, two random forest models were optimized and combined to predict the tip with an MSE of about 14. Several features proved highly significant, such as whether a rider pays with cash or electronically, and trip duration and speed, which give an idea of traffic congestion.
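
    As a rough illustration of the evaluation described above, the sketch below blends the predictions of two models and computes the mean squared error (MSE). The weighted-average blend and the names `combine_predictions` and `mean_squared_error` are assumptions made for illustration; the project description does not say how the two random forests were combined.

```python
from statistics import fmean


def combine_predictions(preds_a, preds_b, weight_a=0.5):
    """Blend two models' predictions with a weighted average.

    The weighted average is an assumed combination scheme, not
    necessarily the one used in the original project.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(preds_a, preds_b)]


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("targets and predictions must have the same length")
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])
```

    Because the MSE is expressed in squared units of the target, an MSE of about 14 corresponds to a typical error of roughly the square root of 14 in tip units.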

  • Scoring Model for a Toptal Client (Development)

    Built a machine learning model to score class participants for a Toptal client. The model was built using multivariate linear regression. Because the client expects to reach a larger audience, the model was regularized to guard against overfitting.
    Tech Stack: Python, MongoDB, Node.js.
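
    The idea of regularizing a multivariate linear regression can be sketched as follows. This is a minimal example, assuming an L2 (ridge) penalty fitted by plain gradient descent; the description does not name the penalty or solver actually used, and `fit_ridge`, `alpha`, `lr`, and `epochs` are hypothetical names chosen for illustration.

```python
def fit_ridge(X, y, alpha=1.0, lr=0.05, epochs=2000):
    """Fit y ~ X.w + b by minimizing mean squared error + alpha * sum(w^2).

    X is a list of feature rows, y a list of targets. The intercept b is
    not penalized. Gradient descent is used to keep the sketch free of
    dependencies; lr must be small enough for the data's scale.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        # Residuals of the current model on every row.
        errors = [sum(wj * xj for wj, xj in zip(w, row)) + b - yi
                  for row, yi in zip(X, y)]
        # Gradient of the penalized loss with respect to each weight.
        grad_w = [(2.0 / n) * sum(e * row[j] for e, row in zip(errors, X))
                  + 2.0 * alpha * w[j]
                  for j in range(d)]
        grad_b = (2.0 / n) * sum(errors)
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b
```

    With `alpha=0.0` this reduces to ordinary least squares; larger values of `alpha` shrink the weights toward zero, trading a little bias for lower variance on unseen participants.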


  • Languages

    Python, SQL, R, JavaScript
  • Libraries/APIs

    Pandas, SciPy, Scikit-learn, TensorFlow, Keras, Dask, Selenium WebDriver
  • Other

    Machine Learning, Mathematical Modeling, Physics, Linear Algebra, Statistics, Deep Learning, Software Development, Data Visualization, Natural Language Processing (NLP), Web Crawling
  • Paradigms

    Data Science, Automated Testing
  • Storage

    MySQL, MongoDB
  • Frameworks

  • Tools

    InDesign CC, BigQuery, MongoDB Shell
  • Platforms

    Google Cloud Platform (GCP), Linux, Unix


  • Ph.D. in Physics
    2011 - 2017
    Brandeis University - Waltham, MA
  • Computational Investing - Credential ID PPQHXX8CRWV7
    JULY 2016 - PRESENT
