Hugo De Oliveira, Data Scientist and Developer in Hamburg, Germany
Member since June 15, 2021
Hugo is a data scientist passionate about the opportunities offered by data and AI/ML methods for the healthcare field. Besides his strong scientific education, his business experience gives him hands-on skills in implementing data-oriented solutions. Hugo's research background provides him with autonomy, scientific curiosity, and creativity in the development of theoretical and practical solutions to complex problems.


  • HEVA
    Visual Studio Code, GitLab, Health, TensorFlow, Python, Data Visualization...
  • HEVA
    Python, Scikit-learn, Pandas, SQL, Predictive Modeling, Predictive Analytics...
  • Polytechnique Montréal
    VB.NET, RStudio, Data Analysis, Data Analytics, Analytics






Preferred Environment

Visual Studio Code, Jupyter Notebook, Git, Python

The most amazing...

...opportunity I've had was working on a French national health database and developing innovative predictive modeling methods for patient pathways.


  • Data Scientist

    2017 - 2020
    HEVA
    • Conducted health data analysis studies for public institutions and for pharmaceutical and medical device companies.
    • Collaborated with data scientists, data engineers, developers, UI/UX designers, and medical experts.
    • Participated in a range of research and development projects, from theoretical ideas to implementations on case studies, leading to scientific and technical contributions presented at international conferences or published in peer-reviewed journals.
    Technologies: Visual Studio Code, GitLab, Health, TensorFlow, Python, Data Visualization, Predictive Modeling, Predictive Analytics, Jupyter Notebook, Git, Scikit-learn, NumPy, Pandas, SQL, Plotly, Machine Learning, Data Science, Deep Learning, Data Analysis, Data Analytics, Analytics, Dashboards
  • Intern

    2017 - 2017
    HEVA
    • Benchmarked machine learning algorithms for the classification of French national hospital data.
    • Applied global optimization algorithms to solve a hyperparameter-tuning problem.
    • Conducted health data analysis studies for pharmaceutical and medical device companies.
    Technologies: Python, Scikit-learn, Pandas, SQL, Predictive Modeling, Predictive Analytics, Data Visualization, Jupyter Notebook, Machine Learning, Data Science, Data Analysis, Data Analytics, Analytics
  • Research Intern

    2016 - 2016
    Polytechnique Montréal
    • Analyzed data and extracted knowledge to improve the workload distribution for the Home Care Regional Services of Montreal Island.
    • Created a SQL database to structure caregiver and visit data.
    • Designed and adapted a dashboard to facilitate future data collection.
    Technologies: VB.NET, RStudio, Data Analysis, Data Analytics, Analytics


  • Automatic and Explainable Labeling of Medical Event Logs with Auto-encoding

    Process mining is a suitable method for knowledge extraction from patient pathways. Structured in event logs, medical events are complex, often described using various medical codes. Finding an efficient method of labeling these events before applying process mining analysis was challenging.

    This project focused on developing an innovative methodology to handle the complexity of events in medical event logs. Based on auto-encoding, accurate labels are created by clustering similar events in latent space. Moreover, the explanation of created labels is provided by the decoding of the corresponding events.

  • Meta-TAK: A Scalable Double-clustering Method for Treatment Sequence Visualization

    This project focuses on the study of treatment sequences, particularly the extraction of patterns from nonclinical claim databases through clustering. The TAK algorithm was proposed for this purpose and demonstrated its usefulness. However, its scalability with respect to the number of patients was an issue: the method could not be used in practice for thousands of patients. To address this, we developed an extension of the TAK algorithm. Referred to as Meta-TAK, this method appears to be robust and computationally efficient.

  • Optimal Process Mining of Timed Event Logs

    This project focuses on determining the optimal process model of an event log whose traces carry temporal information. We introduced a new formalism, along with a Tabu search algorithm, to determine the process model that maximizes the representation of traces subject to constraints on the maximum number of nodes and arcs. We then conducted a healthcare case study to demonstrate the applicability of the approach for clinical pathway modeling. Special attention was paid to readability so that end users could interpret the process mining results.
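
    As a rough illustration of the Tabu search idea mentioned above, the sketch below works on a deliberately simplified version of the problem. It is not the formalism or the algorithm from the project: it ignores arcs and timing, and it scores a node set only by how many traces it fully covers. The names coverage, tabu_search, max_nodes, and tenure, as well as the toy traces, are hypothetical and chosen for this example only.

```python
from collections import deque
from itertools import product


def coverage(nodes, traces):
    """Count traces whose events all belong to the selected node set."""
    return sum(1 for trace in traces if set(trace) <= nodes)


def tabu_search(traces, max_nodes, iterations=50, tenure=2):
    """Pick at most max_nodes activities that fully cover the most traces."""
    activities = sorted({a for trace in traces for a in trace})
    current = set(activities[:max_nodes])  # naive starting solution
    best, best_score = set(current), coverage(current, traces)
    tabu = deque(maxlen=tenure)  # activities recently swapped out

    for _ in range(iterations):
        outside = [a for a in activities if a not in current]
        candidates = []
        # Neighborhood: swap one selected activity for one unselected activity.
        for out_a, in_a in product(sorted(current), outside):
            neighbor = (current - {out_a}) | {in_a}
            score = coverage(neighbor, traces)
            # Re-adding a tabu activity is only allowed if it beats the best
            # score found so far (aspiration criterion).
            if in_a in tabu and score <= best_score:
                continue
            candidates.append((score, out_a, in_a, neighbor))
        if not candidates:
            break
        # Take the best admissible move, even if it worsens the current score.
        score, out_a, in_a, current = max(candidates, key=lambda c: c[0])
        tabu.append(out_a)
        if score > best_score:
            best, best_score = set(current), score
    return best, best_score


# Toy event log (assumed data): each trace is a sequence of activity labels.
traces = [["A", "B"], ["A", "B"], ["C", "D"], ["C", "D"], ["C", "D"], ["A", "C"]]
best_nodes, best_score = tabu_search(traces, max_nodes=2)
print(sorted(best_nodes), best_score)  # prints ['C', 'D'] 3
```

    On this toy log, the search has to accept a worse intermediate solution before reaching the best node set, which is the behavior that distinguishes Tabu search from plain hill climbing.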

  • Binary Classification from French Hospital Data

    In this project, we benchmarked seven machine learning algorithms on binary classification tasks, using three data sets extracted from the French national hospital database. We then used an efficient global optimization algorithm to solve the hyperparameter tuning problem.

  • Optimal Pathway Discovery Analysis of Sepsis Hospital Admissions Using the HES Database in England

    The “Bow-tie” optimal pathway discovery analysis uses large clinical event datasets to map clinical pathways and to visualize risks (improvement opportunities) before, and outcomes after, a specific clinical event. This proof-of-concept study assesses the use of NHS Hospital Episode Statistics (HES) in England as a potential clinical event dataset for this pathway discovery analysis approach.


  • Paradigms

    Data Science
  • Other

    Machine Learning, Deep Learning, Process Mining, Health, Data Visualization, Predictive Analytics, Predictive Modeling, Data Analysis, Data Analytics, Analytics, Dashboards, Optimization, Operations Research, Machine Learning Operations (MLOps), Explainable Artificial Intelligence (XAI), Clustering, Oncology & Cancer Treatment, Hyperparameters
  • Languages

    Python, SQL, VB.NET
  • Libraries/APIs

    NumPy, Pandas, Scikit-learn, TensorFlow
  • Tools

    Plotly, Git, GitLab
  • Platforms

    Visual Studio Code, Jupyter Notebook, RStudio


  • Ph.D. in Engineering
    2017 - 2020
    Mines Saint-Etienne - Saint-Etienne, France
  • Master's Degree in Engineering
    2014 - 2017
    Mines Saint-Etienne - Saint-Etienne, France


  • Machine Learning Modeling Pipelines in Production
    JULY 2021 - PRESENT
  • Machine Learning Data Lifecycle in Production
    JUNE 2021 - PRESENT
  • Introduction to Machine Learning in Production
    MAY 2021 - PRESENT
