
Hugo De Oliveira
Verified Expert in Engineering
Data Scientist and Developer
Hugo is a full-stack data scientist. Besides his strong scientific education, his business experience gives him hands-on skills in data engineering, analytics, and predictive modeling. Hugo's research background provides him with autonomy, scientific curiosity, and creativity in the development of theoretical and practical solutions to complex problems.
Preferred Environment
Visual Studio Code (VS Code), Jupyter Notebook, Git, Python, Redshift, SQL, Data Build Tool (dbt), Google Sheets
The most amazing...
...opportunity I've had was working on a French national health database and developing innovative predictive modeling methods for patient pathways.
Work Experience
Senior Data Scientist
Synthesis School, Inc
- Provided metrics to the different departments within the company (Product, Operations, Marketing, Finance).
- Built and maintained company analytics pipeline, from data engineering to reporting.
- Used Python to create ETL scripts for different data sources, dbt for data modeling, Apache Airflow for orchestration, Redshift for data warehousing, and Google Sheets and Mode for dashboards and reporting.
- Built a Slack notification system that sends daily and weekly updates on acquisition and product metrics.
- Created a heuristic that automatically proposes a schedule of new classes to open each month, based on waitlisted students' time preferences and teacher availability.
- Proposed a Python script to optimize game infrastructure scaling based on scheduled sessions to reduce the number of allocated servers not in operation while ensuring capacity for all sessions.
- Developed a proof of concept (POC) for student progress metrics aimed at parents, including data on interactions with teammates from different locations, game results, and session participation.
- Created a financial dashboard for company executives, including company data and financial reports extracted from the QuickBooks API via a Python script (revenue, expenses, gross margins, cash available, burn, and runway).
- Assisted in the transition to a flat rate system for teacher payment, automating the process of hour tracking, thus saving time for teachers and HR while controlling company costs.
- Created and maintained a budget for a company product as a bi-weekly P&L sheet, reviewed by the team every month to control expenses.
Data Scientist
HEVA
- Conducted health data analysis studies for public institutions, pharmaceutical, and medical device companies.
- Collaborated with data scientists, data engineers, developers, UI/UX designers, and medical experts.
- Participated in a range of research and development projects, from theoretical ideas to implementations of case studies, leading to scientific and technical contributions presented at international conferences or published in peer-reviewed journals.
Research Intern
Polytechnique Montréal
- Analyzed data and extracted knowledge to improve the workload distribution for the Home Care Regional Services of Montreal Island.
- Created a SQL database to structure caregiver and visit data.
- Designed and adapted a dashboard to facilitate future data collection.
Experience
Automatic and Explainable Labeling of Medical Event Logs with Auto-encoding
This project focused on developing an innovative methodology to handle the complexity of events in medical event logs. Based on auto-encoding, accurate labels are created by clustering similar events in latent space. Moreover, the explanation of created labels is provided by the decoding of the corresponding events.
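The idea described above (encode events, cluster them in latent space, decode each cluster to explain its label) can be illustrated with a minimal sketch. This is not the project's actual implementation: it assumes a linear autoencoder fitted in closed form with NumPy's SVD as a stand-in for a neural autoencoder, a small hand-written k-means, and toy binary event vectors whose feature names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, latent_dim):
    """Fit a linear autoencoder in closed form (truncated SVD, i.e. PCA)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:latent_dim]  # components: (latent_dim, n_features)

def encode(X, mean, components):
    return (X - mean) @ components.T

def decode(Z, mean, components):
    return Z @ components + mean

def kmeans(Z, k, n_iter=20):
    """Plain k-means with a deterministic farthest-first initialisation."""
    centroids = [Z[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centroids], axis=0)
        centroids.append(Z[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = Z[labels == j].mean(axis=0)
    return labels, centroids

# Toy event log: each row is one event, each column a hypothetical attribute.
feature_names = ["drug_A", "drug_B", "drug_C", "drug_D",
                 "proc_A", "proc_B", "proc_C", "proc_D"]
X = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
], dtype=float)

mean, components = fit_linear_autoencoder(X, latent_dim=2)
Z = encode(X, mean, components)          # events in latent space
labels, centroids = kmeans(Z, k=2)       # one label per event

# Explain each label by decoding its centroid back to attribute space
# and keeping the attributes that are strongly present.
decoded = decode(centroids, mean, components)
explanations = [
    [name for name, value in zip(feature_names, row) if value > 0.5]
    for row in decoded
]
for label, attrs in enumerate(explanations):
    print(f"label {label}: {attrs}")
```

In this sketch, the decoded centroid plays the role of the explanation: it shows which attributes characterize the events grouped under a label. A nonlinear autoencoder (for example, one built with TensorFlow) would replace the SVD step, while the clustering and decoding steps would keep the same shape.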
Meta-TAK: A Scalable Double-clustering Method for Treatment Sequence Visualization
Optimal Process Mining of Timed Event Logs
Binary Classification from French Hospital Data
Optimal Pathway Discovery Analysis of Sepsis Hospital Admissions Using the HES Database in England
https://academic.oup.com/jamiaopen/article/3/3/439/5979570
Explaining Predictive Factors in Patient Pathways Using Autoencoders
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0277135
Skills
Paradigms
Data Science, ETL
Other
Machine Learning, Deep Learning, Process Mining, Health, Data Visualization, Predictive Modeling, Data Analysis, Data Analytics, Analytics, Dashboards, Data Build Tool (dbt), Metrics, Predictive Analytics, Data Modeling, Reporting, Optimization, Operations Research, Machine Learning Operations (MLOps), Explainable Artificial Intelligence (XAI), Clustering, Hyperparameters, Data Analytics (Marketing), Segment, Artificial Intelligence (AI), Healthcare & Insurance, Data Engineering, Education
Languages
SQL, Python, R
Libraries/APIs
NumPy, Pandas, Scikit-learn, TensorFlow
Tools
Plotly, Google Sheets, Git, GitLab, Apache Airflow
Platforms
Visual Studio Code (VS Code), Jupyter Notebook, RStudio
Storage
Redshift, Amazon S3 (AWS S3), MySQL
Education
Ph.D. in Engineering
Mines Saint-Etienne - Saint-Etienne, France
Master's Degree in Engineering
Mines Saint-Etienne - Saint-Etienne, France