Pablo Mainar Jovaní, Developer in Lausanne, Switzerland

Pablo Mainar Jovaní

Verified Expert in Engineering

Machine Learning Engineer and Software Developer

Lausanne, Switzerland

Toptal member since July 21, 2023

Bio

Pablo is a machine learning engineer with 5+ years of experience in data analysis, statistics, and machine learning modeling. He has worked in various domains, including computer vision, audio processing, time-series forecasting, and human biosignals such as electroencephalography (EEG). Pablo is also skilled in Python and its data processing and machine learning libraries, such as NumPy, pandas, SciPy, PyTorch, and TensorFlow.

Portfolio

Logitech
Python, PyTorch, TensorFlow, EEG, Computer Vision, Audio Processing...
Technicolor
Python, TensorFlow, OpenCV, Unity, Machine Learning, Deep Learning...

Experience

Availability

Part-time

Preferred Environment

Python, PyTorch, Pandas, Linux, Windows

The most amazing...

...work I've done is related to mental state decoding from EEG, where I estimated users' cognitive load by modeling their brain waves.

Work Experience

AI Software Engineer

2018 - PRESENT
Logitech
  • Developed machine learning models that infer users' mental states from EEG, that is, from sensors measuring electrical brain activity. The neuronal markers I developed include mental flow, cognitive load, and attention.
  • Designed more than 20 experiments to collect biosignals and voice recordings, with over 200 subjects recorded.
  • Built and trained deep learning models for computer vision applications, such as improving the performance of person detectors using self-supervised learning. Increased accuracy from 70% to up to 95% while reducing the complexity of the model.
  • Patented a hybrid speech recognition system in which complex commands are processed in the cloud and simple commands locally on the device. Improved energy efficiency, speed, and privacy by a factor of 2.
  • Created and trained a deep learning model for an objective speech quality metric. This model outperforms other available models and does not need a clean reference signal, which makes it easier to use.
  • Supervised and mentored more than 15 interns, including master's thesis students.
  • Published my work at top-tier conferences such as the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Interspeech, and the IEEE Engineering in Medicine and Biology Society (EMBS) conference.
Technologies: Python, PyTorch, TensorFlow, EEG, Computer Vision, Audio Processing, Machine Learning, Deep Learning, Data Analysis, Statistics, C++, Artificial Intelligence (AI), Visual Studio Code (VS Code), Windows, Linux, MacOS, Kaldi, Software, Software Development, Version Control, EEG Libraries for Python, Signal Processing, Digital Signal Processing, Time Series, Time Series Analysis, Automatic Speech Recognition (ASR), Speech to Text, Electronics, Pandas, Scikit-learn, NumPy, Matplotlib, Coding, Programming, Neural Networks, Deep Neural Networks (DNNs), SciPy, Image Processing, Experimental Design, MATLAB, Technical Project Management, Development, Research, EDA, Technical Leadership, Technical Project Monitoring, Supervisor, Data Science, Modeling, Docker, Data Engineering, Natural Language Processing (NLP)

Deep Learning Intern

2017 - 2017
Technicolor
  • Developed a deep neural network that generated the stereoscopic pair of a given monocular image.
  • Scraped YouTube stereoscopic videos to build a dataset to train the model.
  • Expanded the application to generate stereoscopic 360-degree panoramic images.
  • Built a virtual reality application in Unity to assess the quality and perceived depth of the generated stereoscopic panoramas.
Technologies: Python, TensorFlow, OpenCV, Unity, Machine Learning, Deep Learning, Computer Vision, Artificial Intelligence (AI), Software, C++, Image Processing, NumPy, SciPy, Scikit-learn, Scikit-image, Matplotlib, Neural Networks, Deep Neural Networks (DNNs), Software Development, Coding, Programming, Research, Development, Artificial Neural Networks (ANN), EDA, Data Science, Modeling

Hybrid Voice Command Processing

https://patentimages.storage.googleapis.com/7f/c5/88/804899f929e3bf/US20220406305A1.pdf
A hybrid speech-to-text (STT) system I developed that splits processing between the cloud and the device.

A controller was designed to differentiate between simple voice commands, such as "play the next song," and complex commands, such as "play something from the Stones." Simple commands were processed on-device to improve availability, privacy, and battery life, while complex ones were sent to the cloud for more powerful processing.

I handled the technical development of the on-device STT, designing the model and training it on a limited vocabulary. I was also in charge of the controller that decides how to route each command. This required deep knowledge of speech processing, including both classical and more modern approaches to STT. The STT was developed in C++ for deployment on a Raspberry Pi, and most of the experimentation was done in Python. A patent was filed based on this project.
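
As a rough illustration of the routing idea described above, and not the patented implementation, the sketch below sends an utterance to the device when it matches a small fixed vocabulary and to the cloud otherwise. Everything in it is an assumption made for the example: the names `ON_DEVICE_COMMANDS`, `RoutingDecision`, and `route_command`, the command list, and the simplification that the controller sees text rather than audio.

```python
from dataclasses import dataclass

# Hypothetical limited vocabulary that the on-device model can handle.
# The real vocabulary is not given in the source.
ON_DEVICE_COMMANDS = frozenset({
    "play the next song",
    "pause",
    "volume up",
    "volume down",
})


@dataclass(frozen=True)
class RoutingDecision:
    target: str   # "device" or "cloud"
    command: str  # normalized utterance


def normalize(utterance: str) -> str:
    """Lowercase and collapse whitespace so matching ignores case and spacing."""
    return " ".join(utterance.lower().split())


def route_command(utterance: str) -> RoutingDecision:
    """Route simple (in-vocabulary) commands locally and everything else to the cloud."""
    command = normalize(utterance)
    target = "device" if command in ON_DEVICE_COMMANDS else "cloud"
    return RoutingDecision(target=target, command=command)


if __name__ == "__main__":
    for text in ("Play the NEXT song", "play something from the Stones"):
        print(text, "->", route_command(text).target)
```

A set lookup keeps the toy example readable; a real controller would presumably make this decision from acoustic or model-confidence features instead of exact text matches.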

Objective Speech Quality Metric

https://arxiv.org/pdf/2204.01345.pdf
A deep learning-based metric to measure the quality of speech.

Traditionally, quality has been measured by asking a group of experts and averaging their opinions. In this project, we developed an automatic way of doing it with a deep learning model. We used crowdsourcing to collect a large dataset and trained our model on it together with other publicly available data. This model is now in use in the production line of many audio products.

I led the project and supervised the student working on it. This included writing code for training the model and suggesting ideas to improve its performance. The model was written in Python using PyTorch. We participated in a speech quality metric competition at a top-tier scientific conference, and the model ranked among the best. The scientific paper is available at the project URL above.
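
For context, the traditional approach mentioned above (averaging listener opinions) amounts to computing a mean opinion score. The sketch below is a minimal illustration of that baseline only, not of the deep learning metric from the paper. The function name `mean_opinion_score` and the 1 to 5 rating scale are assumptions made for the example.

```python
from typing import Sequence


def mean_opinion_score(ratings: Sequence[float]) -> float:
    """Average listener ratings, assuming a 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings are assumed to lie between 1 and 5")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    # Four hypothetical listeners rating the same speech clip.
    print(mean_opinion_score([4, 5, 3, 4]))
```

A learned metric replaces the panel of listeners with a model that predicts this kind of score directly from the audio.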

Mental Flow Estimation Through EEG

https://mkegler.github.io/publication/cherep-2022/cherep-2022.pdf
A deep neural model we developed to predict the state of flow based on wearable EEG.

Flow is a mental state in which the user is highly focused and productive. We developed a deep neural model to predict the state of flow from EEG, that is, from sensors placed on the scalp that measure electrical brain activity. This involved a large experimental design that accounted for many confounding variables, followed by data collection, data analysis, and modeling. All the coding was done in Python, including the experiment software, which synchronized all sensors, and the modeling, which used TensorFlow.

I led this project and was responsible for the experimental design and the modeling. I also supervised the student working on the project, providing insights and suggestions on how to proceed. This project was published at the IEEE EMBS conference in 2022. The paper is available at the project URL above.

Education

2015 - 2018

Master's Degree in Electrical Engineering

EPFL - Lausanne, Switzerland

2011 - 2015

Bachelor's Degree in Telecommunications Engineering

Universitat Politecnica de Valencia - Valencia, Spain

Libraries/APIs

Pandas, Scikit-learn, Matplotlib, PyTorch, TensorFlow, OpenCV, NumPy, SciPy

Tools

Kaldi, Scikit-image, MATLAB, Hidden Markov Model, Supervisor

Languages

Python, C++, C, Java

Platforms

Linux, Windows, Arduino, Visual Studio Code (VS Code), MacOS, Raspberry Pi, Android, Docker

Frameworks

Unity, Android SDK

Storage

Google Cloud

Other

Machine Learning, Deep Learning, Data Analysis, EEG, Audio Processing, Artificial Intelligence (AI), Software, Signal Processing, Digital Signal Processing, Time Series, Time Series Analysis, Coding, Programming, Neural Networks, Deep Neural Networks (DNNs), Artificial Neural Networks (ANN), Experimental Design, Technical Project Management, Data Processing, Development, Research, EDA, Data Science, Modeling, Data Engineering, Software Development, Statistics, Computer Vision, Image Processing, Microcontrollers, Embedded Systems, Electronics, Automatic Speech Recognition (ASR), Version Control, EEG Libraries for Python, Speech to Text, Hardware, Antenna Design, Data Collection, Technical Leadership, Crowdsourcing, Amazon Mechanical Turk (MTurk), Data Analytics, Technical Project Monitoring, OCR, Natural Language Processing (NLP)
